Module: Hyperion::Http::Sendfile
- Defined in:
- lib/hyperion/http/sendfile.rb,
ext/hyperion_http/sendfile.c
Overview
Sendfile — Ruby-side façade over the C-extension Hyperion::Http::Sendfile native helper. Handles the portable concerns the C ext deliberately leaves to userspace:
* Looping on :partial returns from the kernel (short writes).
* Yielding to the fiber scheduler / IO.select on :eagain.
* Falling back to IO.copy_stream when:
- native zero-copy isn't compiled (non-Linux, non-BSD/Darwin host),
- the kernel returned :unsupported (this fd pair can't sendfile),
- the destination IO is a TLS-wrapped socket (kernel can't encrypt).
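The three concerns above can be sketched as one driver loop. This is a minimal illustration, not the module's actual `native_copy_loop`: `drive_copy`, `primitive`, `wait`, and `fallback` are hypothetical names, and `primitive` stands in for any native call that returns a `[bytes_written, status]` pair.

```ruby
# Hypothetical driver loop (assumption: not the real native_copy_loop).
# `primitive` is any callable returning [bytes_written, status].
def drive_copy(primitive, offset, len, wait: -> {}, fallback: ->(_off, _rem) { 0 })
  written = 0
  while written < len
    bytes, status = primitive.call(offset + written, len - written)
    written += bytes
    case status
    when :done    then break        # the primitive wrote everything it was asked to
    when :partial then next         # short write: loop again from the new cursor
    when :eagain  then wait.call    # e.g. yield to the scheduler or IO.select
    when :unsupported
      # Hand the remainder to a userspace copy (IO.copy_stream in the real module).
      return written + fallback.call(offset + written, len - written)
    end
  end
  written
end

# Scripted primitive: a short write, one EAGAIN, then the rest.
script = [[4, :partial], [0, :eagain], [6, :done]]
waits = 0
total = drive_copy(->(_off, _len) { script.shift }, 0, 10, wait: -> { waits += 1 })
```

The point of the sketch is that the caller, not the primitive, owns the cursor: every status except `:done` leaves the loop responsible for deciding what happens next.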
The C ext defines `Hyperion::Http::Sendfile` as a module too: when the extension loads first it pre-creates the constant and we re-open it here to add the higher-level helpers. The native singleton methods (`copy`, `copy_small`, `copy_splice`, `supported?`, `splice_supported?`, `small_file_threshold`, `platform_tag`) survive the re-open untouched.
2.0.1 Phase 8 — close static-file rps gaps
The 2.0.0 BENCH report had two rows where Hyperion still lost to Puma on rps:
* 8 KB static at -t 5 (-w 1) — 121 r/s vs Puma 1,246 r/s (10× loss)
* 1 MiB static at -t 5 (-w 1) — 1,809 r/s vs Puma 2,139 r/s (-15%)
Diagnosis (see ext/hyperion_http/sendfile.c header):
8 KB row: every request paid ~40 ms in EAGAIN-yield-retry cycles
because sendfile against an 8 KB file routinely hits EAGAIN once
before the kernel TCP send buffer accepts it; with -t 5 only 5
fibers can be in-flight, and 4 sleeping in EAGAIN-yield-retry
starves the wrk loop.
1 MiB row: sendfile(2) re-derives some bookkeeping per call that
splice(2) through a pipe-tee avoids.
Fixes:
8a. Small-file fast path. If file_size <= 64 KiB we use the new
`copy_small` C primitive: heap-buffered read + write with the
GVL released, EAGAIN polled with a short select() instead of
fiber-yielding. The transfer completes in microseconds rather
than dancing with the fiber scheduler.
8b. Linux splice path (2.0.1 / disabled / re-enabled in 2.2.0). For
files > 64 KiB on Linux we try `copy_splice` first (file_fd ->
fresh pipe -> sock_fd with SPLICE_F_MOVE | SPLICE_F_MORE).
Falls back to plain `copy` (sendfile) if the runtime kernel
returns :unsupported, if `splice_supported?` is false (non-
Linux builds), or if any SystemCallError surfaces from the
primitive.
2.2.0 — splice path re-enabled with fresh per-request pipe pair
2.0.1 disabled the splice route from copy_to_socket because the cached per-thread pipe pair leaked residual bytes between requests: if `splice(file -> pipe)` succeeded but `splice(pipe -> socket)` failed mid-transfer (peer closed), the unread bytes stayed in the pipe and went out on the NEXT connection's socket. 2.2.0 fixes this at the lifecycle layer rather than abandoning the path: `copy_splice` now opens a fresh `pipe2(O_CLOEXEC | O_NONBLOCK)` pair on every call and closes both fds on every exit path. That costs three extra syscalls per call (one pipe2, two closes) vs the cached layout, but correctness is unconditional: a pipe never carries bytes for more than one transfer.
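The leak described above can be reproduced in miniature with an ordinary Ruby pipe. This is an illustration only: `IO.pipe` stands in for the splice pipe pair, and `read_nonblock` stands in for the pipe-to-socket leg; none of the variable names come from the extension.

```ruby
# Illustration only: a plain Ruby pipe plays the role of the cached pipe pair.
cached_r, cached_w = IO.pipe

# Request 1: 8 bytes enter the pipe, but the "socket" leg stops after 3
# (the peer closed), so 5 bytes stay queued inside the pipe.
cached_w.write("REQUEST1")
sent_to_client_1 = cached_r.read_nonblock(3)

# Request 2 reuses the same pipe: the leftovers from request 1 go out first.
cached_w.write("req2")
sent_to_client_2 = cached_r.read_nonblock(64)

# A fresh pipe per transfer cannot carry anything from an earlier request.
fresh_r, fresh_w = IO.pipe
fresh_w.write("req2")
clean = fresh_r.read_nonblock(64)

[cached_r, cached_w, fresh_r, fresh_w].each(&:close)
```

With the cached pipe, the second client receives the tail of the first response ahead of its own bytes; with the fresh pipe it receives only its own.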
2.2.x fix-A — pipe-hoist out of the chunk loop
The 2026-04-30 bench sweep showed 2.2.0's per-call pipe2 cost a -23% rps regression on the static 1 MiB row (1,697 → 1,312 r/s) because `native_copy_loop` invokes the splice primitive ONCE PER CHUNK in a `while remaining.positive?` loop. For a 1 MiB asset at 64 KiB chunks that's 16 calls × 3 syscalls of pipe overhead = 48 wasted syscalls per request. Fix-A pushes the pipe lifecycle up one level: `native_copy_loop` now opens a single pipe2(O_CLOEXEC | O_NONBLOCK) pair per RESPONSE, hands it to the new `copy_splice_into_pipe` primitive for every chunk, and closes both fds in an ensure block when the response loop unwinds (success, EAGAIN-retry-loop exit, raised exception). The correctness window is the same as in 2.2.0 (a pipe pair never outlives one response, so EPIPE mid-transfer cannot leak residual bytes onto the next request's socket) at 1/16th the pipe-overhead syscall cost.
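A minimal sketch of the hoisted lifecycle, assuming a hypothetical `each_chunk_with_pipe` helper: the pipe is opened once, the block plays the role of the per-chunk primitive, and `ensure` closes both ends however the loop unwinds. `IO.pipe` is used here in place of the extension's pipe2 call.

```ruby
CHUNK = 64 * 1024 # chunk size used in the 2.2.x arithmetic above

# Hypothetical response loop: one pipe pair per response, reused per chunk.
def each_chunk_with_pipe(len, chunk: CHUNK)
  pipe_r, pipe_w = IO.pipe
  offset = 0
  remaining = len
  while remaining.positive?
    n = [remaining, chunk].min
    yield offset, n, pipe_r, pipe_w # stand-in for the per-chunk splice call
    offset += n
    remaining -= n
  end
  offset
ensure
  # Runs on success, early exit, and raised exceptions alike.
  pipe_r&.close
  pipe_w&.close
end

chunks = 0
each_chunk_with_pipe(1024 * 1024) { chunks += 1 }

per_call_overhead = chunks * 3 # pipe2 + 2 closes on every chunk (2.2.0 layout)
hoisted_overhead  = 3          # pipe2 + 2 closes once per response (fix-A layout)
```

For a 1 MiB body this yields 16 chunks, so the pipe overhead drops from 48 syscalls to 3, which matches the figures quoted above.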
Constant Summary
- USERSPACE_CHUNK =
Maximum bytes per IO.copy_stream call on the userspace fallback, and per-call cap on the native sendfile / splice loops. 2.6-A bumped this from 64 KiB to 256 KiB.
64 KiB was the original “kernel TCP send buffer's typical sweet spot” value: small enough to bound a single syscall's GVL hold-time, large enough to amortize the syscall cost. 2.6-A measurements on openclaw-vm (Linux 6.x, 1 MiB warm-cache static asset) showed the kernel happily accepts 256 KiB per sendfile(2) / splice(2) call: the kernel TCP send buffer auto-tunes upward under sustained load, and modern NICs+TSO segment 256 KiB-1 MiB chunks at line rate. At 256 KiB we issue 4× fewer syscalls per 1 MiB response (4 calls vs 16) while keeping the GVL hold-time well under 1 ms even on a slow client.
Reference: nginx (`sendfile_max_chunk` default 0 = unlimited, but most distros ship with `2m` overrides), Apache (`SendBufferSize` 128k–256k), Caddy (256 KiB hard-coded). Hyperion sits in the middle of that field.
256 * 1024
- SMALL_FILE_THRESHOLD =
2.0.1 Phase 8a small-file threshold. Files <= this size take the synchronous read+write path with no fiber-yield. Mirrors the C constant `HYP_SMALL_FILE_THRESHOLD`; kept in sync via the `small_file_threshold` introspection method on hosts where the native ext is loaded.
64 * 1024
- SPLICE_THRESHOLD =
2.2.0: splice fires for files strictly larger than this many bytes. Below the threshold the small-file synchronous path (`copy_small`) wins outright; between the small-file ceiling and this constant plain sendfile(2) is fast enough that the extra pipe2 + 2× close round-trip isn't worth it. Set equal to SMALL_FILE_THRESHOLD so anything above the small-file path gets the splice attempt; with the two constants equal, the intermediate sendfile-only band is currently empty.
SMALL_FILE_THRESHOLD
- FADVISE_THRESHOLD =
2.7-F: `posix_fadvise(fd, 0, len, POSIX_FADV_SEQUENTIAL)` fires ONCE per response when the streaming loop is engaged AND the response body is at least this large. Files smaller than the threshold hit the kernel in a single sendfile / splice round; the readahead hint is dead weight for them. At and above FADVISE_THRESHOLD the kernel will issue multiple chunks (the 2.6-A USERSPACE_CHUNK is 256 KiB), and pre-warming the page cache before the chunk loop starts avoids the second/third chunk waiting on disk I/O on cold-cache requests.
2.6-B regressed warm-cache by -6.6% because the same hint was called PER CHUNK in the C primitive (4× per 1 MiB response). 2.7-F hoists the call to the Ruby loop entry (once per response, regardless of how many chunks the kernel uses), making the warm-cache impact at most 1 extra syscall per response (≤1%). See CHANGELOG entry 2.7-F + ext/… /sendfile.c (rb_sendfile_fadvise_sequential).
256 * 1024
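How the four constants interact for a given body size can be sketched as a small planner. The constant values are the ones listed above; `plan_for` and its keys are hypothetical names, and the sketch assumes a native Linux host with the splice path enabled.

```ruby
KIB = 1024
SMALL_FILE_THRESHOLD = 64 * KIB
SPLICE_THRESHOLD     = SMALL_FILE_THRESHOLD
FADVISE_THRESHOLD    = 256 * KIB
USERSPACE_CHUNK      = 256 * KIB

# Hypothetical planner: which documented rules apply to a body of `len` bytes.
def plan_for(len)
  {
    small_file: len <= SMALL_FILE_THRESHOLD,       # copy_small territory
    splice_candidate: len > SPLICE_THRESHOLD,      # strictly larger than the threshold
    fadvise: len >= FADVISE_THRESHOLD,             # one readahead hint per response
    chunks: (len + USERSPACE_CHUNK - 1) / USERSPACE_CHUNK # ceiling division
  }
end

small = plan_for(8 * KIB)
large = plan_for(1024 * KIB)
```

An 8 KiB body takes the small-file path in one round, while a 1 MiB body is a splice candidate, gets the fadvise hint, and is sent in 4 chunks.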
Class Method Summary
-
.copy(out_io, in_io, rb_offset, rb_len) ⇒ Object
Sendfile.copy(out_io, in_io, offset, len).
-
.copy_small(out_io, in_io, rb_offset, rb_len) ⇒ Object
Sendfile.copy_small(out_io, in_io, offset, len) -> Integer.
-
.copy_splice(out_io, in_io, rb_offset, rb_len) ⇒ Object
Sendfile.copy_splice(out_io, in_io, offset, len) -> [bytes_written, status] Linux-only.
-
.copy_splice_into_pipe(out_io, in_io, rb_offset, rb_len, rb_pipe_r, rb_pipe_w) ⇒ Object
Sendfile.copy_splice_into_pipe(out_io, in_io, offset, len, pipe_r, pipe_w) -> [bytes_written, status].
-
.copy_to_socket(out_io, file_io, offset, len) ⇒ Object
High-level helper: copy `len` bytes from `file_io` (regular file) starting at `offset` into `out_io` (TCP socket or other writable IO).
-
.copy_to_socket_blocking(out_io, file_io, offset, len) ⇒ Object
2.6-C — Puma-style serial-per-thread sendfile loop.
- .fadvise_sequential(file_io, rb_len) ⇒ Object
-
.fast_path_kind(out_io) ⇒ Object
Returns which copy path the Ruby-side helper can take for `out_io` (`:native`, `:userspace`, or `:tls_userspace`).
-
.mark_splice_unsupported! ⇒ Object
Called by native_copy_loop when copy_splice reports :unsupported at runtime (very old kernel without splice(2), sandboxed environment that blocks pipe2, etc.).
-
.platform_tag ⇒ Object
Sendfile.platform_tag — returns a small Symbol describing which kernel path got compiled in.
-
.small_file_threshold ⇒ Object
Sendfile.small_file_threshold — exposes the C constant to Ruby.
-
.splice_runtime_supported? ⇒ Boolean
2.2.0 — runtime probe for the splice path.
-
.splice_supported? ⇒ Boolean
Sendfile.splice_supported? — true on Linux builds where the splice branch was compiled in.
-
.supported? ⇒ Boolean
Sendfile.supported? — module-introspection helper.
Class Method Details
.copy(out_io, in_io, rb_offset, rb_len) ⇒ Object
Sendfile.copy(out_io, in_io, offset, len)
# File 'ext/hyperion_http/sendfile.c', line 573
static VALUE rb_sendfile_copy(VALUE self, VALUE out_io, VALUE in_io,
                              VALUE rb_offset, VALUE rb_len) {
    (void)self;
#if defined(HYP_SF_LINUX) || defined(HYP_SF_BSD)
    long offset_l = NUM2LONG(rb_offset);
    long len_l = NUM2LONG(rb_len);
    if (offset_l < 0) {
        rb_raise(rb_eArgError, "offset must be >= 0 (got %ld)", offset_l);
    }
    if (len_l < 0) {
        rb_raise(rb_eArgError, "len must be >= 0 (got %ld)", len_l);
    }
    if (len_l == 0) {
        return rb_ary_new3(2, INT2FIX(0), sym_done);
    }
    sendfile_args_t args;
    args.out_fd = extract_fd(out_io, "out_io");
    args.in_fd = extract_fd(in_io, "in_io");
    args.offset = (off_t)offset_l;
    args.len = (size_t)len_l;
    args.rc = -1;
    args.err = 0;
    rb_thread_call_without_gvl(sendfile_blocking_call, &args, RUBY_UBF_IO, NULL);
    if (args.rc < 0) {
        if (args.err == EAGAIN || args.err == EWOULDBLOCK || args.err == EINTR) {
            /* Kernel didn't accept any bytes; caller yields and retries. */
            return rb_ary_new3(2, INT2FIX(0), sym_eagain);
        }
        if (args.err == ENOSYS || args.err == EINVAL || args.err == ENOTSUP
#  ifdef EOPNOTSUPP
            || args.err == EOPNOTSUPP
#  endif
        ) {
            /* Kernel says "this combination of fds doesn't support sendfile"
             * (e.g. socket on a tunfs that doesn't expose page cache, or
             * Darwin trying to sendfile to a non-stream socket). Caller
             * falls back to IO.copy_stream. */
            return rb_ary_new3(2, INT2FIX(0), sym_unsupported);
        }
#  ifdef HYP_SF_BSD
        /* On Darwin/BSD a partial transfer can also report errno; if any
         * bytes flew, surface them with :partial so the caller can advance
         * its cursor before re-erroring on the next iteration. */
        if (args.rc > 0) {
            return rb_ary_new3(2, LONG2NUM((long)args.rc),
                               sym_partial);
        }
#  endif
        errno = args.err;
        rb_sys_fail("sendfile");
    }
    if (args.rc == 0) {
        /* Kernel accepted nothing AND didn't error. Treat as :eagain so
         * the caller yields rather than spinning. (Some kernels behave
         * this way under tight non-blocking pressure.) */
        return rb_ary_new3(2, INT2FIX(0), sym_eagain);
    }
    if ((size_t)args.rc < args.len) {
        return rb_ary_new3(2, LONG2NUM((long)args.rc), sym_partial);
    }
    return rb_ary_new3(2, LONG2NUM((long)args.rc), sym_done);
#else /* !Linux && !BSD */
    (void)out_io; (void)in_io; (void)rb_offset; (void)rb_len;
    rb_raise(rb_eNotImpError,
             "Hyperion::Http::Sendfile.copy: native zero-copy unsupported on "
             "this platform; fall back to IO.copy_stream");
    return Qnil; /* unreachable */
#endif
}
.copy_small(out_io, in_io, rb_offset, rb_len) ⇒ Object
Sendfile.copy_small(out_io, in_io, offset, len) -> Integer
# File 'ext/hyperion_http/sendfile.c', line 363
static VALUE rb_sendfile_copy_small(VALUE self, VALUE out_io, VALUE in_io,
                                    VALUE rb_offset, VALUE rb_len) {
    (void)self;
    long offset_l = NUM2LONG(rb_offset);
    long len_l = NUM2LONG(rb_len);
    if (offset_l < 0) {
        rb_raise(rb_eArgError, "offset must be >= 0 (got %ld)", offset_l);
    }
    if (len_l < 0) {
        rb_raise(rb_eArgError, "len must be >= 0 (got %ld)", len_l);
    }
    if (len_l == 0) {
        return INT2FIX(0);
    }
    if (len_l > HYP_SMALL_FILE_THRESHOLD) {
        rb_raise(rb_eArgError,
                 "Hyperion::Http::Sendfile.copy_small: len %ld exceeds "
                 "SMALL_FILE_THRESHOLD %d; use copy() for streaming",
                 len_l, HYP_SMALL_FILE_THRESHOLD);
    }
    small_copy_args_t args;
    args.out_fd = extract_fd(out_io, "out_io");
    args.in_fd = extract_fd(in_io, "in_io");
    args.offset = (off_t)offset_l;
    args.len = (size_t)len_l;
    /* Heap-allocate a buffer of exactly the requested size. Bounded by
     * 64 KiB, so this is a one-shot small alloc. We could pull from a
     * per-thread arena to avoid malloc, but the bench shape (one alloc
     * per request, freed before the next) is well within glibc's
     * thread-local cache hot path. */
    args.buf = (char *)malloc(args.len);
    if (args.buf == NULL) {
        rb_raise(rb_eNoMemError, "Hyperion::Http::Sendfile.copy_small: "
                                 "failed to allocate %lu bytes",
                 (unsigned long)args.len);
    }
    rb_thread_call_without_gvl(small_copy_blocking, &args, RUBY_UBF_IO, NULL);
    free(args.buf);
    if (args.err != 0 && args.total == 0) {
        errno = args.err;
        rb_sys_fail("Hyperion::Http::Sendfile.copy_small");
    }
    /* Partial transfer (e.g. EAGAIN budget exhausted). Surface what we
     * got; the caller can re-issue from cursor + total. The 8 KB row
     * doesn't hit this in practice but we're defensive about it. */
    return LONG2NUM((long)args.total);
}
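For readers more at home in Ruby than C, the shape of the small-file path (argument checks, one positional read, one write) can be approximated in userspace. This is a rough analogue under stated assumptions, not the extension's code: `copy_small_sketch` and `SMALL_LIMIT` are hypothetical names, and the sketch does not release the GVL or poll on EAGAIN the way the C primitive does.

```ruby
require "stringio"
require "tempfile"

SMALL_LIMIT = 64 * 1024 # mirrors the documented 64 KiB ceiling

# Userspace analogue of the small-file path: one pread, one write.
def copy_small_sketch(out_io, file_io, offset, len)
  raise ArgumentError, "offset must be >= 0 (got #{offset})" if offset.negative?
  raise ArgumentError, "len must be >= 0 (got #{len})" if len.negative?
  raise ArgumentError, "len #{len} exceeds #{SMALL_LIMIT}" if len > SMALL_LIMIT
  return 0 if len.zero?

  data = file_io.pread(len, offset) # positional read; file cursor is untouched
  out_io.write(data)                # returns the number of bytes written
end

out = StringIO.new
copied = Tempfile.create("asset") do |f|
  f.write("hello world")
  f.flush
  copy_small_sketch(out, f, 6, 5)
end
```

Here `copied` is the byte count and `out.string` holds the five bytes starting at offset 6.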
.copy_splice(out_io, in_io, rb_offset, rb_len) ⇒ Object
Sendfile.copy_splice(out_io, in_io, offset, len) -> [bytes_written, status] Linux-only. 2.2.0 layout: opens a fresh pipe pair via pipe2(O_CLOEXEC | O_NONBLOCK) on every call and closes it on every exit path. No persistent state, no cross-request byte leak. Returns :unsupported on non-Linux hosts so the Ruby caller can fall back to copy().
# File 'ext/hyperion_http/sendfile.c', line 657
static VALUE rb_sendfile_copy_splice(VALUE self, VALUE out_io, VALUE in_io,
                                     VALUE rb_offset, VALUE rb_len) {
    (void)self;
#ifdef HYP_SF_LINUX
    long offset_l = NUM2LONG(rb_offset);
    long len_l = NUM2LONG(rb_len);
    if (offset_l < 0) {
        rb_raise(rb_eArgError, "offset must be >= 0 (got %ld)", offset_l);
    }
    if (len_l < 0) {
        rb_raise(rb_eArgError, "len must be >= 0 (got %ld)", len_l);
    }
    if (len_l == 0) {
        return rb_ary_new3(2, INT2FIX(0), sym_done);
    }
    /* Fresh pipe pair for THIS call only. Opened here, closed on
     * every exit path below. pipe2 is one syscall; the close pair
     * is two more. The 3-syscall overhead is amortized against the
     * splice copies (which stay zero-copy across file -> pipe ->
     * socket) for files >= 64 KiB; the Ruby caller gates on size. */
    int pipe_fds[2];
    int prc = hyp_open_pipe_pair(pipe_fds);
    if (prc != 0) {
        /* pipe2 / pipe failed. ENOSYS / EMFILE / ENFILE — all map
         * to "splice path can't run right now"; let the caller fall
         * back to plain sendfile. */
        return rb_ary_new3(2, INT2FIX(0), sym_unsupported);
    }
    splice_args_t args;
    args.in_fd = extract_fd(in_io, "in_io");
    args.out_fd = extract_fd(out_io, "out_io");
    args.pipe_r = pipe_fds[0];
    args.pipe_w = pipe_fds[1];
    args.offset = (off_t)offset_l;
    args.len = (size_t)len_l;
    args.rc = 0;
    args.err = 0;
    rb_thread_call_without_gvl(splice_blocking_call, &args, RUBY_UBF_IO, NULL);
    /* Close the pipe pair before we either return a value or
     * raise. This is the whole point of the 2.2.0 fix: the pipe
     * never outlives this call, so residual bytes from a partial
     * transfer cannot leak onto the next request's socket. */
    hyp_close_pipe_pair(pipe_fds);
    if (args.rc > 0) {
        if (args.err == EAGAIN || args.err == EWOULDBLOCK) {
            return rb_ary_new3(2, LONG2NUM((long)args.rc), sym_partial);
        }
        if (args.err != 0) {
            errno = args.err;
            rb_sys_fail("splice");
        }
        if ((size_t)args.rc < args.len) {
            return rb_ary_new3(2, LONG2NUM((long)args.rc), sym_partial);
        }
        return rb_ary_new3(2, LONG2NUM((long)args.rc), sym_done);
    }
    /* args.rc == 0. */
    if (args.err == EAGAIN || args.err == EWOULDBLOCK || args.err == EINTR) {
        return rb_ary_new3(2, INT2FIX(0), sym_eagain);
    }
    if (args.err == ENOSYS || args.err == EINVAL) {
        return rb_ary_new3(2, INT2FIX(0), sym_unsupported);
    }
    if (args.err != 0) {
        errno = args.err;
        rb_sys_fail("splice");
    }
    return rb_ary_new3(2, INT2FIX(0), sym_done);
#else
    (void)out_io; (void)in_io; (void)rb_offset; (void)rb_len;
    return rb_ary_new3(2, INT2FIX(0), sym_unsupported);
#endif
}
.copy_splice_into_pipe(out_io, in_io, rb_offset, rb_len, rb_pipe_r, rb_pipe_w) ⇒ Object
Sendfile.copy_splice_into_pipe(out_io, in_io, offset, len, pipe_r, pipe_w)
-> [bytes_written, status]
2.2.x fix-A — pipe-hoisted splice primitive.
Splices file_fd → pipe_w → sock_fd for ONE chunk of a response. The pipe pair is supplied by the caller and is reused across every chunk of a single response; this function does NOT open or close the pipe. The Ruby façade (`native_copy_loop`) is responsible for the pipe lifecycle (`open_splice_pipe!` at entry, `close` in an ensure block at exit). Same return shape as `copy_splice`: :done / :partial / :eagain / :unsupported.
Linux-only. Returns [0, :unsupported] on non-Linux hosts so the Ruby caller can fall back to plain sendfile. pipe_r / pipe_w may be Integer fds or IO objects (`IO.pipe` returns the latter); we extract via the same helper used for in_io/out_io.
# File 'ext/hyperion_http/sendfile.c', line 755
static VALUE rb_sendfile_copy_splice_into_pipe(VALUE self, VALUE out_io, VALUE in_io,
                                               VALUE rb_offset, VALUE rb_len,
                                               VALUE rb_pipe_r, VALUE rb_pipe_w) {
    (void)self;
#ifdef HYP_SF_LINUX
    long offset_l = NUM2LONG(rb_offset);
    long len_l = NUM2LONG(rb_len);
    if (offset_l < 0) {
        rb_raise(rb_eArgError, "offset must be >= 0 (got %ld)", offset_l);
    }
    if (len_l < 0) {
        rb_raise(rb_eArgError, "len must be >= 0 (got %ld)", len_l);
    }
    if (len_l == 0) {
        return rb_ary_new3(2, INT2FIX(0), sym_done);
    }
    splice_args_t args;
    args.in_fd = extract_fd(in_io, "in_io");
    args.out_fd = extract_fd(out_io, "out_io");
    args.pipe_r = extract_fd(rb_pipe_r, "pipe_r");
    args.pipe_w = extract_fd(rb_pipe_w, "pipe_w");
    args.offset = (off_t)offset_l;
    args.len = (size_t)len_l;
    args.rc = 0;
    args.err = 0;
    rb_thread_call_without_gvl(splice_blocking_call, &args, RUBY_UBF_IO, NULL);
    if (args.rc > 0) {
        if (args.err == EAGAIN || args.err == EWOULDBLOCK) {
            return rb_ary_new3(2, LONG2NUM((long)args.rc), sym_partial);
        }
        if (args.err != 0) {
            errno = args.err;
            rb_sys_fail("splice");
        }
        if ((size_t)args.rc < args.len) {
            return rb_ary_new3(2, LONG2NUM((long)args.rc), sym_partial);
        }
        return rb_ary_new3(2, LONG2NUM((long)args.rc), sym_done);
    }
    /* args.rc == 0. */
    if (args.err == EAGAIN || args.err == EWOULDBLOCK || args.err == EINTR) {
        return rb_ary_new3(2, INT2FIX(0), sym_eagain);
    }
    if (args.err == ENOSYS || args.err == EINVAL) {
        return rb_ary_new3(2, INT2FIX(0), sym_unsupported);
    }
    if (args.err != 0) {
        errno = args.err;
        rb_sys_fail("splice");
    }
    return rb_ary_new3(2, INT2FIX(0), sym_done);
#else
    (void)out_io; (void)in_io; (void)rb_offset; (void)rb_len;
    (void)rb_pipe_r; (void)rb_pipe_w;
    return rb_ary_new3(2, INT2FIX(0), sym_unsupported);
#endif
}
.copy_to_socket(out_io, file_io, offset, len) ⇒ Object
High-level helper: copy `len` bytes from `file_io` (regular file) starting at `offset` into `out_io` (TCP socket or other writable IO). Loops on partial writes; yields on EAGAIN.
Returns the total number of bytes written. Raises Errno::* on real socket errors (EPIPE, ECONNRESET, …), the same shape as a raw `socket.write` call. The caller's existing rescue handlers (slow-client cleanup, metrics, body#close) keep working unchanged.
# File 'lib/hyperion/http/sendfile.rb', line 229
def copy_to_socket(out_io, file_io, offset, len)
  return 0 if len.zero?

  kind = fast_path_kind(out_io)
  # Phase 8a: small-file synchronous fast path. Only fires on the
  # native branch (we need a real socket fd to issue write(2)
  # against) AND when the source side is also a real fd (pread(2)
  # against an Integer fd). The C ext is only loaded on native
  # builds. This MUST come BEFORE the :native streaming branch —
  # it's the whole point of Phase 8a: skip the fiber-yield
  # round-trip for the 8 KB row.
  if kind == :native && len <= SMALL_FILE_THRESHOLD &&
     respond_to?(:copy_small) && real_fd?(file_io)
    return copy_small(out_io, file_io, offset, len)
  end

  case kind
  when :native
    native_copy_loop(out_io, file_io, offset, len)
  when :userspace, :tls_userspace
    userspace_copy_loop(out_io, file_io, offset, len)
  end
end
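The body of `userspace_copy_loop` is not shown on this page. As a rough idea of what a chunked `IO.copy_stream` fallback might look like, here is a sketch under that assumption; `userspace_copy_sketch` is a hypothetical name and the early-exit behaviour on a short source is a choice made for the example.

```ruby
require "stringio"
require "tempfile"

# Hypothetical userspace loop: IO.copy_stream in bounded chunks.
def userspace_copy_sketch(out_io, file_io, offset, len, chunk: 256 * 1024)
  written = 0
  while written < len
    want = [len - written, chunk].min
    # copy_stream(src, dst, copy_length, src_offset) reads positionally from src.
    n = IO.copy_stream(file_io, out_io, want, offset + written)
    break if n.zero? # source ended early; report what was copied
    written += n
  end
  written
end

payload = (0...1000).map { |i| (65 + i % 26).chr }.join
out = StringIO.new
copied = Tempfile.create("body") do |f|
  f.write(payload)
  f.flush
  userspace_copy_sketch(out, f, 100, 500, chunk: 128)
end
```

Because the destination only needs to respond to `write`, the same loop works for a `StringIO`, which is the kind of userspace-only IO that `fast_path_kind` routes to `:userspace`.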
.copy_to_socket_blocking(out_io, file_io, offset, len) ⇒ Object
2.6-C: Puma-style serial-per-thread sendfile loop. Same zero-copy mechanics as `copy_to_socket` but with EAGAIN handled by `IO.select(nil, [out], nil, 5.0)` instead of `wait_writable` (fiber yield). Under the GVL the OS thread parks on the select; no per-chunk fiber-scheduler hop.
Engaged from `ResponseWriter#write_sendfile` when the per-response `dispatch_mode` is `:inline_blocking`, which is auto-detected for `body.respond_to?(:to_path)` static-file routes in `Adapter::Rack#call`, or set explicitly by the app via `env = :inline_blocking`.
Userspace + TLS-userspace branches reuse `userspace_copy_loop`: `IO.copy_stream` is already blocking on the calling thread, so no fiber-yield refactor is needed there. The small-file (<= 64 KiB) native path also stays on `copy_small`: that primitive already handles EAGAIN with a short select() under the GVL, so the small-file fast path is “blocking” in the relevant sense regardless of `:inline_blocking` opt-in.
# File 'lib/hyperion/http/sendfile.rb', line 273
def copy_to_socket_blocking(out_io, file_io, offset, len)
  return 0 if len.zero?

  # 2.6-D — defensive `Fiber.blocking` wrap so direct callers
  # (specs, future code paths) get the no-yield guarantee
  # even if they didn't already wrap us themselves. When
  # the calling fiber is already blocking (the fast path:
  # `ResponseWriter#write_sendfile` wraps the whole sendfile
  # path in `Fiber.blocking` for `:inline_blocking`) the
  # nested wrap is a no-op — `Fiber.blocking` short-circuits
  # if the current fiber's blocking flag is already set.
  if ::Fiber.current.blocking?
    copy_to_socket_blocking_inner(out_io, file_io, offset, len)
  else
    ::Fiber.blocking { copy_to_socket_blocking_inner(out_io, file_io, offset, len) }
  end
end
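The EAGAIN handling described for this method (park the thread in `IO.select` rather than yield a fiber) can be illustrated with a plain non-blocking write loop. This is a sketch, not `copy_to_socket_blocking_inner`: `write_all_blocking` is a hypothetical name, it uses `write_nonblock` instead of the native primitives, and raising on timeout is an assumption made for the example.

```ruby
# Hypothetical blocking writer: on EAGAIN, wait in IO.select for writability.
def write_all_blocking(out_io, data, timeout: 5.0)
  written = 0
  total = data.bytesize
  while written < total
    result = out_io.write_nonblock(data.byteslice(written, total - written),
                                   exception: false)
    if result == :wait_writable
      # Parks the calling thread until the fd is writable or the timeout expires.
      ready = IO.select(nil, [out_io], nil, timeout)
      raise IOError, "not writable within #{timeout}s" if ready.nil? # assumption
    else
      written += result
    end
  end
  written
end

reader_end, writer_end = IO.pipe
reader = Thread.new { reader_end.read } # drains the pipe so the writer can progress
payload = "x" * (1 << 20)
sent = write_all_blocking(writer_end, payload)
writer_end.close
received = reader.value
reader_end.close
```

A 1 MiB payload is far larger than a default pipe buffer, so the loop is expected to hit `:wait_writable` and resume once the reader thread drains some bytes.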
.fadvise_sequential(file_io, rb_len) ⇒ Object
# File 'ext/hyperion_http/sendfile.c', line 848
static VALUE rb_sendfile_fadvise_sequential(VALUE self, VALUE file_io, VALUE rb_len) {
    (void)self;
#ifdef HYP_SF_LINUX
    long len_l = NUM2LONG(rb_len);
    if (len_l <= 0) {
        /* Nothing to advise on; the Ruby caller normally gates on a
         * threshold but defend against zero/negative anyway. */
        return sym_noop;
    }
    int fd = extract_fd(file_io, "file_io");
    /* posix_fadvise on Linux returns 0 on success or a positive errno
     * on failure (does NOT set the global errno). Treat any non-zero
     * return as :error — the caller ignores the result either way,
     * the hint is informational. */
    int rc = posix_fadvise(fd, 0, (off_t)len_l, POSIX_FADV_SEQUENTIAL);
    if (rc != 0) {
        return sym_error;
    }
    return sym_ok;
#else
    (void)file_io; (void)rb_len;
    return sym_noop;
#endif
}
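Ruby's standard library exposes the same kernel hint through `IO#advise`, which makes the gating logic easy to try without the extension. The sketch below is an analogue, not the module's code: `advise_sequential_sketch` and `FADVISE_MIN` are hypothetical names, and folding the threshold check into the helper is a simplification (the page describes the Ruby caller doing that gating).

```ruby
require "tempfile"

FADVISE_MIN = 256 * 1024 # mirrors the documented FADVISE_THRESHOLD

# Pure-Ruby analogue using IO#advise; returns a status symbol like the C helper.
def advise_sequential_sketch(file_io, len)
  return :noop if len < FADVISE_MIN

  # IO#advise(advice, offset, len) is a no-op on platforms without posix_fadvise.
  file_io.advise(:sequential, 0, len)
  :ok
rescue SystemCallError, NotImplementedError
  :error # the hint is informational, so failures are reported, not raised
end

small_result, large_result = Tempfile.create("asset") do |f|
  [advise_sequential_sketch(f, 8 * 1024), advise_sequential_sketch(f, 1024 * 1024)]
end
```

As in the C helper, the caller is free to ignore the result: the hint changes readahead behaviour, not correctness.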
.fast_path_kind(out_io) ⇒ Object
Returns which copy path the Ruby-side helper can take for `out_io`, as one of `:native`, `:userspace`, or `:tls_userspace` (not a boolean). Two conditions must hold for `:native`:
1. The C ext was compiled with native zero-copy (Linux / BSD /
Darwin). On other hosts `Sendfile.supported?` returns false
(defined in C); we still have a userspace fallback that's
faster than the per-chunk fiber hop, so we report :userspace
from #fast_path_kind in that case.
2. `out_io` is NOT a TLS socket. SSL sockets would need kernel-TLS
support to sendfile, which is rarely enabled; they get
:tls_userspace instead.
An `out_io` without a kernel fd (StringIO and similar) also gets :userspace.
# File 'lib/hyperion/http/sendfile.rb', line 209
def fast_path_kind(out_io)
  return :tls_userspace if tls_socket?(out_io)
  # Native sendfile needs a kernel fd on BOTH ends. StringIO and
  # other userspace-only IOs (custom buffer adapters in specs,
  # `Rack::MockResponse`, …) don't expose one — drop straight to
  # the userspace `IO.copy_stream` loop, which handles those.
  return :userspace unless real_fd?(out_io)
  return :native if respond_to?(:supported?) && supported?

  :userspace
end
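The helpers `real_fd?` and `tls_socket?` are not shown on this page. The sketch below guesses at one plausible `real_fd?` check and replays the same decision order with explicit inputs; `real_fd_sketch?`, `kind_sketch`, and the keyword arguments are all hypothetical.

```ruby
require "stringio"
require "tempfile"

# Hypothetical stand-in for `real_fd?`: true when the IO exposes an Integer fd.
def real_fd_sketch?(io)
  io.respond_to?(:fileno) && io.fileno.is_a?(Integer)
rescue IOError, NotImplementedError
  false # closed or fd-less IOs
end

# Same decision order as fast_path_kind, with the probes passed in explicitly.
def kind_sketch(out_io, native_supported:, tls: false)
  return :tls_userspace if tls
  return :userspace unless real_fd_sketch?(out_io)

  native_supported ? :native : :userspace
end

kinds = Tempfile.create("sock") do |f|
  {
    string_io: kind_sketch(StringIO.new, native_supported: true),
    real_file: kind_sketch(f, native_supported: true),
    no_native: kind_sketch(f, native_supported: false),
    tls:       kind_sketch(f, native_supported: true, tls: true)
  }
end
```

`StringIO#fileno` returns nil, which is why a StringIO destination lands on `:userspace` even on a host with native support.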
.mark_splice_unsupported! ⇒ Object
Called by native_copy_loop when copy_splice reports :unsupported at runtime (very old kernel without splice(2), sandboxed environment that blocks pipe2, etc.). Flips the cached flag to false so we stop attempting splice on this process for the rest of its lifetime — falling all the way through to plain sendfile(2).
# File 'lib/hyperion/http/sendfile.rb', line 194
def mark_splice_unsupported!
  @splice_runtime_supported = false
end
.platform_tag ⇒ Object
Sendfile.platform_tag — returns a small Symbol describing which kernel path got compiled in. Used by specs and the bench reporter.
# File 'ext/hyperion_http/sendfile.c', line 910
static VALUE rb_sendfile_platform_tag(VALUE self) {
    (void)self;
#if defined(HYP_SF_LINUX)
    return ID2SYM(rb_intern("linux"));
#elif defined(HYP_SF_BSD)
#  if defined(__APPLE__)
    return ID2SYM(rb_intern("darwin"));
#  else
    return ID2SYM(rb_intern("bsd"));
#  endif
#else
    return ID2SYM(rb_intern("unsupported"));
#endif
}
.small_file_threshold ⇒ Object
Sendfile.small_file_threshold — exposes the C constant to Ruby.
# File 'ext/hyperion_http/sendfile.c', line 903
static VALUE rb_sendfile_small_threshold(VALUE self) {
    (void)self;
    return INT2NUM(HYP_SMALL_FILE_THRESHOLD);
}
.splice_runtime_supported? ⇒ Boolean
2.2.0: runtime probe for the splice path. `splice_supported?` in the C ext only reports compile-time availability (true on Linux builds, false elsewhere). At runtime an old kernel can still reject splice(2) with ENOSYS / EINVAL the first time we call it; once observed, we cache the answer for the lifetime of the process so subsequent requests don't pay the failed-syscall round-trip. The default value tracks the C ext flag so specs that assert `splice_supported? == true` on Linux still pass without an explicit probe; in the code below that default is additionally gated on the `HYPERION_HTTP_SPLICE=1` environment variable. `mark_splice_unsupported!` is called by `native_copy_loop` when copy_splice surfaces :unsupported, transitioning the cached flag to false for the rest of the process.
# File 'lib/hyperion/http/sendfile.rb', line 159
def splice_runtime_supported?
  # Memoize the boot-time C ext flag. We deliberately don't
  # run a live pipe2+splice probe here — the production path
  # is the runtime probe: copy_splice_into_pipe's :unsupported
  # return is cheap (one pipe2 + one close pair on the first
  # request) and authoritative.
  return @splice_runtime_supported if defined?(@splice_runtime_supported)

  # 2.2.x fix-A — pipe2 has been hoisted out of the chunk
  # loop (one pipe pair per response, reused across every
  # chunk via `copy_splice_into_pipe`). The syscall-count
  # math (64 → 19 syscalls per 1 MiB request) makes the
  # 2.2.0 env-var gate obsolete in principle, but we leave
  # the gate in place until the openclaw-vm bench
  # re-confirms splice ≥ plain sendfile baseline on Linux.
  # The fix-A landing session couldn't reach openclaw-vm
  # (SSH auth gap, see CHANGELOG); the maintainer is
  # expected to drop the gate in a follow-up commit once
  # the bench is re-run from a session with working SSH.
  # Operators wanting to A/B test on other kernels can
  # flip HYPERION_HTTP_SPLICE=1.
  enabled = ENV['HYPERION_HTTP_SPLICE'] == '1' &&
            respond_to?(:splice_supported?) && splice_supported?
  @splice_runtime_supported = enabled
end
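The memoization above uses `defined?(@ivar)` rather than `||=`, which matters because the cached answer is often `false`. A standalone sketch of that pattern follows; `SpliceFlagSketch` and its method names are hypothetical, and the `env:` / `compiled_in:` parameters exist only so the example can run without the extension.

```ruby
# Hypothetical stand-in for the module-level flag handling shown above.
module SpliceFlagSketch
  class << self
    def runtime_supported?(env: ENV, compiled_in: true)
      # defined? caches a false answer too; `||=` would recompute it every call.
      return @runtime_supported if defined?(@runtime_supported)

      @runtime_supported = env["HYPERION_HTTP_SPLICE"] == "1" && compiled_in
    end

    # Mirrors mark_splice_unsupported!: one runtime failure disables the path.
    def mark_unsupported!
      @runtime_supported = false
    end

    # Test helper only: forget the cached answer.
    def reset!
      remove_instance_variable(:@runtime_supported) if defined?(@runtime_supported)
    end
  end
end

first  = SpliceFlagSketch.runtime_supported?(env: { "HYPERION_HTTP_SPLICE" => "1" })
second = SpliceFlagSketch.runtime_supported?(env: {}) # cached, env is not re-read
SpliceFlagSketch.mark_unsupported!
third  = SpliceFlagSketch.runtime_supported?(env: { "HYPERION_HTTP_SPLICE" => "1" })
```

Once the flag has been flipped to false it stays false for the process, even if the environment variable is set, which matches the "rest of the process" wording in the description.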
.splice_supported? ⇒ Boolean
Sendfile.splice_supported? — true on Linux builds where the splice branch was compiled in. The runtime kernel may still reject splice (very old kernels return ENOSYS), in which case copy_splice surfaces :unsupported and the Ruby caller falls back to copy().
# File 'ext/hyperion_http/sendfile.c', line 893
static VALUE rb_sendfile_splice_supported_p(VALUE self) {
    (void)self;
#ifdef HYP_SF_LINUX
    return Qtrue;
#else
    return Qfalse;
#endif
}
.supported? ⇒ Boolean
Sendfile.supported? — module-introspection helper. Lets the Ruby caller pick its branch without needing a rescue NotImplementedError around the first call (which would burn an exception object on every static response on unsupported hosts).
# File 'ext/hyperion_http/sendfile.c', line 880
static VALUE rb_sendfile_supported_p(VALUE self) {
    (void)self;
#if defined(HYP_SF_LINUX) || defined(HYP_SF_BSD)
    return Qtrue;
#else
    return Qfalse;
#endif
}