Hyperion
High-performance Ruby HTTP server. Falcon-class fiber concurrency, Puma-class compatibility.
```shell
gem install hyperion-rb
bundle exec hyperion config.ru
```
Highlights
- HTTP/1.1 + HTTP/2 + TLS out of the box (HTTP/2 with per-stream fiber multiplexing, WINDOW_UPDATE-aware flow control, ALPN auto-negotiation).
- Pre-fork cluster with a per-OS worker model: `SO_REUSEPORT` on Linux, master-bind + worker-fd-share on macOS/BSD (Darwin's `SO_REUSEPORT` doesn't load-balance).
- Hybrid concurrency: fiber-per-connection for I/O, OS-thread pool for `app.call(env)` — synchronous Rack handlers (Rails, ActiveRecord, anything holding a global mutex) get true OS-thread concurrency.
- Vendored llhttp 9.3.0 C parser; pure-Ruby fallback for non-MRI runtimes.
- Default-ON structured access logs (one JSON or text line per request) with hot-path optimisations: per-thread cached timestamp, hand-rolled line builder, lock-free per-thread write buffer.
- 12-factor logger split: info/debug → stdout, warn/error/fatal → stderr.
- Ruby DSL config file (`config/hyperion.rb`) with lifecycle hooks (`before_fork`, `on_worker_boot`, `on_worker_shutdown`).
- Object pooling for the Rack `env` hash and `rack.input` IO — amortizes per-request allocations across the worker's lifetime.
- `Hyperion::FiberLocal` opt-in shim for older Rails idioms that store request-scoped data via `Thread.current.thread_variable_*`.
Benchmarks
All numbers are real wrk runs against published Hyperion configs. Hyperion ships with default-ON structured access logs; Puma comparisons use Puma defaults (no per-request log emission). Each section is stamped with the Hyperion version it was measured against — newer versions (1.3.0+ --async-io, 1.4.0+ TLS h1 inline, 1.4.1+ Metrics fiber-key fix) preserve or improve these numbers; we re-run the headline configs each release and have not seen regressions on these workloads.
Hello-world Rack app
bench/hello.ru, single worker, parity threads (-t 5 vs Puma -t 5:5), 4 wrk threads / 100 connections / 15s, macOS arm64 / Ruby 3.3.3, Hyperion 1.2.0:
| Server | r/s | p99 | tail vs Hyperion |
|---|---|---|---|
| Hyperion 1.2.0 (default, logs ON) | 22,496 | 502 µs | 1× |
| Falcon 0.55.3 `--count 1` | 22,199 | 5.36 ms | 11× worse |
| Puma 7.1.0 `-t 5:5` | 20,400 | 422.85 ms | 845× worse |
Hyperion: 1.10× Puma throughput, parity with Falcon on throughput, ~10× lower p99 than Falcon and ~845× lower than Puma — while emitting structured JSON access logs the others don't.
Production cluster config (-w 4)
Same bench app, -w 4 cluster, parity threads (-t 5 everywhere), 4 wrk threads / 200 connections / 15s, macOS arm64:
| Server | r/s | p99 | tail vs Hyperion |
|---|---|---|---|
| Falcon `--count 4` | 48,197 | 4.84 ms | 5.9× worse |
| Hyperion `-w 4 -t 5` | 40,137 | 825 µs | 1× |
| Puma `-w 4 -t 5:5` | 34,793 | 177.76 ms | 215× worse (1 timeout) |
Falcon edges Hyperion ~20% on raw rps at -w 4 on macOS hello-world. Hyperion still leads on tail latency by 5.9× over Falcon and 215× over Puma, and beats Puma on throughput by 1.15×. On Linux production-config and DB-backed workloads (below) Hyperion takes the rps lead too — the macOS hello-world advantage to Falcon disappears once the workload includes any actual work or the kernel is Linux.
Linux production-config (DB-backed Rack)
-w 4 -t 10 on Ubuntu 24.04 / Ruby 3.3.3. Rack app does one Postgres SELECT 1 + one Redis GET per request, real network round-trip. wrk -t4 -c50 -d10s × 3 runs (median):
| Server | r/s (median) | vs Puma default |
|---|---|---|
| Hyperion default (rc17, logs ON) | 5,786 | 1.012× |
| Hyperion `--no-log-requests` | 6,364 | 1.114× |
| Puma `-w 4 -t 10:10` (no per-req logs) | 5,715 | 1.000× |
Bench is wait-bound — ~3-4 ms median is the PG + Redis round-trip, dwarfing the per-request CPU work where Hyperion's optimisations live. With a synchronous pg driver, fibers don't help: every in-flight DB call still parks an OS thread, and both servers max out at workers × threads concurrent queries. To widen this gap requires either an async PG driver — see hyperion-async-pg (companion gem; pair with --async-io and a fiber-aware pool, see "Async I/O — fiber concurrency on PG-bound apps" below) — or a CPU-bound workload, where Hyperion's lead becomes visible (next section).
Async I/O — fiber concurrency on PG-bound apps
Ubuntu 24.04 / 16 vCPU / Ruby 3.3.3, Postgres 17 over WAN, wrk -t4 -c200 -d20s. Single worker (-w 1) unless noted. All configs returned 0 non-2xx and 0 timeouts. RSS sampled mid-run via ps -o rss.
Wait-bound workload (bench/pg_concurrent.ru: SELECT pg_sleep(0.05) + tiny JSON):
| Server | r/s | p99 | RSS | vs Puma -t 5 |
|---|---|---|---|---|
| Puma 8.0 `-t 5` pool=5 | 56.5 | 3.88 s | 87 MB | 1.0× |
| Puma 8.0 `-t 30` pool=30 | 402.1 | 880 ms | 99 MB | 7.1× |
| Puma 8.0 `-t 100` pool=100 | 1067.4 | 557 ms | 121 MB | 18.9× |
| Hyperion `--async-io -t 5` pool=32 | 400.4 | 878 ms | 123 MB | 7.1× |
| Hyperion `--async-io -t 5` pool=64 | 778.9 | 638 ms | 133 MB | 13.8× |
| Hyperion `--async-io -t 5` pool=128 | 1344.2 | 536 ms | 148 MB | 23.8× |
| Hyperion `--async-io -t 5` pool=200 | 2381.4 | 471 ms | 164 MB | 42.2× |
| Hyperion `--async-io -w 4 -t 5` pool=64 | 1937.5 | 4.84 s | 416 MB | 34.3× (cold-start p99 — see note) |
| Falcon 0.55.3 `--count 1` pool=128 | 1665.7 | 516 ms | 141 MB | 29.5× |
Mixed CPU+wait (bench/pg_mixed.ru: same query + 50-key JSON serialization, ~5 ms CPU):
| Server | r/s | p99 | RSS | vs Puma -t 30 |
|---|---|---|---|---|
| Puma 8.0 `-t 30` pool=30 | 351.7 | 963 ms | 127 MB | 1.0× |
| Hyperion `--async-io -t 5` pool=32 | 371.2 | 919 ms | 151 MB | 1.05× |
| Hyperion `--async-io -t 5` pool=64 | 741.5 | 681 ms | 161 MB | 2.1× |
| Hyperion `--async-io -t 5` pool=128 | 1739.9 | 512 ms | 201 MB | 4.9× |
| Falcon `--count 1` pool=128 | 1642.1 | 531 ms | 213 MB | 4.7× |
Takeaways:
- Linear scaling with pool size under `--async-io` — `r/s ≈ pool × 12` on this WAN bench. Single-worker pool=200 hits 2381 r/s, 42× Puma `-t 5` and 5.9× Puma's best (`-t 30`).
- Mixed workload doesn't kill the win — Hyperion `--async-io` pool=128 actually goes up on mixed (1740 vs 1344 r/s) because CPU work overlaps other fibers' PG-wait windows. This is the honest "what happens to a real Rails handler" answer.
- Hyperion ≈ Falcon within 3-7% across pool sizes; both fiber-native architectures extract similar value from `hyperion-async-pg`.
- RSS at single-worker scale isn't the architectural moat — Linux thread stacks are demand-paged; PG connection buffers dominate RSS at pool sizes ≤ 200. The architectural win is handler concurrency under load, not idle memory: Hyperion's fiber path runs thousands of in-flight handler invocations per OS thread, so wait-bound handlers don't queue at `max_threads`. See Concurrency at scale for both the throughput-under-load row and a measured 10k-idle-keepalive RSS sweep against Puma and Falcon.
- `-w 4` cold-start caveat — multi-worker p99 inflates because the bench rackup uses lazy per-process pool init (each worker pays full pool fill on its first request). Production apps avoid this with `on_worker_boot { Hyperion::AsyncPg::FiberPool.new(...).fill }`.
Three things must all be true to get this win:

1. `async_io: true` in your Hyperion config (or the `--async-io` CLI flag). Default is off to keep 1.2.0's raw-loop perf for fiber-unaware apps.
2. `hyperion-async-pg` installed: `gem 'hyperion-async-pg', require: 'hyperion/async_pg'` + `Hyperion::AsyncPg.install!` at boot.
3. A fiber-aware connection pool. The popular `connection_pool` gem is NOT — its Mutex blocks the OS thread. Use `Hyperion::AsyncPg::FiberPool` (ships with hyperion-async-pg 0.3.0+), `async-pool`, or `Async::Semaphore`.

Skip any of these and you get parity with Puma at the same `-t`. Run the bench yourself: `MODE=async DATABASE_URL=... PG_POOL_SIZE=200 bundle exec hyperion --async-io -t 5 bench/pg_concurrent.ru` (in the hyperion-async-pg repo).
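Putting the three requirements together, a minimal config sketch. This assumes the `hyperion-async-pg` API named above (`Hyperion::AsyncPg.install!`, `Hyperion::AsyncPg::FiberPool`); the pool size, constructor arguments, and the `$pg_pool` global are illustrative, not documented API:

```ruby
# Gemfile
gem 'hyperion-async-pg', require: 'hyperion/async_pg'

# config/hyperion.rb
async_io true                  # (1) fiber dispatch on plain HTTP/1.1

on_worker_boot do |_worker_index|
  Hyperion::AsyncPg.install!   # (2) fiber-cooperative pg driver

  # (3) fiber-aware pool; size and block are illustrative
  $pg_pool = Hyperion::AsyncPg::FiberPool.new(size: 64) do
    PG.connect(ENV['DATABASE_URL'])
  end
  $pg_pool.fill                # pre-warm — avoids the -w 4 cold-start p99 caveat
end
```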
TLS + async-pg note (1.4.0+). TLS / HTTPS already runs each connection on a fiber under `Async::Scheduler` (the TLS path always uses `start_async_loop` for the ALPN handshake). As of 1.4.0, the post-handshake `app.call` for HTTP/1.1-over-TLS dispatches inline on the calling fiber by default — so fiber-cooperative libraries (`hyperion-async-pg`, `async-redis`) work on the TLS h1 path without needing `--async-io`. The Async-loop cost is already paid for the handshake; running the handler under the existing scheduler just preserves that context instead of stripping it on a thread-pool hop. h2 streams are always fiber-dispatched and benefit from async-pg without the flag.

Operators who specifically want TLS + threadpool dispatch (e.g. CPU-heavy handlers competing for OS threads, where you'd rather not pay fiber yields and want true OS-thread parallelism on a synchronous handler) can pass `async_io: false` in the config to force the pool branch back on. The three-way `async_io` setting:

- `nil` (default): plain HTTP/1.1 → pool, TLS h1 → inline.
- `true`: plain HTTP/1.1 → inline, TLS h1 → inline (force fiber dispatch everywhere; needed for `hyperion-async-pg` on plain HTTP).
- `false`: plain HTTP/1.1 → pool, TLS h1 → pool (explicit opt-out for TLS+threadpool).
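The three-way setting reduces to a small decision table. A sketch (`dispatch_mode` is an illustrative helper, not Hyperion API):

```ruby
# Three-way async_io dispatch decision, per the list above.
def dispatch_mode(async_io:, tls:)
  case async_io
  when nil   then tls ? :inline : :pool  # default: TLS h1 inline, plain h1 pool hop
  when true  then :inline                # force fiber dispatch everywhere
  when false then :pool                  # explicit opt-out: pool hop everywhere
  end
end

dispatch_mode(async_io: nil, tls: false)   # => :pool
dispatch_mode(async_io: nil, tls: true)    # => :inline
dispatch_mode(async_io: true, tls: false)  # => :inline
dispatch_mode(async_io: false, tls: true)  # => :pool
```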
CPU-bound JSON workload
bench/work.ru — handler builds a 50-key fixture, JSON-encodes a fresh response per request (~8 KB body), processes a 6-cookie header chain. wrk -t4 -c200 -d15s, macOS arm64 / Ruby 3.3.3, 1.2.0:
| Server | r/s | p99 | tail vs Hyperion |
|---|---|---|---|
| Falcon `--count 4` | 46,166 | 20.17 ms | 24× worse |
| Hyperion `-w 4 -t 5` | 43,924 | 824 µs | 1× |
| Puma `-w 4 -t 5:5` | 36,383 | 166.30 ms (47 socket errors) | 200× worse |
1.21× Puma throughput, 200× lower p99. This is the gap that hides behind PG-round-trip noise on the DB bench. Hyperion's per-request CPU savings (lock-free per-thread metrics, frozen header keys in the Rack adapter, C-ext response head builder, cached iso8601 timestamps, cached HTTP Date header) land on the wire when the workload is CPU-bound. Falcon edges us 5% on raw r/s but with 24× worse tail — a different tradeoff curve. Reproduce: bundle exec bin/hyperion -w 4 -t 5 -p 9292 bench/work.ru.
Real Rails 8.1 app (single worker, parity threads -t 16)
Health endpoint that traverses the full middleware chain (rack-attack, locale redirect, structured tagger, geo-location, etc.). Plus a Grape API endpoint reading cached data, and a Rails controller doing a Redis GET + an ActiveRecord query.
| endpoint | server | r/s | p99 | wrk timeouts |
|---|---|---|---|---|
| `/up` (health) | Hyperion | 19.03 | 1.12 s | 0 |
| `/up` (health) | Puma `-t 16:16` | 16.64 | 1.95 s | 138 |
| Grape `/api/v1/cached_data` | Hyperion | 16.15 | 779 ms | 16 |
| Grape `/api/v1/cached_data` | Puma `-t 16:16` | 10.90 | (>2 s, censored) | 110 |
| Rails `/api/v1/health` | Hyperion | 15.95 | 992 ms | 16 |
| Rails `/api/v1/health` | Puma `-t 16:16` | 11.29 | (>2 s, censored) | 114 |
On Grape and Rails-controller workloads Puma hits wrk's 2 s timeout cap on ~⅔ of requests — its real p99 is censored above 2 s. Hyperion serves all of its requests under 1.2 s with 0 to 16 timeouts. 1.14–1.48× Puma throughput depending on endpoint.
Static-asset serving (sendfile zero-copy path, 1.2.0+)
bench/static.ru (Rack::Files over a 1 MiB asset), -w 1, wrk -t4 -c100 -d15s, macOS arm64 / Ruby 3.3.3:
| Server | r/s | p99 | transferred | tail vs winner |
|---|---|---|---|---|
| Hyperion (sendfile path) | 2,069 | 3.10 ms | 30.4 GB | 1× |
| Puma `-w 1 -t 5:5` | 2,109 | 566.16 ms | 31.0 GB | 183× worse |
| Falcon `--count 1` | 1,269 | 801.01 ms | 18.7 GB | 258× worse (28 timeouts) |
Throughput is bandwidth-bound on localhost (≈2 GB/s = the loopback memory ceiling), so the throughput column looks like parity. The actual win is in the tail latency column: Hyperion's IO.copy_stream → sendfile(2) path skips userspace entirely, while Puma allocates a String per response and Falcon serializes more aggressively. On real network paths sendfile widens the gap further (kernel-to-NIC zero-copy).
Reproduce:

```shell
ruby -e 'File.binwrite("/tmp/hyperion_bench_asset_1m.bin", "x" * (1024*1024))'
bundle exec bin/hyperion -p 9292 bench/static.ru
wrk --latency -t4 -c100 -d15s http://127.0.0.1:9292/hyperion_bench_asset_1m.bin
```
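The zero-copy path rests on a standard-library property: when the source is a `File` and the destination is a socket, MRI's `IO.copy_stream` can hand the transfer to `sendfile(2)` so the body never materialises as a Ruby `String`. A self-contained sketch (a `UNIXSocket` pair stands in for a client connection):

```ruby
require "socket"
require "tempfile"

# Write a small "asset", then ship it File -> socket via IO.copy_stream.
asset = Tempfile.new("asset")
asset.write("x" * (64 * 1024))
asset.flush
asset.rewind

client, server_side = UNIXSocket.pair
sender = Thread.new do
  IO.copy_stream(asset, server_side)  # File -> socket: eligible for sendfile(2)
  server_side.close                   # EOF for the reader
end

received = client.read                # drain until EOF
sender.join
received.bytesize  # => 65536
```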
Concurrency at scale (architectural advantages)
These workloads demonstrate structural differences between Hyperion's fiber-per-connection / fiber-per-stream model and Puma's thread-pool model. Numbers are illustrative; the architecture is what matters. Run on Ubuntu 24.04 / Ruby 3.3.3, single worker, h2load -c <conns> -n 100000 --rps 1000 --h1.
5,000 concurrent keep-alive connections (50,000 requests):
| Server | succeeded | r/s | wall | master RSS |
|---|---|---|---|---|
| Hyperion `-w 1 -t 10` | 50,000 / 50,000 | 3,460 | 14.45 s | 53.5 MB |
| Puma `-w 1 -t 10:10` | 50,000 / 50,000 | 1,762 | 28.37 s | 36.9 MB |
10,000 concurrent keep-alive connections (100,000 requests):
| Server | succeeded | failed | r/s | wall |
|---|---|---|---|---|
| Hyperion `-w 1 -t 10` | 93,090 | 6,910 | 3,446 | 27.01 s |
| Puma `-w 1 -t 10:10` | 77,340 | 22,660 | 706 | 109.59 s |
At 10k concurrent connections under load Hyperion serves ~5× the throughput of Puma and drops roughly a third as many requests (6.9% vs 22.7% failure rate). The per-connection bookkeeping cost is bounded by fiber size, not by `max_threads` — workers don't get pinned to long-lived sockets, so a slow handler doesn't starve other connections.
Memory at idle keep-alive scale — 10,000 idle HTTP/1.1 keep-alive connections:
Each client opens a TCP connection, sends one keep-alive GET, drains the response, then holds the socket open without sending a follow-up request. RSS is sampled once a second across a 30s idle hold. Same hello-world rackup, single worker, no TLS. Hyperion runs with async_io true (fiber-per-connection on the plain HTTP/1.1 path).
| Server | held | dropped | peak RSS | RSS after drain |
|---|---|---|---|---|
| Hyperion `-w 1 -t 5 --async-io` | 10,000 / 10,000 | 0 | 173 MB | 155 MB |
| Puma `-w 0 -t 100` | 10,000 / 10,000 | 0 | 101 MB | 104 MB |
| Falcon `--count 1` | 10,000 / 10,000 | 0 | 429 MB | 440 MB |
All three hold 10k idle conns without OOMing or dropping — the "MB-per-thread" intuition that thread-based servers can't reach this scale doesn't survive contact with Linux's demand-paged thread stacks plus Puma's reactor-based keep-alive handling. Per-conn RSS lands at ~14 KB (Hyperion fiber + parser state), ~7 KB (Puma reactor entry + tiny thread share), ~36 KB (Falcon Async::Task + protocol-http stack). Bounded, not unbounded — for all three.
The architectural difference shows up under load, not at idle: Puma can only run max_threads handler invocations concurrently, so wait-bound handlers (DB, HTTP, Redis) starve at higher request concurrency than max_threads. Hyperion's fiber-per-connection model + --async-io gives one OS thread thousands of in-flight handler executions, paired with hyperion-async-pg for non-blocking DB. The 10k-conn throughput row above (5× Puma) is the consequence — same idle RSS shape, very different behaviour once the handlers actually do work.
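The `max_threads` ceiling is easy to demonstrate in miniature. In this sketch OS threads stand in for handler slots (Hyperion's fibers behave the same way for wait-bound work): 20 wait-bound "handlers" pushed through 5 slots need ~4 serial rounds of 50 ms, while 20 concurrent slots overlap every wait into ~1 round:

```ruby
require "benchmark"

# Run `jobs` 50 ms sleeps through a worker pool of `slots` threads,
# returning the wall-clock time.
def run_jobs(jobs, slots)
  queue = Queue.new
  jobs.times { queue << proc { sleep 0.05 } }
  slots.times { queue << nil }  # one stop sentinel per worker
  Benchmark.realtime do
    Array.new(slots) do
      Thread.new do
        while (job = queue.pop)
          job.call
        end
      end
    end.each(&:join)
  end
end

pooled     = run_jobs(20, 5)   # ~0.20 s: waits queue behind 5 slots
concurrent = run_jobs(20, 20)  # ~0.05 s: all waits overlap
```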
HTTP/2 multiplexing — 1 connection × 100 concurrent streams (handler sleeps 50 ms):
| Config | wall time |
|---|---|
| Hyperion (per-stream fiber dispatch) | 1.04 s |
| Serial baseline (100 × 50 ms) | 5.00 s |
Hyperion fans 100 in-flight streams across separate fibers within a single TCP connection. A serial server would take 5 s; the fiber-multiplexed result (1.04 s, ~96 req/s on one socket) is bounded by single-handler sleep time plus framing overhead. Puma has no native HTTP/2 path — production deployments terminate h2 at nginx and forward h1 to the worker pool, which serializes again.
1.6.0 outbound write path — `Http2Handler` no longer serializes every framer write through one `Mutex#synchronize { socket.write(...) }`. HPACK encoding (microseconds, in-memory) still serializes on a fast encode mutex, but the actual `socket.write` is owned by a dedicated per-connection writer fiber draining a queue. On per-connection multi-stream workloads where the kernel send buffer or peer reads are slow, encode work for ready streams overlaps the writer's flush of earlier chunks, instead of stacking up behind it. See `bench/h2_streams.sh` (`h2load -c 1 -m 100 -n 5000`) for a recipe to compare 1.5.0 vs 1.6.0 on a workload of your choice.
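The dedicated-writer pattern above can be sketched in a few lines. Here a `Thread` and a `StringIO` stand in for the per-connection writer fiber and the socket; producers enqueue already-encoded frames and return immediately, and exactly one writer owns the output:

```ruby
require "stringio"

out = StringIO.new  # stands in for the connection socket
frames = Queue.new

writer = Thread.new do
  while (chunk = frames.pop)  # nil is the close sentinel
    out.write(chunk)          # only this thread ever touches the "socket"
  end
end

# Producers ("streams with encoded frames ready") never block on a write mutex:
3.times { |i| frames << "frame-#{i};" }
frames << nil
writer.join

out.string  # => "frame-0;frame-1;frame-2;"
```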
Reproduce
```shell
# hello-world
bundle exec ruby bench/compare.rb
HYPERION_WORKERS=4 PUMA_WORKERS=4 FALCON_COUNT=4 bundle exec ruby bench/compare.rb

# Idle keep-alive RSS sweep (1k / 5k / 10k conns, 30s hold per server)
./bench/keepalive_memory.sh

# Real Rails / Grape: see bench/db.ru for the schema
```
Quick start
```shell
bundle install
bundle exec rake compile                      # build the llhttp C ext
bundle exec hyperion config.ru                # single-process default
bundle exec hyperion -w 4 -t 10 config.ru     # 4-worker cluster, 10 threads each
bundle exec hyperion -w 0 config.ru           # 1 worker per CPU
bundle exec hyperion --tls-cert cert.pem --tls-key key.pem -p 9443 config.ru  # HTTPS

curl http://127.0.0.1:9292/   # => hello

# Chunked POST works:
curl -X POST -H "Transfer-Encoding: chunked" --data-binary @file http://127.0.0.1:9292/

# HTTP/2 (over TLS, ALPN-negotiated):
curl --http2 -k https://127.0.0.1:9443/
```
`bundle exec rake spec` (and the default task) auto-invoke `compile`, so a fresh checkout just needs `bundle install && bundle exec rake` to get a green run.
Migrating from Puma? See docs/MIGRATING_FROM_PUMA.md.
Configuration
Four layers, in precedence order: explicit CLI flag > environment variable > `config/hyperion.rb` > built-in default.
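The precedence chain is first-match-wins. A toy restatement (`resolve_setting` is illustrative, not Hyperion API):

```ruby
# First layer that provides a value wins: CLI > env > config file > default.
def resolve_setting(cli: nil, env: nil, file: nil, default:)
  cli || env || file || default
end

resolve_setting(env: "8080", file: "3000", default: "9292")               # => "8080"
resolve_setting(cli: "9000", env: "8080", file: "3000", default: "9292")  # => "9000"
resolve_setting(default: "9292")                                          # => "9292"
```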
CLI flags
| Flag | Default | Notes |
|---|---|---|
| `-b, --bind HOST` | `127.0.0.1` | |
| `-p, --port PORT` | `9292` | |
| `-w, --workers N` | `1` | `0` → `Etc.nprocessors` |
| `-t, --threads N` | `5` | OS-thread Rack handler pool per worker. `0` → run inline (no pool, debugging only). |
| `-C, --config PATH` | `config/hyperion.rb` if present | Ruby DSL file. |
| `--tls-cert PATH` | nil | PEM certificate. |
| `--tls-key PATH` | nil | PEM private key. |
| `--log-level LEVEL` | `info` | `debug` / `info` / `warn` / `error` / `fatal`. |
| `--log-format FORMAT` | `auto` | `text` / `json` / `auto`. Auto: JSON when `RAILS_ENV`/`RACK_ENV` is production/staging, colored text on TTY, JSON otherwise. |
| `--[no-]log-requests` | ON | Per-request access log. |
| `--fiber-local-shim` | off | Patches `Thread#thread_variable_*` to fiber storage for older Rails idioms. |
| `--[no-]yjit` | auto | Force YJIT on/off. Default: auto-on under `RAILS_ENV`/`RACK_ENV` = production/staging. |
| `--[no-]async-io` | off | Run plain HTTP/1.1 connections under `Async::Scheduler`. Required for hyperion-async-pg on plain HTTP. TLS h1 / HTTP/2 always run under the scheduler regardless. |
| `--max-body-bytes BYTES` | `16777216` (16 MiB) | Maximum request body size. |
| `--max-header-bytes BYTES` | `65536` (64 KiB) | Maximum total request-header size. |
| `--max-pending COUNT` | unbounded | Per-worker accept-queue cap before new connections are rejected with HTTP 503 + `Retry-After: 1`. |
| `--max-request-read-seconds SECONDS` | `60` | Total wallclock budget for reading request line + headers + body for ONE request. Slowloris defence. |
| `--admin-token TOKEN` | unset | Bearer token for `POST /-/quit` and `GET /-/metrics`. Production: prefer `--admin-token-file` — argv is visible via `ps`. |
| `--admin-token-file PATH` | unset | Read the admin token from a file. Refuses to load if the file is missing or world-readable (mode must mask `0o007`). |
| `--worker-max-rss-mb MB` | unset | Master gracefully recycles a worker once its RSS exceeds this many megabytes. nil = disabled. |
| `--idle-keepalive SECONDS` | `5` | Keep-alive idle timeout. Connection closes after this many seconds of inactivity. |
| `--graceful-timeout SECONDS` | `30` | Shutdown deadline before SIGKILL is delivered to a worker that hasn't drained. |
Environment variables
`HYPERION_LOG_LEVEL`, `HYPERION_LOG_FORMAT`, `HYPERION_LOG_REQUESTS` (`0|1|true|false|yes|no|on|off`), `HYPERION_ENV`, `HYPERION_WORKER_MODEL` (`share`|`reuseport`).
Config file
`config/hyperion.rb` — same shape as Puma's `puma.rb`. Auto-loaded if present.
```ruby
# config/hyperion.rb
bind '0.0.0.0'
port 9292
workers 4
thread_count 10

# tls_cert_path 'config/cert.pem'
# tls_key_path  'config/key.pem'

read_timeout 30
idle_keepalive 5
graceful_timeout 30
max_header_bytes 64 * 1024
max_body_bytes 16 * 1024 * 1024

log_level :info
log_format :auto
log_requests true
fiber_local_shim false

# Three-way (1.4.0+):
#   nil   — default, auto: inline-on-fiber for TLS h1, pool hop for plain HTTP/1.1
#   true  — force inline-on-fiber everywhere (required for hyperion-async-pg
#           on plain HTTP/1.1)
#   false — force pool hop everywhere (explicit opt-out for TLS+threadpool
#           with CPU-heavy handlers)
# ~5% throughput hit on hello-world when inline; in exchange one OS thread
# serves N concurrent in-flight DB queries on wait-bound workloads.
# TLS / HTTP/2 accept loops always run under Async::Scheduler regardless.
async_io nil

before_fork do
  ActiveRecord::Base.connection_handler.clear_all_connections! if defined?(ActiveRecord)
end

on_worker_boot do |worker_index|
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end

on_worker_shutdown do |worker_index|
  ActiveRecord::Base.connection_handler.clear_all_connections! if defined?(ActiveRecord)
end
```
Strict DSL: unknown methods raise NoMethodError at boot — typos surface immediately rather than getting silently ignored.
A documented sample lives at config/hyperion.example.rb.
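The strict-DSL behaviour falls out of defining methods only for known settings and deliberately omitting a `method_missing` fallback. A self-contained sketch (`StrictConfig` is illustrative, not Hyperion's internal class):

```ruby
# Only whitelisted settings get writer methods; anything else raises
# NoMethodError the moment the config file is evaluated.
class StrictConfig
  SETTINGS = %i[bind port workers thread_count].freeze
  attr_reader :options

  def initialize
    @options = {}
  end

  SETTINGS.each do |name|
    define_method(name) { |value| @options[name] = value }
  end
end

cfg = StrictConfig.new
cfg.instance_eval { port 9292 }   # valid setting: stored
typo_raised = begin
  cfg.instance_eval { thred_count 10 }  # typo of thread_count
  false
rescue NoMethodError
  true                            # surfaces at boot, not silently ignored
end
```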
Logging
Default behaviour (rc16+):
- `info`/`debug` → stdout, `warn`/`error`/`fatal` → stderr (12-factor).
- One structured access-log line per response, info level, on stdout. Disable with `--no-log-requests` or `HYPERION_LOG_REQUESTS=0`.
- Format auto-selects: production envs → JSON (line-delimited, parseable by every log aggregator); TTY → coloured text; piped output without env hint → JSON.
Sample access log lines
Text format (TTY default):

```
2026-04-26T18:40:04.112Z INFO [hyperion] message=request method=GET path=/api/v1/health status=200 duration_ms=46.63 remote_addr=127.0.0.1 http_version=HTTP/1.1
2026-04-26T18:40:04.123Z INFO [hyperion] message=request method=GET path=/api/v1/cached_data query="currency=USD" status=200 duration_ms=43.87 remote_addr=127.0.0.1 http_version=HTTP/1.1
```

JSON format (auto-selected on `RAILS_ENV=production`/staging or piped output):

```json
{"ts":"2026-04-26T18:38:49.405Z","level":"info","source":"hyperion","message":"request","method":"GET","path":"/api/v1/health","status":200,"duration_ms":46.63,"remote_addr":"127.0.0.1","http_version":"HTTP/1.1"}
{"ts":"2026-04-26T18:38:49.411Z","level":"info","source":"hyperion","message":"request","method":"GET","path":"/api/v1/cached_data","query":"currency=USD","status":200,"duration_ms":40.64,"remote_addr":"127.0.0.1","http_version":"HTTP/1.1"}
```
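Because the JSON format is line-delimited, a consumer can stream-parse stdout one line at a time with nothing but the stdlib. A quick check against the sample line above:

```ruby
require "json"

# One access-log line is one independent JSON object.
line = '{"ts":"2026-04-26T18:38:49.405Z","level":"info","source":"hyperion",' \
       '"message":"request","method":"GET","path":"/api/v1/health","status":200,' \
       '"duration_ms":46.63,"remote_addr":"127.0.0.1","http_version":"HTTP/1.1"}'

event = JSON.parse(line)
event["status"]       # => 200
event["duration_ms"]  # => 46.63
event["path"]         # => "/api/v1/health"
```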
Hot-path optimisations
The default-ON access log path is engineered to stay near-zero cost:
- Per-thread cached `iso8601(3)` timestamp — one allocation per millisecond per thread, reused across all requests in that millisecond.
- Hand-rolled single-interpolation line builder — bypasses generic `Hash#map.join`.
- Per-thread 4 KiB write buffer — flushes to stdout when full or on connection close. Cuts ~32× the syscalls under load.
- Lock-free emit — POSIX `write(2)` is atomic for writes ≤ PIPE_BUF (4096 B); a log line is ~200 B. No logger mutex.
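The per-thread timestamp cache can be sketched as follows. Hyperion's real builder is internal; the `:ts_cache` key and helper name here are illustrative. The string is re-rendered only when the millisecond changes, so every request logged in the same millisecond on the same thread reuses one frozen String:

```ruby
require "time"

# Return the current UTC time as iso8601(3), cached per thread per millisecond.
def cached_iso8601_ms
  cache = Thread.current[:ts_cache] ||= { ms: nil, str: nil }
  now_ms = Process.clock_gettime(Process::CLOCK_REALTIME, :millisecond)
  if cache[:ms] != now_ms                 # only re-render on a new millisecond
    cache[:ms] = now_ms
    cache[:str] = Time.at(now_ms / 1000r).utc.iso8601(3).freeze
  end
  cache[:str]
end

cached_iso8601_ms  # e.g. "2026-04-26T18:40:04.112Z"
```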
Metrics
Hyperion.stats returns a snapshot Hash with the following counters (lock-free per-thread aggregation):
| Counter | Meaning |
|---|---|
| `connections_accepted` | Lifetime accept count. |
| `connections_active` | Currently in-flight connections. |
| `requests_total` | Lifetime request count. |
| `requests_in_flight` | Currently in-flight requests. |
| `responses_<code>` | One counter per status code emitted (`responses_200`, `responses_400`, …). |
| `parse_errors` | HTTP parse failures → 400. |
| `app_errors` | Rack app raised → 500. |
| `read_timeouts` | Per-connection read deadline hit. |
| `requests_threadpool_dispatched` | HTTP/1.1 connection handed to the worker pool (or served inline in `start_raw_loop` when `thread_count: 0`). The default dispatch path. |
| `requests_async_dispatched` | HTTP/1.1 connection served inline on the accept-loop fiber under `--async-io`. Operators can use the ratio against `requests_threadpool_dispatched` to verify fiber-cooperative I/O is actually engaged. |
```ruby
require 'hyperion'
Hyperion.stats
# => {connections_accepted: 1234, connections_active: 7, requests_total: 8910, …}
```
Prometheus exporter
When admin_token is set in your config, Hyperion mounts a /-/metrics endpoint that emits Prometheus text-format v0.0.4. Same token guards both /-/metrics (GET) and /-/quit (POST); auth is via the X-Hyperion-Admin-Token header.
```
$ curl -s -H 'X-Hyperion-Admin-Token: secret' http://127.0.0.1:9292/-/metrics
# HELP hyperion_requests_total Total HTTP requests handled
# TYPE hyperion_requests_total counter
hyperion_requests_total 8910
# HELP hyperion_bytes_written_total Total bytes written to response sockets
# TYPE hyperion_bytes_written_total counter
hyperion_bytes_written_total 2351023
# HELP hyperion_responses_status_total Responses by HTTP status code
# TYPE hyperion_responses_status_total counter
hyperion_responses_status_total{status="200"} 8521
hyperion_responses_status_total{status="404"} 12
hyperion_responses_status_total{status="500"} 3
# … and so on for sendfile_responses_total, rejected_connections_total,
# slow_request_aborts_total, requests_async_dispatched_total, etc.
```
Any counter not in the known set (added by app middleware via Hyperion.metrics.increment(:custom_thing)) is auto-exported as hyperion_custom_thing with a generic HELP line — no Hyperion config change required.
Point your scraper at it: in Prometheus' `scrape_configs`, set `metrics_path: /-/metrics` and `bearer_token` (or use a custom header relabel — Prometheus 2.42+ supports `authorization.credentials_file` paired with a custom header block). Network-isolate the admin endpoints if the listener is internet-facing — see docs/REVERSE_PROXY.md for the nginx `location /-/ { return 404; }` recipe.
TLS + HTTP/2
Provide a PEM cert + key:

```shell
bundle exec hyperion --tls-cert config/cert.pem --tls-key config/key.pem -p 9443 config.ru
```
ALPN auto-negotiates h2 (HTTP/2) or http/1.1 per connection. HTTP/2 multiplexes streams onto fibers within a single connection — slow handlers don't head-of-line-block other streams. Cluster-mode TLS works (-w N + --tls-cert / --tls-key).
Smuggling defenses for HTTP/1.1: Content-Length + Transfer-Encoding together → 400; non-chunked Transfer-Encoding → 501; CRLF in response header values → ArgumentError (response-splitting guard).
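The first two smuggling rules reduce to a header check at parse time. A toy restatement (`reject_status` is an illustrative helper, not Hyperion's parser; header names are assumed already downcased):

```ruby
# nil means "no smuggling objection"; otherwise the HTTP status to reject with.
def reject_status(headers)
  te = headers["transfer-encoding"]
  return 400 if headers.key?("content-length") && te  # CL + TE together → 400
  return 501 if te && te.downcase != "chunked"        # TE other than chunked → 501
  nil
end

reject_status("content-length" => "5", "transfer-encoding" => "chunked")  # => 400
reject_status("transfer-encoding" => "gzip, chunked")                     # => 501
reject_status("content-length" => "5")                                    # => nil
```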
Compatibility
- Ruby 3.3+ required (the `protocol-http2 ~> 0.26` transitive dep imposes this floor; older Ruby installs error at `bundle install`).
- Rack 3 (auto-sets `SERVER_SOFTWARE`, `rack.version`, `REMOTE_ADDR`, IPv6-safe `Host` parsing, CRLF guard).
- `Hyperion::FiberLocal.install!` opt-in shim for older Rails apps that store request-scoped data via `Thread.current.thread_variable_*` (modern Rails 7.1+ already uses Fiber storage natively; the shim handles the residual footgun).
- `Hyperion::FiberLocal.verify_environment!` runtime check that `Thread.current[:k]` is fiber-local on the current Ruby (it is on 3.2+).
Credits
- Vendored llhttp (Node.js's HTTP parser, MIT) under `ext/hyperion_http/llhttp/`.
- HTTP/2 framing and HPACK via `protocol-http2`.
- Fiber scheduler via `async`.
License
MIT.