# hyperion-async-pg
Async-aware shim for the `pg` gem. Patches `PG::Connection` so `exec`, `exec_params`, `exec_prepared` and friends cooperate with the Async fiber scheduler — while one fiber is parked on a Postgres socket waiting for query results, other fibers in the same OS thread serve other requests. Companion to the Hyperion HTTP server. Pure Ruby, drop-in, no behavior change outside an Async scheduler.
## Install

```ruby
# Gemfile
gem 'hyperion-async-pg'
```

```ruby
# config/initializers/async_pg.rb (Rails) or wherever your app boots
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!
```
`install!` is idempotent and thread-safe. Call it once at boot, before any DB connections are opened. It returns `true` on the first call, `false` thereafter.
## Compatibility
The pg-level patch applies transparently to anything that calls into `PG::Connection`:
- Raw `pg` (verified): your own `PG::Connection.new(...).exec_params(...)` calls. This is the path the verified bench numbers below cover.
- Sequel (`postgres` adapter): the adapter calls `exec_params` / `exec_prepared` directly — the patch reaches it. Pair with a fiber-aware pool, e.g. `Sequel.connect(..., max_connections: N)` driven from a fiber-safe pool wrapper.
- ROM-sql + rom-pg: same as Sequel underneath.
- ActiveRecord 7.2 / 8.1 (verified — needs the `activerecord: true` install flag, see below).
No driver-side opt-in required for the `exec_*` patch. Patches are prepended onto `PG::Connection`, so every caller in the process picks them up. The catch is the connection pool the driver uses around those calls, which is the actual concurrency knob — see "Connection pool" below.
## ActiveRecord: opt in with `activerecord: true`

Verified Apr 2026 against AR 7.2.3.1 and AR 8.1.3, Ruby 3.3.3 + Hyperion `--async-io`.

AR's `ConnectionPool#lease_connection` keys the per-caller connection lease on `ActiveSupport::IsolatedExecutionState.context`, which by default returns `Thread.current` (`isolation_level = :thread`). Two fibers in the same OS thread therefore see the same lease, get handed the same `PG::Connection`, and the second `send_query_*` fires onto a busy connection — `PG::UnableToSend: PQsendQuery another command is already in progress`. The bench at `bench/active_record.ru` reproduced this at 100 % 5xx, 8.6 r/s under 200 wrk connections.

Fix: flip `ActiveSupport::IsolatedExecutionState.isolation_level` to `:fiber`. Each fiber then gets its own lease entry, AR hands out distinct connections per fiber, and the pool sizes naturally with `pool:` in `database.yml`. The shim ships this as a one-liner:

```ruby
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!(activerecord: true)
# equivalent to:
#   Hyperion::AsyncPg.install!
#   Hyperion::AsyncPg::ActiveRecordAdapter.install!
```

Verified 2026-04-27 on `bench/active_record.ru` (50 ms `pg_sleep` per request, `pool: 32`, Hyperion `-w 1 -t 5 --async-io`, wrk `-t 4 -c 200 -d 20s`): 413.24 r/s, p99 1.08 s, 0 non-2xx — a 48× rps lift over the 8.6 r/s baseline, matching the pool=32 wait-only ceiling.

Sizing reminder: each in-flight fiber holds an AR connection while waiting on the socket. Set `pool:` to peak fiber concurrency per worker, not thread count. Postgres `max_connections` must accommodate `pool * worker_count`.

Outside a fiber scheduler (Sidekiq, plain scripts, rake, Rails console) the switch is a no-op — `Fiber.current` is each thread's root fiber, behaviorally identical to keying on the thread itself. Drop-in safe across all environments.
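For orientation, a minimal sketch of what the `activerecord: true` path boils down to, per the description above — flipping AR's execution-state isolation to fibers. The module layout is illustrative, not the gem's actual source:

```ruby
# Hypothetical sketch only — the real Hyperion::AsyncPg::ActiveRecordAdapter
# may add guards, idempotence checks, and version handling.
require 'active_support/isolated_execution_state'

module Hyperion
  module AsyncPg
    module ActiveRecordAdapter
      def self.install!
        # Key AR's per-caller state (including connection leases) on the
        # current fiber instead of the current thread, so two fibers on the
        # same OS thread no longer share one PG::Connection.
        ActiveSupport::IsolatedExecutionState.isolation_level = :fiber
      end
    end
  end
end
```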
## Server support matrix
This shim only delivers fiber concurrency when the HTTP server runs each request inside an `Async::Scheduler`. Without a scheduler, `IO#wait_readable` blocks the OS thread normally — the patch is silent and harmless, but produces no concurrency win.
| Server | Path | Concurrency win? | Notes |
|---|---|---|---|
| Falcon | any | ✅ yes | Native fiber scheduler per request. Drop-in. |
| Hyperion >= 1.3.0 `--async-io` | plain HTTP/1.1 | ✅ yes | Opt-in flag re-enables the Async accept loop on plain HTTP/1.1 and bypasses the thread pool so handlers run inline on the accept-loop fiber. Recommended. |
| Hyperion `--tls-cert ...` (HTTPS h1) | TLS / h1 | ✅ yes | TLS path always runs `start_async_loop`; every dispatch is a fiber. Works on 1.0.0+ — no `--async-io` flag needed. Verified 2026-04-27: `-t 64 --tls-cert ...` pool=64 = 742.7 r/s, p99 489 ms on the 50 ms `pg_sleep` workload (wrk `-c 100 -d 20s`, all 14 884 reqs 2xx). Throughput is bounded by `-t N` because each h1 dispatch hops through the worker pool — raise `-t` to match peak fiber concurrency on TLS. |
| Hyperion `--tls-cert ...` (HTTPS h2) | h2 streams | ✅ yes | Each h2 stream is a fiber by design. Verified 2026-04-27: same `-t 64 --tls-cert ...` pool=64 = 706.4 r/s, 5000/5000 succeeded, 0 errors under `h2load -c 50 -m 20 -n 5000`. Same `-t N` bound as h1+TLS. The `-t 5` configs flood protocol-http2 flow control at >25× concurrency under stress (stream-dispatch errors); set `-t` ≥ peak h2 concurrency. |
| Hyperion < 1.3.0 (plain HTTP/1.1) | thread pool | ❌ no | 1.2.0's perf bypass (`start_raw_loop`) hands the whole socket to a worker thread with no scheduler — patch is silent. Upgrade to 1.3.0 + `--async-io`. |
| Puma | any | ❌ no | No fiber scheduler. Patch is silent, behaviour identical to plain `pg`. |
| Sidekiq / scripts / rake | any | ❌ no (and that's fine) | No scheduler → no patch effect. Drop-in safe. |
If you're on Hyperion 1.3.0+, set `async_io: true` in your config (or `--async-io` on the CLI) and pair with a fiber-aware connection pool — see the next section.
## Connection pool — use a fiber-aware one
The popular `connection_pool` gem (used by ActiveRecord, Sidekiq, etc.) is not fiber-aware: its internal `Mutex` + `ConditionVariable` don't yield to the Async scheduler. A fiber waiting for a connection blocks the entire OS thread, defeating this shim's purpose. Symptoms: throughput the same as plain pg even though `wait_readable` is firing; under heavy load Falcon may report "Closing scheduler with blocked operations!".
This gem ships `Hyperion::AsyncPg::FiberPool` so you don't have to roll your own:

```ruby
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!

# Per-worker — call from on_worker_boot in multi-worker setups
$pg_pool = Hyperion::AsyncPg::FiberPool.new(size: 64) do
  PG.connect(ENV['DATABASE_URL'])
end.fill

# In your handler
$pg_pool.with do |conn|
  conn.exec_params('SELECT ...', [...])
end
```
Internals: `FiberPool` wraps `Async::Semaphore` (fiber-aware) around a plain Array. `acquire` waits for a slot via the semaphore (cooperating fibers, the OS thread keeps serving others); pop/push on the Array is atomic per thread under the GVL.

It fills connections in parallel (8 threads by default), so a 64-connection pool over a 100 ms-RTT WAN takes ~1 s to fill at boot instead of ~6.4 s. Override with `parallel_fill_threads: N` (set to 1 for a fully serial fill).
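A rough sketch of that shape — a fiber-aware `Async::Semaphore` guarding a plain Array — assuming nothing beyond what the paragraph above states. Class and keyword names here are illustrative, not the gem's source:

```ruby
require 'async/semaphore'
require 'pg'

# Illustrative pool, not the gem's FiberPool: the semaphore limits concurrent
# checkouts and parks waiting fibers; the Array holds idle connections.
class SketchPool
  def initialize(size:, &factory)
    @semaphore = Async::Semaphore.new(size)        # parks fibers, not the OS thread
    @available = Array.new(size) { factory.call }  # eager fill for simplicity
  end

  # Check a connection out for the duration of the block.
  def with
    @semaphore.acquire do
      conn = @available.pop     # Array ops need no extra locking under the GVL
      begin
        yield conn
      ensure
        @available.push(conn)
      end
    end
  end
end

# pool = SketchPool.new(size: 64) { PG.connect(ENV['DATABASE_URL']) }
# pool.with { |conn| conn.exec_params('SELECT 1', []) }
```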
### Alternative: async-pool
If you need lazy-on-demand connection creation, idle eviction (max-age), or graceful close on shutdown — features `FiberPool` doesn't ship — use `async-pool` instead. It's the canonical pool implementation in the Async ecosystem; same fiber-cooperation properties, broader feature set.
```ruby
# Gemfile
gem 'async-pool'
```

```ruby
require 'async'
require 'async/pool/controller'
require 'async/pool/resource'
require 'pg'
require 'hyperion/async_pg'

Hyperion::AsyncPg.install!

class PgConnectionResource < Async::Pool::Resource
  def self.call
    new(PG.connect(ENV['DATABASE_URL']))
  end

  def initialize(conn)
    super()
    @conn = conn
  end

  attr_reader :conn

  def viable?   = !@conn.finished?
  def reusable? = viable?
  def close     = (@conn.close unless @conn.finished?)
end

$pg_pool = Async::Pool::Controller.new(PgConnectionResource, limit: 64)

# In your handler:
$pg_pool.acquire do |resource|
  resource.conn.exec_params('SELECT ...', [...])
end
```
Working rackup at `bench/async_pool_example.ru`. Note that async-pool creates connections lazily — the first burst of N requests pays the cumulative `PG.connect` cost. For warm-pool semantics like `FiberPool`, pre-create resources in `on_worker_boot` (call `pool.acquire { ... }` limit times before serving traffic).
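A hedged sketch of that pre-create step (the helper name is made up; it uses only `Async` tasks and the `acquire` API shown above — concurrent acquires force the controller to open up to N resources before traffic arrives):

```ruby
require 'async'

# Warm an async-pool controller: N concurrent acquires make it create
# up to N resources, each verified with a trivial query.
def prewarm_pool(pool, count)
  Async do
    count.times.map do
      Async { pool.acquire { |resource| resource.conn.exec('SELECT 1') } }
    end.each(&:wait)
  end.wait
end

# e.g. in on_worker_boot:
# prewarm_pool($pg_pool, 64)
```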
### Other options

- A per-fiber connection (no pool) — works, but holds a connection for the fiber's lifetime; size your Postgres `max_connections` accordingly. A sketch follows this list.
- ActiveRecord — install with `Hyperion::AsyncPg.install!(activerecord: true)` to make AR's pool fiber-aware (verified 413 r/s on the AR bench, see "ActiveRecord: opt in" above).
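A minimal sketch of the per-fiber-connection option (the helper name is hypothetical; `Thread.current[]` is Ruby's fiber-local storage, which is what makes this per fiber):

```ruby
require 'pg'

# One connection per fiber, created lazily. Thread.current[] is fiber-local,
# so each request fiber gets — and keeps — its own connection; size Postgres
# max_connections for peak concurrent fibers per process.
def fiber_conn
  Thread.current[:pg_conn] ||= PG.connect(ENV['DATABASE_URL'])
end

# In a handler:
# fiber_conn.exec_params('SELECT now()', [])
```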
## Multi-worker (`-w N`): use ForkSafe (0.4.0+)
Pre-fork servers (Hyperion `-w N`, Puma cluster mode, Falcon multi-worker) face a footgun: PG sockets opened in the master before fork are inherited by every worker via shared file descriptors. Concurrent reads/writes from multiple pids on the same kernel socket buffer corrupt the wire protocol — the symptom is `PG::UnableToSend: another command is already in progress` on every request, ~99.99 % 5xx under load. The classic workaround is "open the pool in `on_worker_boot`," which forces a separate Hyperion / Puma config file.

`Hyperion::AsyncPg::ForkSafe` auto-detects fork via `Process._fork` (Ruby 3.1+) and resets registered pools in each child. Recommended for production: pass `prefill_in_child: true` (0.5.0+) so each child synchronously re-opens its pool inside the fork hook — this eliminates the cold-start p99 spike on the first ~pool_size requests per worker.
```ruby
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!(fork_safe: true)

# Open the pool at boot — safe even in the master with -w N.
$pg_pool = Hyperion::AsyncPg::ForkSafe.register(
  Hyperion::AsyncPg::FiberPool.new(size: 64) do
    PG.connect(ENV['DATABASE_URL'])
  end,
  prefill_in_child: true # 0.5.0+: child returns from fork(2) with a warm pool
)
$pg_pool.fill # pre-fills in the master; each child resets and re-fills inside Process._fork
```
Drop `prefill_in_child: true` to get 0.4.0's lazy-refill behaviour (smaller per-fork cost, but the first few requests on each worker pay pool_size × PG.connect latency of cold-start work).
### Verified bench: prefill eliminates the cold-start p99 spike

Linux openclaw-vm, Postgres 17 over WAN (~50 ms RTT), `bench/pg_concurrent.ru` (50 ms `pg_sleep`), pool=64, Hyperion `--async-io -w 4 -t 5`, `wrk -t4 -c200 -d20s --timeout 8s`. Same 0.5.0 binary, only the kwarg toggled:
| Setup | r/s | p50 | p99 | non-2xx | wrk timeouts |
|---|---|---|---|---|---|
| `ForkSafe.register(pool)` (lazy refill) | 2089 | 51 ms | 6.02 s | 0 | 0 |
| `ForkSafe.register(pool, prefill_in_child: true)` | 2271 | 51 ms | 583 ms | 0 | 0 |
~10× p99 reduction with equivalent throughput. The lazy-refill p99 is dominated by the first ~64 requests per worker piling up behind cold-start `PG.connect`s. With prefill, the cost is paid during `fork(2)`; the worker returns to the accept loop with a warm pool. Trade-off: fork itself takes longer — boot-to-first-request goes from ~9 s to ~12 s on this WAN setup (each child opens 64 connections in parallel before returning from `Process._fork`). Production deploys absorb that once; the per-request distribution is what your operators see all day.
Or compose explicitly:

```ruby
require 'hyperion/async_pg'
require 'hyperion/async_pg/fork_safe'

Hyperion::AsyncPg.install!
Hyperion::AsyncPg::ForkSafe.install!
```

Kitchen-sink one-liner (everything on, AR + ForkSafe):

```ruby
Hyperion::AsyncPg.install!(activerecord: true, fork_safe: true)
```
Both `activerecord:` and `fork_safe:` default to `false` — existing 0.3.0 callers see no behaviour change. `prefill_in_child:` defaults to `false` on `ForkSafe.register`, so 0.4.0 callers see no behaviour change either.

ForkSafe cooperates with other libraries that hook `Process._fork` (e.g. `Async::ForkHandler`) by prepending its handler module onto `Process.singleton_class`. Method dispatch chains through every prepend in order — no captured-method recursion.
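The hook shape, as a hedged sketch — module, registry, and `reset!` names below are made up; only the prepend-onto-`Process.singleton_class` pattern and the `super` chaining are what the paragraph above describes:

```ruby
# Illustrative only — not the gem's ForkSafe source.
module SketchForkSafe
  REGISTRY = [] # pools to reset in each forked child

  def self.register(pool)
    REGISTRY << pool
    pool
  end

  module Hook
    # Process._fork (Ruby 3.1+) runs for every Process.fork / Kernel#fork.
    def _fork
      pid = super                 # chains through every other prepended hook
      if pid.zero?                # child process: inherited PG sockets are unusable
        SketchForkSafe::REGISTRY.each(&:reset!) # hypothetical pool API
      end
      pid
    end
  end
end

Process.singleton_class.prepend(SketchForkSafe::Hook)
```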
### When NOT to use ForkSafe

- Ruby < 3.1: `Process._fork` doesn't exist. `ForkSafe.install!` warns and no-ops; fall back to `on_worker_boot`.
- Single-worker servers: `ForkSafe.install!` is a no-op penalty (one extra method dispatch on `Process._fork`, which only fires on `Process.fork` calls). Harmless, but you don't need it.
- You already have a working `on_worker_boot` config: keep it — ForkSafe is for users who don't want a separate config file.
### When NOT to use `prefill_in_child: true`

- Tight dev-loop iteration on a slow PG link: each fork waits for `pool_size` parallel `PG.connect`s before the worker comes online. That adds 2-3 s per worker on a 100 ms-RTT WAN.
- Your PG instance can't accommodate `pool_size × worker_count` simultaneous opens during a deploy: the parallel-connect storm at fork time can transiently saturate `max_connections`. Stagger workers (Hyperion's `boot_workers`) or shrink `pool_size`.
- The pool's `#fill` has side effects you don't want firing N times at fork (e.g. running migrations on first connect). Only the FiberPool factory you control decides what `#fill` does; pass `prefill_in_child: false` if any side effect is per-pool, not per-worker.
## Caveats

- Only yields under a fiber scheduler. Outside `Async { ... }` (Sidekiq workers, plain scripts, rake tasks, Rails console) the patched methods behave identically to plain `pg` — `IO#wait_readable` falls back to its blocking implementation when `Fiber.scheduler` is `nil`. There is no perf regression in non-async contexts.
- Long-running statements still block the calling fiber. The shim parks a fiber on the socket; it does not preempt the running query. A 10 s `SELECT` still ties up that fiber for 10 s. Cap runaway queries with Postgres `statement_timeout` (or session-level `SET statement_timeout`), not at the Ruby layer — see the sketch after this list.
- Connection pool sizing. Under Hyperion + this shim, fibers vastly outnumber threads — each fiber can hold a checked-out DB connection while it waits on Postgres. A worker with 10 OS threads and 200 concurrent fibers can hold 200 in-flight connections. Size your `pool:` (ActiveRecord) or `:max_connections` (Sequel) and your Postgres `max_connections` accordingly. Rule of thumb: pool >= peak concurrent fibers per worker.
- Single-statement only. The shim drains all results and returns the last one, matching pg's default `exec_params` semantics. Multi-statement strings sent through `exec` produce the last result, as before.
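For the long-running-statement bullet, one way to apply that server-side cap — a sketch reusing the `FiberPool` factory from earlier; the 10 s value is an arbitrary example:

```ruby
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!

# Cap runaway queries on the server, not in Ruby: give every pooled
# connection a session-level statement_timeout as it is created.
$pg_pool = Hyperion::AsyncPg::FiberPool.new(size: 64) do
  PG.connect(ENV['DATABASE_URL']).tap do |conn|
    conn.exec("SET statement_timeout = '10s'")
  end
end.fill
```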
## Tuning

| Env var | Default | Meaning |
|---|---|---|
| `HYPERION_ASYNC_PG_READ_TIMEOUT` | unset (block forever) | Seconds passed to `IO#wait_readable` per poll. Unset matches pg's default — rely on Postgres `statement_timeout` for the upper bound. Set it when you want a hard ceiling on a single socket-wait independent of server-side timeouts; on timeout the shim raises `PG::ConnectionBad`. |
Read at every dispatch; no restart required.
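For example (the timeout value and rackup path are placeholders), a hard 30-second ceiling on any single socket wait:

```bash
# On timeout the shim raises PG::ConnectionBad for that query.
HYPERION_ASYNC_PG_READ_TIMEOUT=30 \
  bundle exec hyperion --async-io -t 5 -p 9292 config.ru
```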
## Troubleshooting

### macOS: `objc[NNN]: +[NSCharacterSet initialize] may have been in progress in another thread when fork() was called`

Falcon (and any pre-fork server) on macOS crashes if you `PG.connect` before the fork — pg's libpq pulls in Foundation/objc, which is not fork-safe under recent macOS. Symptom: child workers SIGABRT immediately on boot, often with the message above.

Workaround:

```bash
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
falcon serve --bind http://127.0.0.1:9292 --count 4 -c config.ru
```

Linux is unaffected. The cleaner fix is to defer pool creation to `on_worker_boot` (post-fork) — see the multi-worker notes — so no PG sockets exist in the parent at fork time.
## Verified bench

Ubuntu 24.04 / 16 vCPU / Ruby 3.3.3, Postgres 17 over a WAN link, `wrk -t4 -c200 -d20s`. All configs single-worker (`-w 1`) unless noted; all returned 0 non-2xx and 0 wrk timeouts. RSS sampled mid-run via `ps -o rss`.

Two workloads:

- wait (`bench/pg_concurrent.ru`): `SELECT pg_sleep(0.05)` + tiny JSON. Pure wait-bound — the "best case" for this gem.
- mix (`bench/pg_mixed.ru`): same query + a 50-key JSON serialization (~5 ms CPU). The honest case — what real Rails apps look like.
### Wait-bound workload

| Setup | r/s | p99 | RSS | vs Puma -t 5 |
|---|---|---|---|---|
| Puma 8.0 `-t 5` pool=5 | 56.5 | 3.88 s | 87 MB | 1.0× |
| Puma 8.0 `-t 30` pool=30 | 402.1 | 880 ms | 99 MB | 7.1× |
| Puma 8.0 `-t 100` pool=100 | 1067.4 | 557 ms | 121 MB | 18.9× |
| Hyperion `--async-io -t 5` pool=5 | 56.7 | 3.82 s | 108 MB | 1.0× |
| Hyperion `--async-io -t 5` pool=32 | 400.4 | 878 ms | 123 MB | 7.1× |
| Hyperion `--async-io -t 5` pool=64 | 778.9 | 638 ms | 133 MB | 13.8× |
| Hyperion `--async-io -t 5` pool=128 | 1344.2 | 536 ms | 148 MB | 23.8× |
| Hyperion `--async-io -t 5` pool=200 | 2381.4 | 471 ms | 164 MB | 42.2× |
| Hyperion `--async-io -w 4 -t 5` pool=32 | 1443.0 | 2.69 s | 503 MB | 25.5× (cold-start p99 — see note) |
| Hyperion `--async-io -w 4 -t 5` pool=64 | 1937.5 | 4.84 s | 416 MB | 34.3× (cold-start p99 — see note) |
| Falcon 0.55.3 `--count 1` pool=128 | 1665.7 | 516 ms | 141 MB | 29.5× |
### Mixed CPU+wait workload (50 ms PG + 50-key JSON serialization)

| Setup | r/s | p99 | RSS | vs Puma -t 30 |
|---|---|---|---|---|
| Puma 8.0 `-t 30` pool=30 | 351.7 | 963 ms | 127 MB | 1.0× |
| Hyperion `--async-io -t 5` pool=32 | 371.2 | 919 ms | 151 MB | 1.05× |
| Hyperion `--async-io -t 5` pool=64 | 741.5 | 681 ms | 161 MB | 2.1× |
| Hyperion `--async-io -t 5` pool=128 | 1739.9 | 512 ms | 201 MB | 4.9× |
| Hyperion `--async-io -w 4 -t 5` pool=32 | 1303.0 | 2.99 s | 675 MB | 3.7× (cold-start p99) |
| Falcon 0.55.3 `--count 1` pool=128 | 1642.1 | 531 ms | 213 MB | 4.7× |
Mixed throughput slightly EXCEEDS pure-wait at high pools — pool=128 mixed (1740 r/s) beats wait-only (1344 r/s). The JSON CPU work overlaps the PG-wait windows of other fibers; a longer per-request lifetime lets the scheduler pack more in-flight requests. Counter-intuitive, real, reproducible.
### What the RSS column actually tells us

Linux thread stacks are demand-paged, so Puma's worst-case "100 threads × 8 MB virtual" doesn't surface as 800 MB RSS — only ~30-40 MB of actual paged memory. At single-worker pool sizes ≤ 200, PG connection buffers (~600 KB per conn) dominate RSS, not thread stacks. So the headline difference between Hyperion and Puma is throughput + tail latency, not RSS, at this pool scale.

Where the architectural memory story DOES land:

- Connection count: Puma is capped at `max_threads` concurrent in-flight queries. Hyperion `--async-io` is capped at `pool_size`. To match Hyperion-pool-200, Puma needs `-t 200` — 200 OS threads pin the address space and increase context-switch overhead, even if RSS doesn't blow up.
- Idle keep-alive connections: each Puma idle keep-alive holds an OS thread. Hyperion holds a ~1 KB fiber. At 10k idle clients, this is the dramatic difference — but that's a different bench (see Hyperion's 10k-connection bench), not this one.
### `-w 4` cold-start caveat (and how 0.5.0 fixes it)

The `-w 4` rows in the tables above (2.69-4.84 s p99) come from 0.4.0's default lazy refill: each child worker pays the full pool-fill cost on its first ~pool_size requests after fork. With pool=64 × 4 workers × ~100 ms `PG.connect` over WAN = ~25 s of cold-start work spread across the first batch of requests per worker. r/s is fine over the 20 s window; p99 absorbs the spike.

Fixed in 0.5.0 via `ForkSafe.register(pool, prefill_in_child: true)` — the child re-opens the pool synchronously inside the fork hook. Side-by-side on the same hardware:
| 0.5.0 setup (`-w 4 -t 5` pool=64) | r/s | p99 | timeouts |
|---|---|---|---|
| Lazy refill (`prefill_in_child: false`) | 2089 | 6.02 s | 0 |
| Prefill-in-child (`prefill_in_child: true`) | 2271 | 583 ms | 0 |
See "Multi-worker (-w N): use ForkSafe" above. As of 0.5.0 the rackup at bench/pg_concurrent.ru defaults to prefill_in_child: true, so reproducing the table will get you the prefill numbers.
### Reproduce
```bash
gem install hyperion-rb hyperion-async-pg pg
git clone https://github.com/andrew-woblavobla/hyperion-async-pg && cd hyperion-async-pg
bundle install

# Wait-bound (pure pg_sleep)
DATABASE_URL='postgres://user:pass@host/db' MODE=async PG_POOL_SIZE=200 \
  bundle exec hyperion --async-io -t 5 -w 1 -p 9292 bench/pg_concurrent.ru &
wrk -t4 -c200 -d20s --latency http://127.0.0.1:9292/

# Mixed (pg_sleep + 50-key JSON serialization)
DATABASE_URL='...' MODE=async PG_POOL_SIZE=128 \
  bundle exec hyperion --async-io -t 5 -w 1 -p 9293 bench/pg_mixed.ru &
wrk -t4 -c200 -d20s --latency http://127.0.0.1:9293/
```
Postgres `max_connections`: pool=200 needs at least 200 free connections in your PG instance. The default is 100 — bump it with `ALTER SYSTEM SET max_connections = 500;` plus a Postgres restart, or scale the pool to fit.

Multi-worker (`-w N`): PG sockets opened in the master are shared across forked workers; multiple worker pids reading the same kernel socket buffer interleave bytes and corrupt the wire protocol (this is why a top-level-init bench returns 99.99% 500s under `-w 4`). Initialize the pool inside each child — either via lazy first-request init (what `bench/pg_concurrent.ru` does) or via Hyperion's `on_worker_boot` hook in a config file:

```ruby
# hyperion.rb — passed via `bundle exec hyperion -C hyperion.rb ...`
on_worker_boot do
  require 'hyperion/async_pg'
  Hyperion::AsyncPg.install!

  $pg_pool = Hyperion::AsyncPg::FiberPool.new(size: 64) do
    PG.connect(ENV['DATABASE_URL'])
  end.fill
end
```

`on_worker_boot` pre-warms the pool before the worker accepts its first request (no first-request latency spike). Lazy init is cheaper to set up and acceptable for benchmarks where the first-request cost is amortized over a 20 s+ run.
### Caveats

The win evaporates if any of these is wrong:

- The server doesn't run requests under `Async::Scheduler` (use Hyperion `--async-io`, Falcon, or Hyperion-over-TLS).
- The connection pool isn't fiber-aware (the `connection_pool` gem blocks the OS thread).
- The workload isn't actually wait-bound (CPU-heavy handlers don't benefit; the gain is exactly the PG round-trip you can stack).
## How it works

`PG::Connection#exec_params(...)` (and the other patched methods) becomes:

1. Call the non-blocking `send_query_params(...)` C function — fires the query off, returns immediately.
2. Loop: `consume_input` → check `is_busy` → if busy, `socket_io.wait_readable`. Under `Async::Scheduler`, `wait_readable` yields the fiber. Without one, it blocks the OS thread.
3. Drain results with `get_result`, return the final one (after `result.check` to surface errors).
No threads, no extra IO objects, no copy of the result through Ruby. The C extension does all the work; we only swap the wait primitive.
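As a hedged sketch of that loop — the module name is made up; only the pg calls (`send_query_params`, `consume_input`, `is_busy`, `socket_io`, `get_result`, `PG::Result#check`) are the real libpq wrappers named above:

```ruby
require 'pg'

# Illustrative only — the gem's actual patch covers more methods and edge cases.
module ExecParamsSketch
  def exec_params(sql, params = [], *rest)
    send_query_params(sql, params, *rest)  # 1. non-blocking: fire the query

    loop do                                # 2. park the fiber until results are ready
      consume_input
      break unless is_busy
      socket_io.wait_readable              #    yields the fiber under an Async scheduler
    end

    last = nil                             # 3. drain every result, keep the last
    while (result = get_result)
      result.check                         #    raises on server-side errors
      last = result
    end
    last
  end
end

# PG::Connection.prepend(ExecParamsSketch)  # how a prepend-style patch hooks in
```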
## License
MIT.