# hyperion-async-pg
Async-aware shim for the `pg` gem. Patches `PG::Connection` so `exec`, `exec_params`, `exec_prepared` and friends cooperate with the Async fiber scheduler — while one fiber is parked on a Postgres socket waiting for query results, other fibers in the same OS thread serve other requests. Companion to the Hyperion HTTP server. Pure Ruby, drop-in, no behavior change outside an Async scheduler.
## Install

```ruby
# Gemfile
gem 'hyperion-async-pg'
```

```ruby
# config/initializers/async_pg.rb (Rails) or wherever your app boots
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!
```
`install!` is idempotent and thread-safe. Call it once at boot, before any DB connections are opened. It returns `true` on the first call, `false` thereafter.
## Compatibility
The pg-level patch applies transparently to anything that calls into `PG::Connection`:
- Raw `pg` (verified): your own `PG::Connection.new(...).exec_params(...)` calls. This is the path the verified bench numbers below cover.
- Sequel (`postgres` adapter): the adapter calls `exec_params` / `exec_prepared` directly — the patch reaches it. Pair with a fiber-aware pool, e.g. `Sequel.connect(..., max_connections: N)` driven from a fiber-safe pool wrapper.
- ROM-sql + rom-pg: same as Sequel underneath.
- ActiveRecord 7.2 / 8.1 (verified — needs the `activerecord: true` install flag, see below).
No driver-side opt-in required for the `exec_*` patch. Patches are prepended onto `PG::Connection`, so every caller in the process picks them up. The catch is the connection pool the driver uses around those calls, which is the actual concurrency knob — see "Connection pool" below.
## ActiveRecord: opt in with `activerecord: true`

Verified Apr 2026 against AR 7.2.3.1 and AR 8.1.3, Ruby 3.3.3 + Hyperion `--async-io`.

AR's `ConnectionPool#lease_connection` keys the per-caller connection lease on `ActiveSupport::IsolatedExecutionState.context`, which by default returns `Thread.current` (`isolation_level = :thread`). Two fibers in the same OS thread therefore see the same lease, get handed the same `PG::Connection`, and the second `send_query_*` fires onto a busy connection — `PG::UnableToSend: PQsendQuery another command is already in progress`. The bench at `bench/active_record.ru` reproduced this at 100 % 5xx, 8.6 r/s under 200 wrk connections.

Fix: flip `ActiveSupport::IsolatedExecutionState.isolation_level` to `:fiber`. Each fiber then gets its own lease entry, AR hands out distinct connections per fiber, and the pool sizes naturally with `pool:` in `database.yml`. The shim ships this as a one-liner:

```ruby
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!(activerecord: true)
# equivalent to:
#   Hyperion::AsyncPg.install!
#   Hyperion::AsyncPg::ActiveRecordAdapter.install!
```

Verified 2026-04-27 on `bench/active_record.ru` (50 ms `pg_sleep` per request, `pool: 32`, Hyperion `-w 1 -t 5 --async-io`, wrk `-t 4 -c 200 -d 20s`): 413.24 r/s, p99 1.08 s, 0 non-2xx — a 48× rps lift over the 8.6 r/s baseline, matching the pool=32 wait-only ceiling.

Sizing reminder: each in-flight fiber holds an AR connection while waiting on the socket. Set `pool:` to peak fiber concurrency per worker, not thread count. Postgres `max_connections` must accommodate `pool * worker_count`.

Outside a fiber scheduler (Sidekiq, plain scripts, rake, Rails console) the switch is a no-op — `Fiber.current` is each thread's root fiber, behaviorally identical to keying on the thread itself. Drop-in safe across all environments.
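For orientation, a minimal sketch of what the `activerecord: true` path boils down to, per the description above — flipping AR's execution-state isolation to fibers. The module layout is illustrative, not the gem's actual source:

```ruby
# Hypothetical sketch only — the real Hyperion::AsyncPg::ActiveRecordAdapter
# may add guards, idempotence checks, and version handling.
require 'active_support/isolated_execution_state'

module Hyperion
  module AsyncPg
    module ActiveRecordAdapter
      def self.install!
        # Key AR's per-caller state (including connection leases) on the
        # current fiber instead of the current thread, so two fibers on the
        # same OS thread no longer share one PG::Connection.
        ActiveSupport::IsolatedExecutionState.isolation_level = :fiber
      end
    end
  end
end
```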
## Server support matrix
This shim only delivers fiber concurrency when the HTTP server runs each request inside an `Async::Scheduler`. Without a scheduler, `IO#wait_readable` blocks the OS thread normally — the patch is silent and harmless, but produces no concurrency win.
| Server | Path | Concurrency win? | Notes |
|---|---|---|---|
| Falcon | any | ✅ yes | Native fiber scheduler per request. Drop-in. |
| Hyperion >= 1.3.0 `--async-io` | plain HTTP/1.1 | ✅ yes | Opt-in flag re-enables the Async accept loop on plain HTTP/1.1 and bypasses the thread pool so handlers run inline on the accept-loop fiber. Recommended. |
| Hyperion `--tls-cert ...` (HTTPS h1) | TLS / h1 | ✅ yes | TLS path always runs `start_async_loop`; every dispatch is a fiber. Works on 1.0.0+ — no `--async-io` flag needed. Verified 2026-04-27: `-t 64 --tls-cert ...` pool=64 = 742.7 r/s, p99 489 ms on the 50 ms `pg_sleep` workload (wrk `-c 100 -d 20s`, all 14 884 reqs 2xx). Throughput is bounded by `-t N` because each h1 dispatch hops through the worker pool — raise `-t` to match peak fiber concurrency on TLS. |
| Hyperion `--tls-cert ...` (HTTPS h2) | h2 streams | ✅ yes | Each h2 stream is a fiber by design. Verified 2026-04-27: same `-t 64 --tls-cert ...` pool=64 = 706.4 r/s, 5000/5000 succeeded, 0 errors under `h2load -c 50 -m 20 -n 5000`. Same `-t N` bound as h1+TLS. The `-t 5` configs flood protocol-http2 flow control at >25× concurrency under stress (stream-dispatch errors); set `-t` ≥ peak h2 concurrency. |
| Hyperion < 1.3.0 (plain HTTP/1.1) | thread pool | ❌ no | 1.2.0's perf bypass (`start_raw_loop`) hands the whole socket to a worker thread with no scheduler — patch is silent. Upgrade to 1.3.0 + `--async-io`. |
| Puma | any | ❌ no | No fiber scheduler. Patch is silent, behaviour identical to plain `pg`. |
| Sidekiq / scripts / rake | any | ❌ no (and that's fine) | No scheduler → no patch effect. Drop-in safe. |
If you're on Hyperion 1.3.0+, set `async_io: true` in your config (or `--async-io` on the CLI) and pair with a fiber-aware connection pool — see the next section.
## Connection pool — use a fiber-aware one
The popular `connection_pool` gem (used by ActiveRecord, Sidekiq, etc.) is not fiber-aware: its internal `Mutex` + `ConditionVariable` don't yield to the Async scheduler. A fiber waiting for a connection blocks the entire OS thread, defeating this shim's purpose. Symptoms: throughput the same as plain pg even though `wait_readable` is firing; under heavy load Falcon may report "Closing scheduler with blocked operations!".
This gem ships `Hyperion::AsyncPg::FiberPool` so you don't have to roll your own:

```ruby
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!

# Per-worker — call from on_worker_boot in multi-worker setups
$pg_pool = Hyperion::AsyncPg::FiberPool.new(size: 64) do
  PG.connect(ENV['DATABASE_URL'])
end.fill

# In your handler
$pg_pool.with do |conn|
  conn.exec_params('SELECT ...', [...])
end
```
Internals: `FiberPool` wraps `Async::Semaphore` (fiber-aware) around a plain Array. `acquire` waits for a slot via the semaphore (cooperating fibers, the OS thread keeps serving others); pop/push on the Array is atomic per thread under the GVL.

It fills connections in parallel (8 threads by default), so a 64-connection pool over a 100 ms-RTT WAN takes ~1 s to fill at boot instead of ~6.4 s. Override with `parallel_fill_threads: N` (set to 1 for a fully serial fill).
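A rough sketch of that shape — a fiber-aware `Async::Semaphore` guarding a plain Array — assuming nothing beyond what the paragraph above states. Class and keyword names here are illustrative, not the gem's source:

```ruby
require 'async/semaphore'
require 'pg'

# Illustrative pool, not the gem's FiberPool: the semaphore limits concurrent
# checkouts and parks waiting fibers; the Array holds idle connections.
class SketchPool
  def initialize(size:, &factory)
    @semaphore = Async::Semaphore.new(size)        # parks fibers, not the OS thread
    @available = Array.new(size) { factory.call }  # eager fill for simplicity
  end

  # Check a connection out for the duration of the block.
  def with
    @semaphore.acquire do
      conn = @available.pop     # Array ops need no extra locking under the GVL
      begin
        yield conn
      ensure
        @available.push(conn)
      end
    end
  end
end

# pool = SketchPool.new(size: 64) { PG.connect(ENV['DATABASE_URL']) }
# pool.with { |conn| conn.exec_params('SELECT 1', []) }
```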
### Alternative: async-pool
If you need lazy-on-demand connection creation, idle eviction (max-age), or graceful close on shutdown — features `FiberPool` doesn't ship — use `async-pool` instead. It's the canonical pool implementation in the Async ecosystem; same fiber-cooperation properties, broader feature set.
```ruby
# Gemfile
gem 'async-pool'
```

```ruby
require 'async'
require 'async/pool/controller'
require 'async/pool/resource'
require 'pg'
require 'hyperion/async_pg'

Hyperion::AsyncPg.install!

class PgConnectionResource < Async::Pool::Resource
  def self.call
    new(PG.connect(ENV['DATABASE_URL']))
  end

  def initialize(conn)
    super()
    @conn = conn
  end

  attr_reader :conn

  def viable?   = !@conn.finished?
  def reusable? = viable?
  def close     = (@conn.close unless @conn.finished?)
end

$pg_pool = Async::Pool::Controller.new(PgConnectionResource, limit: 64)

# In your handler:
$pg_pool.acquire do |resource|
  resource.conn.exec_params('SELECT ...', [...])
end
```
Working rackup at `bench/async_pool_example.ru`. Note that async-pool creates connections lazily — the first burst of N requests pays the cumulative `PG.connect` cost. For warm-pool semantics like `FiberPool`, pre-create resources in `on_worker_boot` (call `pool.acquire { ... }` limit times before serving traffic).
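A hedged sketch of that pre-create step (the helper name is made up; it uses only `Async` tasks and the `acquire` API shown above — concurrent acquires force the controller to open up to N resources before traffic arrives):

```ruby
require 'async'

# Warm an async-pool controller: N concurrent acquires make it create
# up to N resources, each verified with a trivial query.
def prewarm_pool(pool, count)
  Async do
    count.times.map do
      Async { pool.acquire { |resource| resource.conn.exec('SELECT 1') } }
    end.each(&:wait)
  end.wait
end

# e.g. in on_worker_boot:
# prewarm_pool($pg_pool, 64)
```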
### Other options

- A per-fiber connection (no pool) — works, but holds a connection for the fiber's lifetime; size your Postgres `max_connections` accordingly. A sketch follows this list.
- ActiveRecord — install with `Hyperion::AsyncPg.install!(activerecord: true)` to make AR's pool fiber-aware (verified 413 r/s on the AR bench, see "ActiveRecord: opt in" above).
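A minimal sketch of the per-fiber-connection option (the helper name is hypothetical; `Thread.current[]` is Ruby's fiber-local storage, which is what makes this per fiber):

```ruby
require 'pg'

# One connection per fiber, created lazily. Thread.current[] is fiber-local,
# so each request fiber gets — and keeps — its own connection; size Postgres
# max_connections for peak concurrent fibers per process.
def fiber_conn
  Thread.current[:pg_conn] ||= PG.connect(ENV['DATABASE_URL'])
end

# In a handler:
# fiber_conn.exec_params('SELECT now()', [])
```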
## Multi-worker (`-w N`): use ForkSafe (0.4.0+)
Pre-fork servers (Hyperion `-w N`, Puma cluster mode, Falcon multi-worker) face a footgun: PG sockets opened in the master before fork are inherited by every worker via shared file descriptors. Concurrent reads/writes from multiple pids on the same kernel socket buffer corrupt the wire protocol — the symptom is `PG::UnableToSend: another command is already in progress` on every request, ~99.99 % 5xx under load. The classic workaround is "open the pool in `on_worker_boot`," which forces a separate Hyperion / Puma config file.

`Hyperion::AsyncPg::ForkSafe` auto-detects fork via `Process._fork` (Ruby 3.1+) and resets registered pools in each child. Recommended for production: pass `prefill_in_child: true` (0.5.0+) so each child synchronously re-opens its pool inside the fork hook — this eliminates the cold-start p99 spike on the first ~pool_size requests per worker.
```ruby
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!(fork_safe: true)

# Open the pool at boot — safe even in the master with -w N.
$pg_pool = Hyperion::AsyncPg::ForkSafe.register(
  Hyperion::AsyncPg::FiberPool.new(size: 64) do
    PG.connect(ENV['DATABASE_URL'])
  end,
  prefill_in_child: true # 0.5.0+: child returns from fork(2) with a warm pool
)
$pg_pool.fill # pre-fills in the master; each child resets and re-fills inside Process._fork
```
Drop `prefill_in_child: true` to get 0.4.0's lazy-refill behaviour (smaller per-fork cost, but the first few requests on each worker pay pool_size × PG.connect latency of cold-start work).
### Verified bench: prefill eliminates the cold-start p99 spike

Linux openclaw-vm, Postgres 17 over WAN (~50 ms RTT), `bench/pg_concurrent.ru` (50 ms `pg_sleep`), pool=64, Hyperion `--async-io -w 4 -t 5`, `wrk -t4 -c200 -d20s --timeout 8s`. Same 0.5.0 binary, only the kwarg toggled:
| Setup | r/s | p50 | p99 | non-2xx | wrk timeouts |
|---|---|---|---|---|---|
| `ForkSafe.register(pool)` (lazy refill) | 2089 | 51 ms | 6.02 s | 0 | 0 |
| `ForkSafe.register(pool, prefill_in_child: true)` | 2271 | 51 ms | 583 ms | 0 | 0 |
~10× p99 reduction with equivalent throughput. The lazy-refill p99 is dominated by the first ~64 requests per worker piling up behind cold-start `PG.connect`s. With prefill, the cost is paid during `fork(2)`; the worker returns to the accept loop with a warm pool. Trade-off: fork itself takes longer — boot-to-first-request goes from ~9 s to ~12 s on this WAN setup (each child opens 64 connections in parallel before returning from `Process._fork`). Production deploys absorb that once; the per-request distribution is what your operators see all day.
Or compose explicitly:

```ruby
require 'hyperion/async_pg'
require 'hyperion/async_pg/fork_safe'

Hyperion::AsyncPg.install!
Hyperion::AsyncPg::ForkSafe.install!
```

Kitchen-sink one-liner (everything on, AR + ForkSafe):

```ruby
Hyperion::AsyncPg.install!(activerecord: true, fork_safe: true)
```
Both `activerecord:` and `fork_safe:` default to `false` — existing 0.3.0 callers see no behaviour change. `prefill_in_child:` defaults to `false` on `ForkSafe.register`, so 0.4.0 callers see no behaviour change either.

ForkSafe cooperates with other libraries that hook `Process._fork` (e.g. `Async::ForkHandler`) by prepending its handler module onto `Process.singleton_class`. Method dispatch chains through every prepend in order — no captured-method recursion.
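The hook shape, as a hedged sketch — module, registry, and `reset!` names below are made up; only the prepend-onto-`Process.singleton_class` pattern and the `super` chaining are what the paragraph above describes:

```ruby
# Illustrative only — not the gem's ForkSafe source.
module SketchForkSafe
  REGISTRY = [] # pools to reset in each forked child

  def self.register(pool)
    REGISTRY << pool
    pool
  end

  module Hook
    # Process._fork (Ruby 3.1+) runs for every Process.fork / Kernel#fork.
    def _fork
      pid = super                 # chains through every other prepended hook
      if pid.zero?                # child process: inherited PG sockets are unusable
        SketchForkSafe::REGISTRY.each(&:reset!) # hypothetical pool API
      end
      pid
    end
  end
end

Process.singleton_class.prepend(SketchForkSafe::Hook)
```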
### When NOT to use ForkSafe

- Ruby < 3.1: `Process._fork` doesn't exist. `ForkSafe.install!` warns and no-ops; fall back to `on_worker_boot`.
- Single-worker servers: `ForkSafe.install!` is a no-op penalty (one extra method dispatch on `Process._fork`, which only fires on `Process.fork` calls). Harmless, but you don't need it.
- You already have a working `on_worker_boot` config: keep it — ForkSafe is for users who don't want a separate config file.
### When NOT to use `prefill_in_child: true`

- Tight dev-loop iteration on a slow PG link: each fork waits for `pool_size` parallel `PG.connect`s before the worker comes online. That adds 2-3 s per worker on a 100 ms-RTT WAN.
- Your PG instance can't accommodate `pool_size × worker_count` simultaneous opens during a deploy: the parallel-connect storm at fork time can transiently saturate `max_connections`. Stagger workers (Hyperion's `boot_workers`) or shrink `pool_size`.
- The pool's `#fill` has side effects you don't want firing N times at fork (e.g. running migrations on first connect). Only the FiberPool factory you control decides what `#fill` does; pass `prefill_in_child: false` if any side effect is per-pool, not per-worker.
## Caveats

- Only yields under a fiber scheduler. Outside `Async { ... }` (Sidekiq workers, plain scripts, rake tasks, Rails console) the patched methods behave identically to plain `pg` — `IO#wait_readable` falls back to its blocking implementation when `Fiber.scheduler` is `nil`. There is no perf regression in non-async contexts.
- Long-running statements still block the calling fiber. The shim parks a fiber on the socket; it does not preempt the running query. A 10 s `SELECT` still ties up that fiber for 10 s. Cap runaway queries with Postgres `statement_timeout` (or session-level `SET statement_timeout`), not at the Ruby layer — see the sketch after this list.
- Connection pool sizing. Under Hyperion + this shim, fibers vastly outnumber threads — each fiber can hold a checked-out DB connection while it waits on Postgres. A worker with 10 OS threads and 200 concurrent fibers can hold 200 in-flight connections. Size your `pool:` (ActiveRecord) or `:max_connections` (Sequel) and your Postgres `max_connections` accordingly. Rule of thumb: pool >= peak concurrent fibers per worker.
- Single-statement only. The shim drains all results and returns the last one, matching pg's default `exec_params` semantics. Multi-statement strings sent through `exec` produce the last result, as before.
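For the long-running-statement bullet, one way to apply that server-side cap — a sketch reusing the `FiberPool` factory from earlier; the 10 s value is an arbitrary example:

```ruby
require 'hyperion/async_pg'
Hyperion::AsyncPg.install!

# Cap runaway queries on the server, not in Ruby: give every pooled
# connection a session-level statement_timeout as it is created.
$pg_pool = Hyperion::AsyncPg::FiberPool.new(size: 64) do
  PG.connect(ENV['DATABASE_URL']).tap do |conn|
    conn.exec("SET statement_timeout = '10s'")
  end
end.fill
```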
## Tuning

| Env var | Default | Meaning |
|---|---|---|
| `HYPERION_ASYNC_PG_READ_TIMEOUT` | unset (block forever) | Seconds passed to `IO#wait_readable` per poll. Unset matches pg's default — rely on Postgres `statement_timeout` for the upper bound. Set it when you want a hard ceiling on a single socket-wait independent of server-side timeouts; on timeout the shim raises `PG::ConnectionBad`. |
Read at every dispatch; no restart required.
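For example (the timeout value and rackup path are placeholders), a hard 30-second ceiling on any single socket wait:

```bash
# On timeout the shim raises PG::ConnectionBad for that query.
HYPERION_ASYNC_PG_READ_TIMEOUT=30 \
  bundle exec hyperion --async-io -t 5 -p 9292 config.ru
```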
## Troubleshooting

### macOS: `objc[NNN]: +[NSCharacterSet initialize] may have been in progress in another thread when fork() was called`

Falcon (and any pre-fork server) on macOS crashes if you `PG.connect` before the fork — pg's libpq pulls in Foundation/objc, which is not fork-safe under recent macOS. Symptom: child workers SIGABRT immediately on boot, often with the message above.

Workaround:

```bash
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
falcon serve --bind http://127.0.0.1:9292 --count 4 -c config.ru
```

Linux is unaffected. The cleaner fix is to defer pool creation to `on_worker_boot` (post-fork) — see the multi-worker notes — so no PG sockets exist in the parent at fork time.
## Verified bench

Ubuntu 24.04 / 16 vCPU / Ruby 3.3.3, Postgres 17 over a WAN link, `wrk -t4 -c200 -d20s`. All configs single-worker (`-w 1`) unless noted; all returned 0 non-2xx and 0 wrk timeouts. RSS sampled mid-run via `ps -o rss`.

Two workloads:

- wait (`bench/pg_concurrent.ru`): `SELECT pg_sleep(0.05)` + tiny JSON. Pure wait-bound — the "best case" for this gem.
- mix (`bench/pg_mixed.ru`): same query + a 50-key JSON serialization (~5 ms CPU). The honest case — what real Rails apps look like.
### Wait-bound workload

| Setup | r/s | p99 | RSS | vs Puma -t 5 |
|---|---|---|---|---|
| Puma 8.0 `-t 5` pool=5 | 56.5 | 3.88 s | 87 MB | 1.0× |
| Puma 8.0 `-t 30` pool=30 | 402.1 | 880 ms | 99 MB | 7.1× |
| Puma 8.0 `-t 100` pool=100 | 1067.4 | 557 ms | 121 MB | 18.9× |
| Hyperion `--async-io -t 5` pool=5 | 56.7 | 3.82 s | 108 MB | 1.0× |
| Hyperion `--async-io -t 5` pool=32 | 400.4 | 878 ms | 123 MB | 7.1× |
| Hyperion `--async-io -t 5` pool=64 | 778.9 | 638 ms | 133 MB | 13.8× |
| Hyperion `--async-io -t 5` pool=128 | 1344.2 | 536 ms | 148 MB | 23.8× |
| Hyperion `--async-io -t 5` pool=200 | 2381.4 | 471 ms | 164 MB | 42.2× |
| Hyperion `--async-io -w 4 -t 5` pool=32 | 1443.0 | 2.69 s | 503 MB | 25.5× (cold-start p99 — see note) |
| Hyperion `--async-io -w 4 -t 5` pool=64 | 1937.5 | 4.84 s | 416 MB | 34.3× (cold-start p99 — see note) |
| Falcon 0.55.3 `--count 1` pool=128 | 1665.7 | 516 ms | 141 MB | 29.5× |
### Mixed CPU+wait workload (50 ms PG + 50-key JSON serialization)

| Setup | r/s | p99 | RSS | vs Puma -t 30 |
|---|---|---|---|---|
| Puma 8.0 `-t 30` pool=30 | 351.7 | 963 ms | 127 MB | 1.0× |
| Hyperion `--async-io -t 5` pool=32 | 371.2 | 919 ms | 151 MB | 1.05× |
| Hyperion `--async-io -t 5` pool=64 | 741.5 | 681 ms | 161 MB | 2.1× |
| Hyperion `--async-io -t 5` pool=128 | 1739.9 | 512 ms | 201 MB | 4.9× |
| Hyperion `--async-io -w 4 -t 5` pool=32 | 1303.0 | 2.99 s | 675 MB | 3.7× (cold-start p99) |
| Falcon 0.55.3 `--count 1` pool=128 | 1642.1 | 531 ms | 213 MB | 4.7× |
Mixed throughput slightly EXCEEDS pure-wait at high pools — pool=128 mixed (1740 r/s) beats wait-only (1344 r/s). The JSON CPU work overlaps the PG-wait windows of other fibers; a longer per-request lifetime lets the scheduler pack more in-flight requests. Counter-intuitive, real, reproducible.
### What the RSS column actually tells us

Linux thread stacks are demand-paged, so Puma's worst-case "100 threads × 8 MB virtual" doesn't surface as 800 MB RSS — only ~30-40 MB of actual paged memory. At single-worker pool sizes ≤ 200, PG connection buffers (~600 KB per conn) dominate RSS, not thread stacks. So the headline difference between Hyperion and Puma is throughput + tail latency, not RSS, at this pool scale.

Where the architectural memory story DOES land:

- Connection count: Puma is capped at `max_threads` concurrent in-flight queries. Hyperion `--async-io` is capped at `pool_size`. To match Hyperion-pool-200, Puma needs `-t 200` — 200 OS threads pin the address space and increase context-switch overhead, even if RSS doesn't blow up.
- Idle keep-alive connections: each Puma idle keep-alive holds an OS thread. Hyperion holds a ~1 KB fiber. At 10k idle clients, this is the dramatic difference — but that's a different bench (see Hyperion's 10k-connection bench), not this one.
### `-w 4` cold-start caveat (and how 0.5.0 fixes it)

The `-w 4` rows in the tables above (2.69-4.84 s p99) come from 0.4.0's default lazy refill: each child worker pays the full pool-fill cost on its first ~pool_size requests after fork. With pool=64 × 4 workers × ~100 ms `PG.connect` over WAN = ~25 s of cold-start work spread across the first batch of requests per worker. r/s is fine over the 20 s window; p99 absorbs the spike.

Fixed in 0.5.0 via `ForkSafe.register(pool, prefill_in_child: true)` — the child re-opens the pool synchronously inside the fork hook. Side-by-side on the same hardware:
| 0.5.0 setup (`-w 4 -t 5` pool=64) | r/s | p99 | timeouts |
|---|---|---|---|
| Lazy refill (`prefill_in_child: false`) | 2089 | 6.02 s | 0 |
| Prefill-in-child (`prefill_in_child: true`) | 2271 | 583 ms | 0 |
See "Multi-worker (-w N): use ForkSafe" above. As of 0.5.0 the rackup at bench/pg_concurrent.ru defaults to prefill_in_child: true, so reproducing the table will get you the prefill numbers.
### Reproduce
```bash
gem install hyperion-rb hyperion-async-pg pg
git clone https://github.com/andrew-woblavobla/hyperion-async-pg && cd hyperion-async-pg
bundle install

# Wait-bound (pure pg_sleep)
DATABASE_URL='postgres://user:pass@host/db' MODE=async PG_POOL_SIZE=200 \
  bundle exec hyperion --async-io -t 5 -w 1 -p 9292 bench/pg_concurrent.ru &
wrk -t4 -c200 -d20s --latency http://127.0.0.1:9292/

# Mixed (pg_sleep + 50-key JSON serialization)
DATABASE_URL='...' MODE=async PG_POOL_SIZE=128 \
  bundle exec hyperion --async-io -t 5 -w 1 -p 9293 bench/pg_mixed.ru &
wrk -t4 -c200 -d20s --latency http://127.0.0.1:9293/
```
Postgres `max_connections`: pool=200 needs at least 200 free connections in your PG instance. The default is 100 — bump it with `ALTER SYSTEM SET max_connections = 500;` plus a Postgres restart, or scale the pool to fit.

Multi-worker (`-w N`): PG sockets opened in the master are shared across forked workers; multiple worker pids reading the same kernel socket buffer interleave bytes and corrupt the wire protocol (this is why a top-level-init bench returns 99.99% 500s under `-w 4`). Initialize the pool inside each child — either via lazy first-request init (what `bench/pg_concurrent.ru` does) or via Hyperion's `on_worker_boot` hook in a config file:

```ruby
# hyperion.rb — passed via `bundle exec hyperion -C hyperion.rb ...`
on_worker_boot do
  require 'hyperion/async_pg'
  Hyperion::AsyncPg.install!

  $pg_pool = Hyperion::AsyncPg::FiberPool.new(size: 64) do
    PG.connect(ENV['DATABASE_URL'])
  end.fill
end
```

`on_worker_boot` pre-warms the pool before the worker accepts its first request (no first-request latency spike). Lazy init is cheaper to set up and acceptable for benchmarks where the first-request cost is amortized over a 20 s+ run.
### Caveats

The win evaporates if any of these is wrong:

- The server doesn't run requests under `Async::Scheduler` (use Hyperion `--async-io`, Falcon, or Hyperion-over-TLS).
- The connection pool isn't fiber-aware (the `connection_pool` gem blocks the OS thread).
- The workload isn't actually wait-bound (CPU-heavy handlers don't benefit; the gain is exactly the PG round-trip you can stack).
## How it works

`PG::Connection#exec_params(...)` (and the other patched methods) becomes:

1. Call the non-blocking `send_query_params(...)` C function — fires the query off, returns immediately.
2. Loop: `consume_input` → check `is_busy` → if busy, `socket_io.wait_readable`. Under `Async::Scheduler`, `wait_readable` yields the fiber. Without one, it blocks the OS thread.
3. Drain results with `get_result`, return the final one (after `result.check` to surface errors).
No threads, no extra IO objects, no copy of the result through Ruby. The C extension does all the work; we only swap the wait primitive.
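As a hedged sketch of that loop — the module name is made up; only the pg calls (`send_query_params`, `consume_input`, `is_busy`, `socket_io`, `get_result`, `PG::Result#check`) are the real libpq wrappers named above:

```ruby
require 'pg'

# Illustrative only — the gem's actual patch covers more methods and edge cases.
module ExecParamsSketch
  def exec_params(sql, params = [], *rest)
    send_query_params(sql, params, *rest)  # 1. non-blocking: fire the query

    loop do                                # 2. park the fiber until results are ready
      consume_input
      break unless is_busy
      socket_io.wait_readable              #    yields the fiber under an Async scheduler
    end

    last = nil                             # 3. drain every result, keep the last
    while (result = get_result)
      result.check                         #    raises on server-side errors
      last = result
    end
    last
  end
end

# PG::Connection.prepend(ExecParamsSketch)  # how a prepend-style patch hooks in
```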
## License
MIT.