Class: Hyperion::CLI

Inherits: Object

Defined in: lib/hyperion/cli.rb

Constant Summary

DEFAULT_CONFIG_PATH = 'config/hyperion.rb'

ASYNC_IO_PROBE_LIBS =

Probe table for fiber-cooperative I/O libraries. If `async_io: true` is set but none of these are loaded, the operator has likely flipped the flag without reading the bench numbers: `--async-io` adds Async-loop overhead and only pays off when paired with a library whose I/O calls yield to the scheduler. The hello-world bench (BENCH_2026_04_27.md) showed a 47% rps regression and a 3.65 s p99 spike on this shape.

{
  'hyperion-async-pg' => -> { defined?(::Hyperion::AsyncPg) },
  'async-redis' => -> { defined?(::Async::Redis) },
  'async-http' => -> { defined?(::Async::HTTP) }
}.freeze
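
One way a probe table of this shape can be consulted is to call each lambda and keep the names whose probe returns a truthy value. The sketch below is self-contained: `probe_libs`, the two library names, and `advisory_needed` are hypothetical stand-ins, and how Hyperion itself walks ASYNC_IO_PROBE_LIBS is not shown on this page.

```ruby
# Hypothetical probe table in the same shape as ASYNC_IO_PROBE_LIBS:
# library name => lambda that is truthy only when the library is loaded.
probe_libs = {
  'always-loaded' => -> { defined?(::Object) },               # Object always exists
  'never-loaded'  => -> { defined?(::NoSuchDemoLib::Client) } # nil: constant is absent
}.freeze

# Names of the libraries whose probe fired.
loaded = probe_libs.select { |_name, probe| probe.call }.keys

# With async_io enabled and no cooperative library loaded, an advisory
# would be warranted (assumed policy, for illustration only).
async_io = true
advisory_needed = async_io && loaded.empty?

puts "loaded: #{loaded.inspect}, advisory needed: #{advisory_needed}"
```

Because `defined?` returns nil instead of raising when any part of a constant path is missing, each probe is safe to call even when the gem is not installed.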

Class Method Summary

.load_rack_app(path) ⇒ Object

Rack 3’s parse_file returns a single app value; Rack 2 returned [app, options].

.parse_argv!(argv) ⇒ Object

Extracted from .run so the flag-to-cli_opts mapping can be unit-tested without booting a server.

.run(argv) ⇒ Object

.run_cluster(config, app, workers, rackup_path: nil) ⇒ Object

.run_single(config, app) ⇒ Object

.wrap_admin_middleware(app, config) ⇒ Object

When admin_token is configured, wrap the app in AdminMiddleware so POST /-/quit and GET /-/metrics become token-protected admin endpoints.

Class Method Details

.load_rack_app(path) ⇒ `Object`

Rack 3’s parse_file returns a single app value; Rack 2 returned [app, options]. Normalize so we get just the app either way. Used by both the preload path (master parses once, before fork) and the non-preload path (each worker parses post-fork) — see Worker#run.

# File 'lib/hyperion/cli.rb', line 387

def self.load_rack_app(path)
  result = ::Rack::Builder.parse_file(path)
  result.is_a?(Array) ? result.first : result
end
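
The normalization can be exercised without Rack by feeding it the two return shapes directly. This is a minimal sketch: `normalize_parse_result` is a hypothetical helper that mirrors the one-line check in the listing, and the lambda stands in for a real Rack app.

```ruby
# Same normalization as load_rack_app, applied to a value instead of a file.
def normalize_parse_result(result)
  result.is_a?(Array) ? result.first : result
end

app = ->(_env) { [200, { 'content-type' => 'text/plain' }, ['ok']] }

rack2_style = [app, {}] # Rack 2 shape: [app, options]
rack3_style = app       # Rack 3 shape: just the app

same = normalize_parse_result(rack2_style).equal?(normalize_parse_result(rack3_style))
puts "both shapes yield the same app: #{same}"
```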

.parse_argv!(argv) ⇒ `Object`

Extracted from .run so the flag-to-cli_opts mapping can be unit-tested without booting a server. Returns [cli_opts, config_path]. Mutates argv in place (consumes flags, leaves the rackup path for the caller).

# File 'lib/hyperion/cli.rb', line 138

def self.parse_argv!(argv)
  cli_opts    = {}
  config_path = nil

  parser = OptionParser.new do |o|
    o.banner = 'Usage: hyperion [options] config.ru'
    o.on('-C', '--config PATH', "Hyperion config file (default ./#{DEFAULT_CONFIG_PATH} if it exists)") do |p|
      config_path = p
    end
    o.on('-b', '--bind HOST', 'host (default 127.0.0.1)') { |h| cli_opts[:host] = h }
    o.on('-p', '--port PORT', Integer, 'port (default 9292)') { |p| cli_opts[:port] = p }
    o.on('-w', '--workers N', Integer, 'worker processes (0 = nprocessors)') { |w| cli_opts[:workers] = w }
    o.on('-t', '--threads N', Integer, 'Rack handler thread pool size (0 disables)') do |t|
      cli_opts[:thread_count] = t
    end
    o.on('--tls-cert PATH', 'TLS certificate (PEM; chained intermediates supported)') do |p|
      # Parse every BEGIN/END block in the file — production certs ship
      # as leaf+intermediate(s) bundled together. `OpenSSL::X509::Certificate.new`
      # only reads the first block, so loading via that single call would
      # silently drop the chain. See Hyperion::TLS.parse_pem_chain.
      certs = Hyperion::TLS.parse_pem_chain(File.read(p))
      abort("[hyperion] no certificates found in #{p}") if certs.empty?

      cli_opts[:tls_cert]  = certs.first
      cli_opts[:tls_chain] = certs[1..]
    end
    o.on('--tls-key PATH', 'TLS private key (PEM)') do |p|
      cli_opts[:tls_key] = OpenSSL::PKey.read(File.read(p))
    end
    o.on('--log-level LEVEL', %w[debug info warn error fatal], 'log level (default info)') do |l|
      cli_opts[:log_level] = l.to_sym
    end
    o.on('--log-format FORMAT', %w[text json auto],
         'log format: text | json | auto (default auto: json on RAILS_ENV/RACK_ENV=production, colored text on TTY, json otherwise)') do |f|
      cli_opts[:log_format] = f.to_sym
    end
    o.on('--[no-]log-requests',
         'Per-request access log line (default ON; pass --no-log-requests to disable).') do |v|
      cli_opts[:log_requests] = v
    end
    o.on('--fiber-local-shim', 'Patch Thread.current[] to be fiber-local (Rails-compat for older gems)') do
      cli_opts[:fiber_local_shim] = true
    end
    o.on('--[no-]yjit',
         'Enable Ruby YJIT (default: auto on RAILS_ENV/RACK_ENV=production/staging)') do |v|
      cli_opts[:yjit] = v
    end
    o.on('--[no-]async-io',
         'Run plain HTTP/1.1 connections under Async::Scheduler (required for hyperion-async-pg and other fiber-cooperative I/O; default off)') do |v|
      cli_opts[:async_io] = v
    end
    o.on('--max-body-bytes BYTES', Integer,
         'Maximum request body size in bytes (default 16777216 = 16 MiB)') do |n|
      cli_opts[:max_body_bytes] = n
    end
    o.on('--max-header-bytes BYTES', Integer,
         'Maximum total request-header size in bytes (default 65536 = 64 KiB)') do |n|
      cli_opts[:max_header_bytes] = n
    end
    o.on('--max-pending COUNT', Integer,
         'Maximum queued connections per worker before new accepts are rejected with 503 (default unbounded)') do |n|
      cli_opts[:max_pending] = n
    end
    o.on('--max-request-read-seconds SECONDS', Float,
         'Total wallclock budget for reading request line + headers + body (default 60.0; 0 disables)') do |n|
      cli_opts[:max_request_read_seconds] = n
    end
    # Security-sensitive: read the token verbatim and never echo it back
    # in any subsequent log/help line. argv is visible via `ps` on most
    # systems; production deployments should prefer --admin-token-file.
    o.on('--admin-token TOKEN',
         "Bearer token for the /-/quit and /-/metrics admin endpoints. \
WARNING: argv is visible via `ps`; prefer --admin-token-file PATH for production.") do |t|
      cli_opts[:admin_token] = t
    end
    o.on('--admin-token-file PATH',
         'Read the admin token from a file. File must NOT be world-readable (perms must mask 0o007).') do |p|
      cli_opts[:admin_token] = read_admin_token_file(p)
    end
    o.on('--worker-max-rss-mb MB', Integer,
         'Recycle a worker when its RSS exceeds MB megabytes (default unset; nil disables)') do |n|
      cli_opts[:worker_max_rss_mb] = n
    end
    o.on('--idle-keepalive SECONDS', Float,
         'Idle keep-alive timeout in seconds (default 5.0)') do |n|
      cli_opts[:idle_keepalive] = n
    end
    o.on('--graceful-timeout SECONDS', Integer,
         'Graceful shutdown deadline in seconds before SIGKILL (default 30)') do |n|
      cli_opts[:graceful_timeout] = n
    end
    # 2.2.x fix-D: expose the existing `h2.max_total_streams` admission
    # cap (1.7.0+ DSL knob) at the CLI surface. The 2.0.0 default flip
    # to `max_concurrent_streams × workers × 4` (= 512 streams per
    # process at -w 1) is sized for normal browser traffic but cuts
    # off h2load benches and gRPC/long-fan-out workloads mid-test —
    # this flag lets operators raise or disable the cap without
    # writing a config file. `unbounded` (or `:unbounded`) writes
    # `nil` to Config, which restores the pre-2.0 unbounded behaviour.
    o.on('--h2-max-total-streams VALUE',
         'HTTP/2 per-connection total stream cap. Use `unbounded` to disable. ' \
         'Default: max_concurrent_streams × workers × 4 (2.0.0 flip).') do |v|
      cli_opts[:h2_max_total_streams] = parse_h2_max_total_streams!(v)
    end
    # 2.3-B: per-connection fairness cap. Defends against a greedy
    # upstream connection (nginx pipelining many client requests
    # through one keep-alive conn) hogging the worker thread pool.
    # Recommended setting: thread_count / 4 (e.g., `4` for `-t 16`).
    # `auto` resolves at finalize! to thread_count/4 (floor 1).
    # Default unset (no cap) — opt-in operator hardening.
    o.on('--max-in-flight-per-conn VALUE',
         'Per-connection in-flight request cap. Integer >= 1, or `auto` ' \
         '(thread_count/4, floor 1). Default: unset (no cap).') do |v|
      cli_opts[:max_in_flight_per_conn] = parse_max_in_flight_per_conn!(v)
    end
    # 2.3-B: TLS handshake CPU throttle. Token-bucket budget for
    # SSL_accept calls per second per worker. Defends direct-exposure
    # operators against handshake storms; for nginx-fronted topologies
    # this is mostly defensive (nginx keeps long-lived upstream conns).
    # `unlimited` (default) preserves 2.2.0 behaviour.
    o.on('--tls-handshake-rate-limit VALUE',
         'TLS handshake CPU throttle: handshakes/sec/worker. Integer >= 1 ' \
         'or `unlimited` (default).') do |v|
      cli_opts[:tls_handshake_rate_limit] = parse_tls_handshake_rate_limit!(v)
    end
    # 2.10-E: repeatable preload-at-boot flag. Each occurrence appends
    # to the cli_opts Array; merge_cli! turns each into a
    # `{path:, immutable: true}` entry on `Config#preload_static_dirs`.
    # `--no-preload-static` is the sibling sentinel that disables the
    # Rails-aware auto-detect path; the operator's explicit dirs (if
    # any) still take effect.
    o.on('--preload-static DIR',
         'Preload static assets from DIR at boot (repeatable). Marks every ' \
         'cached entry immutable so subsequent serves never re-stat.') do |dir|
      (cli_opts[:preload_static] ||= []) << dir
    end
    o.on('--no-preload-static',
         'Disable the Rails-aware static-asset auto-detect at boot. ' \
         'Explicit `--preload-static` dirs still take effect.') do
      cli_opts[:auto_preload_static_disabled] = true
    end
    # 2.16 — app preload toggle.
    o.on('--[no-]preload',
         'Preload the Rack app in the master before fork (default ON). ' \
         '--no-preload makes each worker parse config.ru post-fork; ' \
         'needed on macOS when native gems loaded in the master ' \
         '(anything that touches Network.framework via XPC) ' \
         'deadlock getaddrinfo in workers post-fork.') do |v|
      cli_opts[:preload] = v
    end
    o.on('-h', '--help', 'show help') do
      puts o
      exit 0
    end
  end
  parser.parse!(argv)

  [cli_opts, config_path]
end
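
The in-place mutation is a property of `OptionParser#parse!`, which removes recognised flags from the array it is given and leaves positional arguments behind. The sketch below shows that pattern in miniature; `parse_demo_argv!` and its three flags are a hypothetical reduction, not Hyperion's full option set.

```ruby
require 'optparse'

# Hypothetical miniature of the parse_argv! pattern: three flags only.
def parse_demo_argv!(argv)
  opts = {}
  parser = OptionParser.new do |o|
    o.on('-p', '--port PORT', Integer) { |p| opts[:port] = p }
    o.on('--[no-]log-requests') { |v| opts[:log_requests] = v }
    # Repeatable flag: each occurrence appends to an Array.
    o.on('--preload-static DIR') { |dir| (opts[:preload_static] ||= []) << dir }
  end
  parser.parse!(argv) # destructive: removes recognised flags from argv
  opts
end

argv = %w[-p 9393 --no-log-requests --preload-static public --preload-static assets config.ru]
opts = parse_demo_argv!(argv)

puts opts.inspect # parsed flags
puts argv.inspect # whatever the parser did not consume
```

Returning a plain Hash is what makes the flag mapping testable in isolation: a unit test can pass an Array literal and assert on the result without binding a socket.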

.run(argv) ⇒ `Object`

# File 'lib/hyperion/cli.rb', line 13

def self.run(argv)
  cli_opts, config_path = parse_argv!(argv)

  # Precedence: CLI > config file > built-in default. We auto-load
  # config/hyperion.rb if present so operators can drop a file in their
  # repo and have it take effect without having to remember -C.
  config_path ||= DEFAULT_CONFIG_PATH if File.exist?(DEFAULT_CONFIG_PATH)
  config = config_path ? Hyperion::Config.load(config_path) : Hyperion::Config.new
  config.merge_cli!(cli_opts)

  # 2.2.x fix-C: env-var override for the kTLS knob so operators can
  # A/B kernel-TLS vs userspace SSL_write without rewriting their
  # config file. Useful for the large-payload TLS bench harness
  # (`bench/tls_static_1m.ru`, `bench/tls_json_50k.ru`).
  apply_ktls_env_override!(config)

  # 2.2.x fix-D: env-var override for the `h2.max_total_streams`
  # admission cap. Mirrors `HYPERION_TLS_KTLS` from fix-C — operators
  # running h2load or long-fan-out workloads can lift the 2.0.0
  # default (`max_concurrent_streams × workers × 4`) without
  # rewriting a config file. `HYPERION_H2_MAX_TOTAL_STREAMS=unbounded`
  # restores pre-2.0 behaviour. Applied AFTER `merge_cli!` so it
  # takes precedence over the CLI flag too — the env var is the
  # outermost knob (CI/bench harness), the flag is the inner knob
  # (per-invocation), and the config file is innermost.
  apply_h2_max_total_streams_env_override!(config)

  # 2.3-A: env-var override for the io_uring accept policy. Same
  # grammar as `HYPERION_TLS_KTLS` (off/on/auto). Operators flip
  # on for an A/B run without rewriting their config file.
  # 2.3.0 default is :off because io_uring under fork+threads has
  # known sharp edges (SQ inheritance, SQPOLL non-survival across
  # fork). The env var is the sanctioned way to opt in.
  apply_io_uring_env_override!(config)

  # 2.3-B: env-var overrides for the per-conn fairness cap and the
  # TLS handshake CPU throttle. Same precedence rule as the other
  # 2.x env-var bridges — outermost knob (env > CLI > config file).
  apply_max_in_flight_per_conn_env_override!(config)
  apply_tls_handshake_rate_limit_env_override!(config)

  # Install logger early so every subsequent log call honours the operator's
  # chosen format/level (config file or CLI) before anything else logs.
  # 1.8.0: write directly to the default Runtime — `Hyperion.logger=` now
  # emits a deprecation warn aimed at out-of-tree callers, and CLI bootstrap
  # is the canonical in-tree caller, so we sidestep the warn here.
  if config.logging.level || config.logging.format
    Hyperion::Runtime.default.logger =
      Hyperion::Logger.new(level: config.logging.level, format: config.logging.format)
  end

  # Advisory: operators frequently flip --async-io expecting "fast mode"
  # without installing a fiber-cooperative I/O library. On hello-world this
  # costs ~5% rps; on no-I/O workloads more. The flag only pays off when
  # paired with `hyperion-async-pg` / `async-redis` / `async-http`. We log
  # once at boot pointing at the operator-guidance docs; the operator's
  # setting is still honoured.
  warn_orphan_async_io(config)
  # 1.7.0 (RFC A9): hard validation of `async_io: true` (and a soft
  # warn for `false` with a fiber lib loaded). The nil-default keeps
  # the 1.6.1 advisory shape — see Hyperion.validate_async_io_loaded_libs!.
  Hyperion.validate_async_io_loaded_libs!(config.async_io)

  # Propagate log_requests so every Connection picks it up via
  # `Hyperion.log_requests?` without needing to thread it through
  # Server/ThreadPool/Master plumbing. Default is ON; nil means "don't
  # touch — fall through to the env/default chain in Hyperion.log_requests?".
  Hyperion.log_requests = config.logging.requests unless config.logging.requests.nil?

  # Enable YJIT before workers fork / connections start. Auto-on in
  # production/staging gives operators the perf bump for free; explicit
  # config.yjit (true/false) overrides the env-based default.
  maybe_enable_yjit(config)

  rackup = argv.first || 'config.ru'
  abort("[hyperion] no such rackup file: #{rackup}") unless File.exist?(rackup)

  if config.fiber_local_shim
    # Gate on async_io: with no fibers in play the shim has no purpose
    # and patching `thread_variable_*` would re-stage the 1.4.x bug
    # (stranded Logger/Metrics counters across thread-pool jobs running
    # in distinct fibers). FiberLocal.install! itself enforces this and
    # warns when ignored — we mirror the gate here for the success log.
    Hyperion::FiberLocal.install!(async_io: config.async_io == true)
    Hyperion.logger.info { { message: 'FiberLocal shim installed' } } if Hyperion::FiberLocal.installed?
  end

  workers = config.workers.zero? ? Etc.nprocessors : config.workers

  # 2.0 default flip (RFC A7): resolve the `h2.max_total_streams`
  # auto-sentinel now that worker count is known. After finalize!
  # the field always carries either a positive integer (cap) or nil
  # (operator-requested unbounded).
  config.finalize!(workers: workers)

  # 2.16 — preload toggle. In preload mode (default) the master
  # parses config.ru once and workers inherit the loaded app via
  # copy-on-write. In non-preload mode the master never touches
  # the app; each worker parses post-fork. The non-preload path
  # is the documented escape hatch for macOS getaddrinfo+fork
  # deadlocks; it costs CoW (each worker pays the full boot RSS).
  preload = config.preload != false
  if preload
    app = wrap_admin_middleware(load_rack_app(rackup), config)
  else
    app = nil
    Hyperion.logger.info do
      { message: 'preload disabled; each worker will parse rackup after fork',
        rackup: File.expand_path(rackup) }
    end
  end

  if workers <= 1
    # Single-mode always preloads — there's no fork to protect from
    # global state poisoning, so deferring the parse buys nothing.
    app ||= wrap_admin_middleware(load_rack_app(rackup), config)
    run_single(config, app)
  else
    run_cluster(config, app, workers, rackup_path: preload ? nil : File.expand_path(rackup))
  end
end
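
The comments in the listing describe a layered precedence for the knobs that have an env-var bridge: env var, then CLI flag, then config file, then the built-in default. The listing achieves this by applying overrides to the Config object in sequence. The sketch below expresses the same ordering as a lookup over plain Hashes instead; `resolve_setting`, the layer Hashes, and all values are hypothetical.

```ruby
# Return the value from the first layer that defines the key;
# layers are ordered from highest to lowest precedence.
def resolve_setting(key, layers, default)
  layer = layers.find { |l| l.key?(key) }
  layer ? layer[key] : default
end

env_layer  = { tls_handshake_rate_limit: 200 }             # hypothetical values
cli_layer  = { tls_handshake_rate_limit: 100, port: 9393 }
file_layer = { port: 8080, log_requests: false }
layers     = [env_layer, cli_layer, file_layer]

rate = resolve_setting(:tls_handshake_rate_limit, layers, :unlimited) # env wins
port = resolve_setting(:port, layers, 9292)                           # CLI beats file
logs = resolve_setting(:log_requests, layers, true)                   # file beats default
host = resolve_setting(:host, layers, '127.0.0.1')                    # default applies

puts [rate, port, logs, host].inspect
```

Checking `key?` rather than truthiness matters here: an explicit `false` in a lower layer must still beat the default.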

.run_cluster(config, app, workers, rackup_path: nil) ⇒ `Object`

# File 'lib/hyperion/cli.rb', line 375

def self.run_cluster(config, app, workers, rackup_path: nil)
  tls = build_tls_from_config(config)
  Master.new(host: config.host, port: config.port, app: app,
             workers: workers, tls: tls, thread_count: config.thread_count,
             read_timeout: config.read_timeout, config: config,
             rackup_path: rackup_path).run
end

.run_single(config, app) ⇒ `Object`

# File 'lib/hyperion/cli.rb', line 298

def self.run_single(config, app)
  # Single-mode: there's no fork, but AdminMiddleware still resolves the
  # signal target via Hyperion.master_pid. Set it to ourselves so
  # POST /-/quit signals the lone process — same contract as cluster
  # mode (SIGTERM the master). See Hyperion.master_pid for why we don't
  # rely on Process.pid alone (the AdminMiddleware reader's fallback
  # would do that anyway, but making it explicit + writing
  # HYPERION_MASTER_PID into ENV keeps single/cluster behaviour
  # symmetric for any external tooling that introspects the var).
  Hyperion.master_pid!(Process.pid)
  tls = build_tls_from_config(config)
  server = Server.new(host: config.host, port: config.port, app: app,
                      tls: tls, thread_count: config.thread_count,
                      read_timeout: config.read_timeout,
                      max_pending: config.max_pending,
                      max_request_read_seconds: config.max_request_read_seconds,
                      h2_settings: Master.build_h2_settings(config),
                      async_io: config.async_io,
                      accept_fibers_per_worker: config.accept_fibers_per_worker,
                      h2_max_total_streams: config.h2.max_total_streams,
                      admin_listener_port: config.admin.listener_port,
                      admin_listener_host: config.admin.listener_host,
                      admin_token: config.admin.token,
                      tls_session_cache_size: config.tls.session_cache_size,
                      tls_ktls: config.tls.ktls,
                      io_uring: config.io_uring,
                      max_in_flight_per_conn: config.max_in_flight_per_conn,
                      tls_handshake_rate_limit: config.tls.handshake_rate_limit,
                      preload_static_dirs: config.resolved_preload_static_dirs)
  warn_c_parser_unavailable

  # Pre-allocate Rack env-pool entries and eager-touch lazy constants.
  # In single-mode there's no fork, but the warmup still pays for itself
  # by frontloading the first-N-request allocation cost off the first
  # real client. Idempotent — safe to call once per process.
  Hyperion.warmup!

  # Single-worker mode reuses the lifecycle hooks: before_fork is a no-op
  # here (no fork happens), and on_worker_boot/on_worker_shutdown fire
  # for the lone in-process "worker" so app code that opens DB pools etc.
  # gets the same lifecycle whether you run 1 or N workers.
  #
  # `on_worker_boot` fires BEFORE the listener is bound — same contract
  # as the cluster path (Worker#run): the operator's boot hook runs
  # against a process with no inbound socket yet, so DB/Redis warmup
  # finishes before the kernel can queue any connections.
  config.on_worker_boot.each { |h| h.call(0) }

  server.listen
  scheme = tls ? 'https' : 'http'
  Hyperion.logger.info { { message: 'listening', url: "#{scheme}://#{server.host}:#{server.port}" } }

  shutdown_r, shutdown_w = IO.pipe
  %w[INT TERM].each do |sig|
    Signal.trap(sig) do
      shutdown_w.write_nonblock('!')
    rescue StandardError
      nil
    end
  end

  shutdown_thread = Thread.new do
    shutdown_r.read(1)
    server.stop
  end
  shutdown_thread.report_on_exception = false

  server.start
  shutdown_thread.join
  config.on_worker_shutdown.each { |h| h.call(0) }
  # Drain per-thread access buffers + sync stdio. Single-worker mode
  # doesn't go through Master#shutdown_children, so without this call
  # buffered access lines + final shutdown messages can be lost on
  # SIGTERM. See Hyperion::Logger#flush_all.
  Hyperion.logger.flush_all
end
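
The shutdown path in the listing follows the self-pipe pattern: the trap handler only writes one byte, and an ordinary thread blocked on the read end does the real work. A minimal sketch of that pattern follows, assuming a POSIX platform; it uses SIGUSR1 and a boolean flag as stand-ins for INT/TERM and `server.stop`.

```ruby
# Self-pipe sketch: the trap handler only writes one byte; a normal thread
# does the actual shutdown work. Assumes a POSIX platform with SIGUSR1.
reader, writer = IO.pipe
stopped = false

Signal.trap('USR1') do
  writer.write_nonblock('!')
rescue StandardError
  nil # never raise out of a signal handler
end

waiter = Thread.new do
  reader.read(1)  # blocks until the handler writes
  stopped = true  # stand-in for server.stop
end

Process.kill('USR1', Process.pid) # simulate the operator's signal
waiter.join(5)                    # wait up to 5 s for the shutdown thread

puts "stopped: #{stopped}"
```

Keeping the handler this small is deliberate: Ruby raises ThreadError when a Mutex is locked from trap context, so anything that may take a lock is better run on a regular thread.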

.wrap_admin_middleware(app, config) ⇒ `Object`

When admin_token is configured, wrap the app in AdminMiddleware so POST /-/quit and GET /-/metrics become token-protected admin endpoints. Skipped when the token is unset — those paths fall through to the app, so apps may still own /-/anything if Hyperion’s admin is off.

# File 'lib/hyperion/cli.rb', line 633

def self.wrap_admin_middleware(app, config)
  return app if config.admin.token.nil? || config.admin.token.to_s.empty?

  Hyperion.logger.info do
    { message: 'admin endpoint enabled',
      paths: [AdminMiddleware::PATH_QUIT, AdminMiddleware::PATH_METRICS] }
  end
  AdminMiddleware.new(app, token: config.admin.token)
end
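
The conditional wrap can be illustrated with a plain lambda in place of AdminMiddleware. This is a sketch under stated assumptions: `wrap_with_token_gate` is hypothetical, the `Authorization: Bearer` header check is assumed from the flag's help text, and the real middleware's method checks and responses are not shown on this page.

```ruby
# Hypothetical stand-in for AdminMiddleware: guards two paths with a bearer token.
def wrap_with_token_gate(app, token)
  return app if token.nil? || token.to_s.empty? # admin off: app owns every path

  guarded = ['/-/quit', '/-/metrics']
  lambda do |env|
    if guarded.include?(env['PATH_INFO'])
      supplied = env['HTTP_AUTHORIZATION'].to_s.delete_prefix('Bearer ')
      # Plain == keeps the sketch short; real code should compare in constant time.
      if supplied == token
        [200, { 'content-type' => 'text/plain' }, ['admin ok']]
      else
        [401, { 'content-type' => 'text/plain' }, ['unauthorized']]
      end
    else
      app.call(env) # everything else falls through to the app
    end
  end
end

app = ->(_env) { [200, { 'content-type' => 'text/plain' }, ['app']] }

unwrapped = wrap_with_token_gate(app, nil)
wrapped   = wrap_with_token_gate(app, 's3cret')

denied  = wrapped.call({ 'PATH_INFO' => '/-/metrics' })
allowed = wrapped.call({ 'PATH_INFO' => '/-/metrics', 'HTTP_AUTHORIZATION' => 'Bearer s3cret' })
passed  = wrapped.call({ 'PATH_INFO' => '/hello' })

puts [denied[0], allowed[0], passed[2]].inspect
```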
