Class: Hyperion::Metrics

Inherits:
  Object
Defined in:
lib/hyperion/metrics.rb,
lib/hyperion/metrics/path_templater.rb

Overview

Lock-free per-thread counters. Each worker thread mutates its own Hash on the hot path — no mutex acquire/release on every increment, no contention across the thread pool. `snapshot` aggregates lazily across all threads that have ever incremented (one short mutex section, taken only when the operator asks for stats).

Storage: counters live behind `Thread#thread_variable_*`, the only true thread-local storage in Ruby 1.9+ — `Thread.current`'s `#[]` storage is in fact FIBER-local, so under an `Async::Scheduler` (TLS path, h2 streams, the 1.3.0+ `--async-io` plain HTTP/1.1 path) every handler fiber would get its own private counters Hash that `snapshot` could never find. Verified with hyperion-async-pg 0.4.0's bench round; before the fix the dispatch counters dropped requests entirely under `--async-io`, and an external scrape (Prometheus exporter on a different fiber than the handler) saw the dispatch buckets at zero.

Cross-fiber races on the same OS thread: the `+=` is technically read-modify-write, but Ruby's fiber scheduler only preempts at IO boundaries (Fiber.scheduler-aware system calls), and `Hash#[]=` is pure Ruby — no preemption mid-increment, no torn writes. Two fibers cannot interleave a single `+=` on the same OS thread.
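The fiber-local vs. thread-local distinction the paragraphs above rely on can be demonstrated standalone (this sketch is illustrative, not from the Hyperion source):

```ruby
# Thread#[] is fiber-local storage; Thread#thread_variable_get is true
# thread-local storage. A new fiber on the same thread sees only the
# latter.
t = Thread.current
t[:fiber_local] = "outer"                     # fiber-local slot
t.thread_variable_set(:thread_local, "outer") # true thread-local slot

inside = nil
Fiber.new do
  inside = [Thread.current[:fiber_local],
            Thread.current.thread_variable_get(:thread_local)]
end.resume

p inside  # => [nil, "outer"] — the fiber-local slot vanished; the thread variable survived
```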

Reset semantics: counters monotonically increase. Operators that want rate-of-change should snapshot, sleep, snapshot, diff.
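The snapshot / sleep / snapshot / diff pattern can be sketched self-contained; the two counter Hashes below stand in for two `Hyperion.stats` calls, and the `counter_rate` helper is hypothetical, not part of the API:

```ruby
# Compute per-second rates from two monotonic counter snapshots taken
# `interval_s` seconds apart.
def counter_rate(before, after, interval_s)
  after.each_with_object({}) do |(key, value), out|
    out[key] = (value - before.fetch(key, 0)) / interval_s.to_f
  end
end

before = { requests: 100, responses_200: 90 }
after  = { requests: 250, responses_200: 210 }
p counter_rate(before, after, 10)  # => {:requests=>15.0, :responses_200=>12.0}
```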

Public API:

`Hyperion.stats` -> Hash with all current values across all threads.

Defined Under Namespace

Classes: HistogramAccumulator, PathTemplater

Constant Summary

REQUESTS_DISPATCH_TOTAL =

2.12-E — labeled counter family that observes which worker process a given request landed on. Ticks once per dispatched request from every dispatch shape (Connection#serve, h2 streams, the C accept4 + io_uring loops; see PrometheusExporter for the C-loop fold-in at scrape time).

`worker_id` is conventionally `Process.pid.to_s` — matches the 2.4-C `hyperion_io_uring_workers_active` and `hyperion_per_conn_rejections_total` labeling convention; lets operators correlate distribution rows with `ps`/`/proc` data without a separate worker_id <-> pid mapping table.

Hot-path cost: one `@hg_mutex` acquisition per tick. That's acceptable for the audit metric: contention shows up only on the `tick + render` overlap, never inside the C accept loop (which uses its own atomic counter folded in at scrape time). Worth the simplicity over an extra lock-free per-thread cache.

:hyperion_requests_dispatch_total
WORKER_ID_LABEL_KEYS =
%w[worker_id].freeze
EMPTY_LABELS =

Frozen empty Array used as the default label tuple. Reused across all label-less observations so we don't allocate a fresh `[]` per scrape — keeps hot-path work allocation-free for the un-labeled gauge/histogram families.

[].freeze
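The reuse works because Ruby Arrays hash by content, so one shared frozen tuple can key every label-less series. A standalone sketch (`EMPTY` is illustrative, standing in for `EMPTY_LABELS`):

```ruby
# Array Hash keys compare by content, so the shared frozen tuple and a
# freshly allocated empty Array find the same series entry.
EMPTY = [].freeze
series = Hash.new(0.0)
3.times { series[EMPTY] += 1.0 }

p series[EMPTY]  # => 3.0
p series[[]]     # => 3.0 — content-equal lookup, no shared-object requirement
```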

Class Attribute Summary

Class Method Summary

Instance Method Summary

Constructor Details

#initialize ⇒ Metrics

Returns a new instance of Metrics.



# File 'lib/hyperion/metrics.rb', line 49

def initialize
  # Direct list of every per-thread counters Hash ever allocated through
  # this Metrics instance. We hold the Hash refs ourselves (instead of
  # holding Thread refs and looking the Hash up via thread-local
  # storage) so snapshot survives thread death — counters from a
  # short-lived worker that already exited still aggregate. Tiny per-
  # thread footprint (one Hash + one slot in this Array).
  @thread_counters = []
  @counters_mutex = Mutex.new
  # Per-instance thread-local key so spec runs that build fresh Metrics
  # objects don't share state across examples.
  @thread_key = :"__hyperion_metrics_#{object_id}__"

  # 2.4-C — observability enrichment. Histograms and gauges live as
  # separate keyed structures (vs counters) because the wire format
  # is different (per-bucket cumulative counts + sum/count for
  # histograms; a single instantaneous reading for gauges). Both are
  # mutex-guarded — these are scrape-rate operations (one observe per
  # request, one set per worker boot/shutdown), not per-syscall.
  #
  # Histograms: `{ name => { labels_tuple_array => HistogramAccumulator } }`.
  # Gauges:     `{ name => { labels_tuple_array => Float } }`.
  # `labels_tuple_array` is a frozen Array<String> of label values
  # (stable order, supplied by the observer); it doubles as the Hash
  # key for cheap O(1) lookup.
  @histograms      = {}
  @histograms_meta = {} # name => { buckets:, label_keys: }
  @gauges          = {}
  @gauges_meta     = {} # name => { label_keys: }
  @hg_mutex        = Mutex.new
  # Snapshot block hooks for gauges whose value is read on demand
  # (ThreadPool queue depth, etc.). `{ name => { labels_tuple => Proc } }`.
  @gauge_blocks    = {}
end

Class Attribute Details

.default_path_templater ⇒ Object



# File 'lib/hyperion/metrics.rb', line 40

def default_path_templater
  @default_path_templater ||= PathTemplater.new
end

Class Method Details

.reset_default_path_templater! ⇒ Object



# File 'lib/hyperion/metrics.rb', line 44

def reset_default_path_templater!
  @default_path_templater = nil
end

Instance Method Details

#decrement(key, by = 1) ⇒ Object



# File 'lib/hyperion/metrics.rb', line 108

def decrement(key, by = 1)
  increment(key, -by)
end

#decrement_gauge(name, label_values = EMPTY_LABELS, delta = 1) ⇒ Object



# File 'lib/hyperion/metrics.rb', line 249

def decrement_gauge(name, label_values = EMPTY_LABELS, delta = 1)
  increment_gauge(name, label_values, -delta)
end

#ensure_worker_request_family_registered! ⇒ Object

2.12-E — Idempotently register the labeled-counter family. Public so `Server#run_c_accept_loop` can register at boot — the PrometheusExporter's C-loop fold-in is gated on the family being in the snapshot, and a 100% C-loop worker never goes through `tick_worker_request` to register lazily.



# File 'lib/hyperion/metrics.rb', line 147

def ensure_worker_request_family_registered!
  return if @worker_request_family_registered

  register_labeled_counter(REQUESTS_DISPATCH_TOTAL, label_keys: WORKER_ID_LABEL_KEYS)
  @worker_request_family_registered = true
end

#gauge_meta(name) ⇒ Object



# File 'lib/hyperion/metrics.rb', line 260

def gauge_meta(name)
  @hg_mutex.synchronize { @gauges_meta[name]&.dup }
end

#gauge_snapshot ⇒ Object



# File 'lib/hyperion/metrics.rb', line 278

def gauge_snapshot
  out = {}
  @hg_mutex.synchronize do
    names = (@gauges.keys + @gauge_blocks.keys).uniq
    names.each do |name|
      per_labels = {}
      @gauges[name]&.each { |labels, value| per_labels[labels] = value.to_f }
      @gauge_blocks[name]&.each do |labels, block|
        # Block-evaluated gauges read live state at scrape time. Note
        # that we do NOT release @hg_mutex around the block call — the
        # contract is that the block is short and side-effect-free
        # (e.g., reads ThreadPool#queue_size). That's the only use
        # case we wire today; document if extended.
        per_labels[labels] = block.call.to_f
      rescue StandardError
        # Snapshot must never raise — a misbehaving block degrades
        # to "no reading" rather than a 500 on /-/metrics.
        next
      end
      out[name] = { meta: @gauges_meta[name] || { label_keys: [].freeze }, series: per_labels }
    end
  end
  out
end

#histogram_meta(name) ⇒ Object

Return the registered shape (buckets / label ordering) for a histogram family. The PrometheusExporter calls `histogram_meta` / `gauge_meta` at scrape time to build the HELP/TYPE preamble.



# File 'lib/hyperion/metrics.rb', line 256

def histogram_meta(name)
  @hg_mutex.synchronize { @histograms_meta[name]&.dup }
end

#histogram_snapshot ⇒ Object

Snapshot helpers — read-only views of the current histogram / gauge state. The exporter uses these to render the scrape body.



# File 'lib/hyperion/metrics.rb', line 266

def histogram_snapshot
  out = {}
  @hg_mutex.synchronize do
    @histograms.each do |name, family|
      per_labels = {}
      family.each { |labels, accum| per_labels[labels] = accum.snapshot }
      out[name] = { meta: @histograms_meta[name], series: per_labels }
    end
  end
  out
end

#increment(key, by = 1) ⇒ Object

Hot path: one thread-variable lookup + one hash op. No mutex on the increment fast path; the mutex is taken only on first allocation per OS thread (very rare) and on snapshot.

Storage uses `Thread#thread_variable_*`, the only true thread-local storage in Ruby 1.9+ — `Thread.current`'s `#[]` storage is fiber-local, so under an `Async::Scheduler` (TLS path, h2 streams, the 1.3.0+ `--async-io` plain HTTP/1.1 path) every handler fiber would get its own private counters Hash that `snapshot` could never aggregate. Verified with hyperion-async-pg 0.4.0's bench round; before the fix the dispatch counters dropped requests under `--async-io`.

Cross-fiber races on the same OS thread: the `+=` is read-modify-write, but Ruby's fiber scheduler only preempts at IO boundaries (Fiber.scheduler-aware system calls). `Hash#[]=` is pure Ruby — no preemption mid-increment, no torn writes. Two fibers cannot interleave a single `+=` on the same OS thread.



# File 'lib/hyperion/metrics.rb', line 101

def increment(key, by = 1)
  thread = Thread.current
  counters = thread.thread_variable_get(@thread_key)
  counters = register_thread_counters(thread) if counters.nil?
  counters[key] += by
end

#increment_gauge(name, label_values = EMPTY_LABELS, delta = 1) ⇒ Object

Increment a gauge by `delta` (default 1). Used for kTLS active connections, etc. — paired with `decrement_gauge` on close.



# File 'lib/hyperion/metrics.rb', line 240

def increment_gauge(name, label_values = EMPTY_LABELS, delta = 1)
  @hg_mutex.synchronize do
    @gauges_meta[name] ||= { label_keys: [].freeze }
    family = (@gauges[name] ||= {})
    key    = label_values.frozen? ? label_values : label_values.dup.freeze
    family[key] = (family[key] || 0.0) + delta.to_f
  end
end

#increment_labeled_counter(name, label_values = EMPTY_LABELS, by = 1) ⇒ Object

Labeled counter — separate from the legacy thread-local counter surface (which is unlabeled and per-thread for hot-path contention-free increments). Labeled counters take a mutex per increment, but they’re called from low-rate paths (per-conn rejection ~ kHz worst case, vs M+req/s on the unlabeled side) so the contention cost is invisible.



# File 'lib/hyperion/metrics.rb', line 316

def increment_labeled_counter(name, label_values = EMPTY_LABELS, by = 1)
  @hg_mutex.synchronize do
    @labeled_counters_meta ||= {}
    @labeled_counters_meta[name] ||= { label_keys: [].freeze }
    @labeled_counters ||= {}
    family = (@labeled_counters[name] ||= {})
    key    = label_values.frozen? ? label_values : label_values.dup.freeze
    family[key] = (family[key] || 0) + by
  end
end

#increment_status(code) ⇒ Object



# File 'lib/hyperion/metrics.rb', line 112

def increment_status(code)
  increment(:"responses_#{code}")
end

#labeled_counter_snapshot ⇒ Object



# File 'lib/hyperion/metrics.rb', line 336

def labeled_counter_snapshot
  out = {}
  @hg_mutex.synchronize do
    (@labeled_counters || {}).each do |name, family|
      per_labels = {}
      family.each { |labels, count| per_labels[labels] = count }
      meta = (@labeled_counters_meta || {})[name] || { label_keys: [].freeze }
      out[name] = { meta: meta, series: per_labels }
    end
  end
  out
end

#observe_histogram(name, value, label_values = EMPTY_LABELS) ⇒ Object

Observe `value` on a previously-registered histogram. `label_values` MUST be supplied in the same order as `label_keys` at registration. The hot path: one Hash lookup, one accumulator update under a mutex. Allocation footprint per observe: zero on the cached-key path (same labels seen before); one frozen Array on first observation for a given label-set.



# File 'lib/hyperion/metrics.rb', line 204

def observe_histogram(name, value, label_values = EMPTY_LABELS)
  @hg_mutex.synchronize do
    meta = @histograms_meta[name]
    return unless meta # silently skip unregistered observations

    family = @histograms[name]
    accum  = family[label_values]
    unless accum
      accum = HistogramAccumulator.new(meta[:buckets])
      # Freeze the label tuple so future identical-content tuples
      # hash to the same bucket — but we keep the original ref
      # provided by the caller as the canonical key so subsequent
      # observes with the same Array bypass the freeze step.
      family[label_values.frozen? ? label_values : label_values.dup.freeze] = accum
    end
    accum.observe(value)
  end
end
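The real `HistogramAccumulator` is not shown on this page; a hypothetical minimal accumulator matching the per-bucket cumulative shape described in the overview might look like:

```ruby
# Illustrative sketch only — names and internals are assumptions, not
# the Hyperion implementation. Each observation increments every bucket
# whose upper bound it fits under (Prometheus-style cumulative counts)
# and tracks the running sum/count.
class TinyHistogram
  def initialize(buckets)
    @buckets = buckets                      # sorted upper bounds
    @counts  = Array.new(buckets.size, 0)   # cumulative per-bucket counts
    @sum     = 0.0
    @count   = 0
  end

  def observe(value)
    @buckets.each_with_index { |le, i| @counts[i] += 1 if value <= le }
    @sum   += value
    @count += 1
  end

  def snapshot
    { buckets: @buckets.zip(@counts).to_h, sum: @sum, count: @count }
  end
end

h = TinyHistogram.new([0.1, 0.5, 1.0])
[0.05, 0.3, 0.7].each { |v| h.observe(v) }
p h.snapshot  # => {:buckets=>{0.1=>1, 0.5=>2, 1.0=>3}, :sum=>1.05, :count=>3}
```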

#register_histogram(name, buckets:, label_keys: []) ⇒ Object

Register a histogram family. Idempotent — re-registering with the same buckets/label_keys is a no-op; mismatched re-register raises so a typo surfaces at boot rather than corrupting the scrape output.



# File 'lib/hyperion/metrics.rb', line 181

def register_histogram(name, buckets:, label_keys: [])
  @hg_mutex.synchronize do
    if (existing = @histograms_meta[name])
      unless existing[:buckets] == buckets && existing[:label_keys] == label_keys
        raise ArgumentError,
              "histogram #{name.inspect} re-registered with different shape " \
              "(was buckets=#{existing[:buckets]} labels=#{existing[:label_keys]}; " \
              "now buckets=#{buckets} labels=#{label_keys})"
      end

      return
    end
    @histograms_meta[name] = { buckets: buckets.dup.freeze, label_keys: label_keys.dup.freeze }
    @histograms[name]      = {}
  end
end

#register_labeled_counter(name, label_keys: []) ⇒ Object



# File 'lib/hyperion/metrics.rb', line 327

def register_labeled_counter(name, label_keys: [])
  @hg_mutex.synchronize do
    @labeled_counters_meta ||= {}
    @labeled_counters_meta[name] = { label_keys: label_keys.dup.freeze }
    @labeled_counters ||= {}
    @labeled_counters[name] ||= {}
  end
end

#reset! ⇒ Object

Tests can call `reset!` between examples to avoid cross-spec leakage.



# File 'lib/hyperion/metrics.rb', line 165

def reset!
  @counters_mutex.synchronize do
    @thread_counters.each(&:clear)
  end
  @hg_mutex.synchronize do
    @histograms.each_value(&:clear)
    @gauges.each_value(&:clear)
    @gauge_blocks.each_value(&:clear)
  end
end

#set_gauge(name, value = nil, label_values = EMPTY_LABELS, &block) ⇒ Object

Set a gauge value. `label_values` follows the same convention as `observe_histogram`. Pass a block to register a callback that's evaluated lazily at snapshot time (ThreadPool queue depth, etc.) — the callback's return value is the gauge's current reading.



# File 'lib/hyperion/metrics.rb', line 227

def set_gauge(name, value = nil, label_values = EMPTY_LABELS, &block)
  @hg_mutex.synchronize do
    @gauges_meta[name] ||= { label_keys: [].freeze }
    if block
      (@gauge_blocks[name] ||= {})[label_values.frozen? ? label_values : label_values.dup.freeze] = block
    else
      (@gauges[name] ||= {})[label_values.frozen? ? label_values : label_values.dup.freeze] = value.to_f
    end
  end
end
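The lazy block-gauge pattern can be sketched standalone — store a Proc, call it at read time so the reading always reflects live state. Names here are illustrative, not the Hyperion internals:

```ruby
# A stored block is evaluated on demand, so the gauge tracks the live
# collection rather than a stale copy taken at registration time.
gauge_blocks = {}
queue = []
gauge_blocks[:queue_depth] = -> { queue.size }

queue.push(:job_a, :job_b)
p gauge_blocks[:queue_depth].call.to_f  # => 2.0 — reflects pushes after registration
```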

#snapshot ⇒ Object



# File 'lib/hyperion/metrics.rb', line 154

def snapshot
  result = Hash.new(0)
  counters_snapshot = @counters_mutex.synchronize { @thread_counters.dup }
  counters_snapshot.each do |counters|
    counters.each { |k, v| result[k] += v }
  end
  result.default = nil
  result
end

#tick_worker_request(worker_id) ⇒ Object



# File 'lib/hyperion/metrics.rb', line 136

def tick_worker_request(worker_id)
  label = worker_id.nil? || worker_id.to_s.empty? ? '0' : worker_id.to_s
  ensure_worker_request_family_registered!
  increment_labeled_counter(REQUESTS_DISPATCH_TOTAL, [label])
end