Class: Hyperion::H2Codec::Encoder

Inherits:
Object
Defined in:
lib/hyperion/h2_codec.rb

Overview

Ruby-friendly wrapper around the native encoder. Each instance holds an opaque pointer to the native encoder; `#encode(headers)` takes an array of `[name, value]` pairs and returns the wire bytes. The dynamic table state is per-instance.

fix-B (2.2.x): per-encoder scratch buffers eliminate per-call FFI marshalling allocations. Each `Encoder` owns:

* `@scratch_out`  — output buffer reused across encode calls,
                    grown lazily if a single frame exceeds the
                    starting 16 KiB capacity.
* `@scratch_argv` — packed `(name_off, name_len, val_off, val_len)`
                    u64-quad buffer (each header is 32 bytes).
* `@scratch_blob` — concatenated header bytes
                    (name_1, value_1, name_2, value_2, …).
* `@scratch_*_ptr` — `Fiddle::Pointer`s pre-cached for the three
                     scratch strings; recreated only when the
                     underlying string is reallocated by `<<`
                     crossing the existing capacity.

`#encode` clears the three buffers (length 0, capacity preserved), appends the offset/length quads and the raw bytes, and dispatches one FFI call to `hyperion_h2_codec_encoder_encode_v2`. The only unavoidable allocation per call is the copy that extracts the written bytes into an owned String (`byteslice` on the C-glue fast path, `Fiddle::Pointer#to_str` on the Fiddle fallback). That is the contract `protocol-http2`'s `encode_headers` returns under, so it can't move further.
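The blob and quad layout described above can be sketched in plain Ruby, without the native library. This is a minimal illustration, not Hyperion's code: `pack_headers` is a hypothetical helper name, and it calls `.b` unconditionally for brevity where the real method skips the copy for strings that are already binary.

```ruby
# Hypothetical helper (not part of Hyperion): flattens headers into one
# byte blob plus a packed buffer of
# (name_off, name_len, val_off, val_len) u64 quads.
def pack_headers(headers, blob, argv)
  blob.clear
  argv.clear
  ints = []
  offset = 0
  headers.each do |name, value|
    ns = name.b
    vs = value.b
    # Offsets point into the blob; lengths are byte counts.
    ints << offset << ns.bytesize << (offset + ns.bytesize) << vs.bytesize
    offset += ns.bytesize + vs.bytesize
    blob << ns << vs
  end
  # `pack(buffer:)` appends to the given String, hence the clear above.
  ints.pack('Q*', buffer: argv)
  [blob, argv]
end

blob = String.new(encoding: Encoding::ASCII_8BIT)
argv = String.new(encoding: Encoding::ASCII_8BIT)
pack_headers([[':status', '200'], ['content-type', 'text/html']], blob, argv)

quads = argv.unpack('Q*').each_slice(4).to_a
# => [[0, 7, 7, 3], [10, 12, 22, 9]]

# Each quad recovers its header from the blob.
name_off, name_len, val_off, val_len = quads[1]
blob.byteslice(name_off, name_len) # => "content-type"
blob.byteslice(val_off, val_len)   # => "text/html"
```

Each header costs four 8-byte integers, which matches the 32 bytes per header noted for `@scratch_argv`.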


Constructor Details

#initialize ⇒ Encoder

Returns a new instance of Encoder.



# File 'lib/hyperion/h2_codec.rb', line 87

def initialize
  raise 'H2Codec native library unavailable' unless H2Codec.available?

  @ptr = H2Codec.encoder_new
  ObjectSpace.define_finalizer(self, self.class.finalizer(@ptr))

  @scratch_out  = String.new(capacity: SCRATCH_OUT_DEFAULT,  encoding: Encoding::ASCII_8BIT)
  @scratch_argv = String.new(capacity: SCRATCH_ARGV_DEFAULT, encoding: Encoding::ASCII_8BIT)
  @scratch_blob = String.new(capacity: SCRATCH_BLOB_DEFAULT, encoding: Encoding::ASCII_8BIT)
  # Pre-cache the Fiddle::Pointer so the per-call hot path
  # doesn't pay a Pointer.new allocation. The pointer's address
  # tracks the underlying String's buffer; if `<<` later reallocates
  # the buffer we refresh the pointer and bump the recorded
  # capacity.
  @scratch_out_ptr  = Fiddle::Pointer[@scratch_out]
  @scratch_argv_ptr = Fiddle::Pointer[@scratch_argv]
  @scratch_blob_ptr = Fiddle::Pointer[@scratch_blob]
  @scratch_out_capacity  = SCRATCH_OUT_DEFAULT
  @scratch_argv_capacity = SCRATCH_ARGV_DEFAULT
  @scratch_blob_capacity = SCRATCH_BLOB_DEFAULT
  # Per-encoder Int array reused for `pack('Q*', buffer:)` calls.
  # `clear` keeps the array but length-zeros it; the underlying
  # storage capacity is retained by MRI for steady-state reuse.
  @scratch_argv_ints = []
end

Class Method Details

.finalizer(ptr) ⇒ Object



# File 'lib/hyperion/h2_codec.rb', line 113

def self.finalizer(ptr)
  proc { H2Codec.encoder_free(ptr) if H2Codec.available? && ptr }
end
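
A likely reason the finalizer proc is built in a class method is that it then closes over `ptr` only and never over `self`; a finalizer that captures the object it is attached to keeps that object reachable, so it would never be collected. The sketch below shows the pattern with a stand-in for the native free call. `NativeHandle` and `FREED` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for an object wrapping a native pointer.
class NativeHandle
  FREED = [] # records "freed" pointers in place of a real native free call

  # Built at class level so the proc captures `ptr` only, not the instance.
  def self.finalizer(ptr)
    proc { FREED << ptr if ptr }
  end

  def initialize(ptr)
    @ptr = ptr
    ObjectSpace.define_finalizer(self, self.class.finalizer(@ptr))
  end
end

# The finalizer can be exercised directly; the GC passes the object id
# as an argument, which a proc silently ignores.
NativeHandle.finalizer(0xBEEF).call(0)
NativeHandle::FREED # => [48879]
```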

Instance Method Details

#encode(headers) ⇒ Object



# File 'lib/hyperion/h2_codec.rb', line 117

def encode(headers)
  return ''.b if headers.empty?

  # 2.4-A — fast path: when the C glue loaded successfully,
  # bypass Fiddle entirely. The C ext walks the headers array,
  # builds the argv quad buffer on the C stack, and calls
  # `hyperion_h2_codec_encoder_encode_v2` directly via a cached
  # function pointer. The only Ruby allocation per call is the
  # final `byteslice(0, written)` which copies the encoded bytes
  # into a new owned String — that's the contract callers rely
  # on (`protocol-http2`'s Compressor#encode returns a String,
  # not a slice into shared mutable memory).
  if H2Codec.cglue_available?
    # Pad the scratch String with zero bytes so its length matches
    # capacity — the C ext writes into RSTRING_PTR up to RSTRING_LEN
    # and then truncates back via rb_str_set_len after encoding.
    # The first encode pads the full SCRATCH_OUT_DEFAULT (16 KiB);
    # subsequent calls find the length already at capacity and
    # skip the pad entirely. On the rare oversize-frame case we
    # catch OutputOverflow, grow, and retry — much cheaper than
    # paying a per-call worst-case computation.
    if @scratch_out.bytesize < @scratch_out_capacity
      @scratch_out << ("\x00".b * (@scratch_out_capacity - @scratch_out.bytesize))
    end
    written = nil
    loop do
      written = H2Codec::CGlue.encoder_encode_v3(@ptr.to_i, headers, @scratch_out)
      break
    rescue H2Codec::OutputOverflow
      # Frame exceeded the running scratch capacity — double
      # and retry. The grown scratch persists for subsequent
      # calls so this is a one-time tax per encoder lifetime
      # (per oversized frame size class).
      @scratch_out_capacity *= 2
      @scratch_out = String.new(capacity: @scratch_out_capacity, encoding: Encoding::ASCII_8BIT)
      @scratch_out << ("\x00".b * @scratch_out_capacity)
    end
    # Single allocation: copy the encoded bytes out into an owned
    # String. byteslice on a binary String returns a new
    # ASCII-8BIT String of exactly `written` bytes.
    return @scratch_out.byteslice(0, written)
  end

  # v2 (Fiddle) fallback — kept verbatim from fix-B (2.2.x).
  # 1) Reset scratch buffers (length 0, capacity retained).
  @scratch_blob.clear
  argv_ints = @scratch_argv_ints
  argv_ints.clear

  # 2) Concatenate name+value bytes into one blob, recording
  # (name_off, name_len, value_off, value_len) quads as 4 ints.
  # Append to argv_ints in one go via a `pack('Q*')` at the end —
  # one transient String per call instead of per header.
  offset = 0
  headers.each do |name, value|
    # Avoid `.b` if the source is already binary-encoded — saves
    # one transient String per non-binary header. For frozen
    # binary literals (the common case in protocol-http2), this
    # is a near-zero-cost branch.
    ns = name.encoding == Encoding::ASCII_8BIT ? name : name.b
    vs = value.encoding == Encoding::ASCII_8BIT ? value : value.b
    name_len = ns.bytesize
    val_len = vs.bytesize

    argv_ints << offset << name_len << (offset + name_len) << val_len
    offset += name_len + val_len
    @scratch_blob << ns << vs
  end

  # 2b) Pack all argv ints into the per-encoder scratch via the
  # `pack(buffer:)` keyword. `pack` writes into the existing String
  # instead of allocating a new one (we `clear` it first, since
  # `pack` appends), so this is a zero-alloc path on the steady
  # state. The argv ints array itself reuses the same Array
  # allocation across calls (we `clear`ed it above; capacity is
  # retained by RArray internals).
  @scratch_argv.clear
  argv_ints.pack('Q*', buffer: @scratch_argv)

  argv_bytes = @scratch_argv.bytesize
  blob_bytes = @scratch_blob.bytesize

  # 3) Make sure the output scratch can hold the worst-case
  # encoded size. Reuse the existing buffer when it already fits;
  # only grow when a single frame exceeds the running capacity.
  worst_case = blob_bytes + (headers.length * 8) + 64
  if worst_case > @scratch_out_capacity
    new_cap = @scratch_out_capacity
    new_cap *= 2 while new_cap < worst_case
    @scratch_out = String.new(capacity: new_cap, encoding: Encoding::ASCII_8BIT)
    @scratch_out_capacity = new_cap
  end

  # 4) Refresh Fiddle pointers. `<<` and `clear` may have caused
  # MRI to reallocate the underlying String buffer (different
  # RSTRING_PTR), so the cached pointers can be stale. Refresh
  # them once per encode call — three Pointer wrapper objects vs
  # the v1 path's `2 * headers.length` Pointer wrappers.
  @scratch_blob_ptr = Fiddle::Pointer[@scratch_blob] if blob_bytes.positive?
  @scratch_argv_ptr = Fiddle::Pointer[@scratch_argv] if argv_bytes.positive?
  @scratch_out_ptr  = Fiddle::Pointer[@scratch_out]

  # 5) One FFI call. Returns bytes_written, -1 on overflow, -2 on bad args.
  written = H2Codec.encoder_encode_v2(@ptr,
                                      @scratch_blob_ptr, blob_bytes,
                                      @scratch_argv_ptr, headers.length,
                                      @scratch_out_ptr, @scratch_out_capacity)
  if written == -1
    raise H2Codec::OutputOverflow,
          "H2Codec encoder output buffer overflow (#{worst_case} bytes needed, " \
          "#{@scratch_out_capacity} available)"
  end
  raise "H2Codec encoder failed (rc=#{written})" if written.negative?

  # 6) Read `written` bytes from the C-written scratch into a
  # fresh ASCII-8BIT String. `Fiddle::Pointer#to_str(len)` copies
  # exactly `len` bytes once — this is the ONE unavoidable
  # allocation per encode call (Ruby strings can't alias
  # arbitrary memory, and the caller's contract is to receive an
  # owned String). Cheaper than v1 because we copy exactly
  # `len` bytes here instead of `capacity` bytes during
  # pre-fill + a `byteslice` of the encoded prefix.
  @scratch_out_ptr.to_str(written)
end
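
The grow-and-retry strategy on the fast path can be sketched without the native library. In this minimal illustration, `fake_encode` is a hypothetical stand-in for the C call: it raises when the output buffer is too small and otherwise copies the payload in and returns the byte count. `OutputOverflow` is redefined locally and the state hash is an assumption made for brevity.

```ruby
class OutputOverflow < StandardError; end

# Hypothetical stand-in for the native encoder: writes `payload` into
# `out` and returns the number of bytes written.
def fake_encode(payload, out)
  raise OutputOverflow if payload.bytesize > out.bytesize

  out[0, payload.bytesize] = payload
  payload.bytesize
end

# Doubles the scratch buffer until the payload fits, then returns an
# owned copy of exactly the written bytes.
def encode_with_retry(payload, state)
  loop do
    written = fake_encode(payload, state[:out])
    return state[:out].byteslice(0, written)
  rescue OutputOverflow
    state[:capacity] *= 2
    state[:out] = "\x00".b * state[:capacity]
  end
end

state = { capacity: 16, out: "\x00".b * 16 }
encode_with_retry(('x' * 50).b, state).bytesize # => 50
state[:capacity]                                # => 64
```

The grown buffer stays in `state`, so a later payload of the same size succeeds on the first attempt; that is the "one-time tax" the source comments describe.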