rzstd
Ractor-safe Zstandard bindings for Ruby with persistent contexts.
rzstd provides Zstd frame compress/decompress at module level and a
stateful Dictionary class for dict-bound compression. Internally it
holds onto ZSTD_CCtx / ZSTD_DCtx state across calls instead of
allocating fresh ~256 KB contexts every time, which is what makes it
viable for small-message workloads where the upstream zstd-ruby gem
loses to LZ4 purely on context-allocation overhead.
API mirrors rlz4 0.2.x:
require "rzstd"
# Module-level frame compression
ct = RZstd.compress("the quick brown fox", level: 3) # level: kwarg, default 3
RZstd.decompress(ct) # => "the quick brown fox"
# Negative levels enable Zstd's fast strategy (trades ratio for speed).
# Supported range: -131072..22. Typical useful range: -7..19.
RZstd.compress(payload, level: -3) # fast strategy, low ratio
RZstd.compress(payload, level: 19) # high ratio, slow
# Dict-bound compression
dict = RZstd::Dictionary.new(File.binread("schema.dict"), level: -3)
dict.id # => u32: first 4 bytes of sha256(dict), little-endian
dict.size # => byte length
ct = dict.compress("payload that shares the schema")
dict.decompress(ct) # => "payload that shares the schema"
# Dictionary training from sample payloads (wraps ZDICT_trainFromBuffer).
# Gather representative messages, then train a dictionary once and reuse
# it on both peers. Small-message workloads benefit the most.
samples = 1000.times.map { }
dict_bytes = RZstd::Dictionary.train(samples, capacity: 64 * 1024)
dict = RZstd::Dictionary.new(dict_bytes)
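As a rough illustration of the sample-gathering step, the sketch below serializes some made-up records to JSON and sanity-checks the corpus size before training. The record shape and the `build_samples` / `enough_samples?` helpers are hypothetical, and the 100x sizing rule of thumb comes from the general guidance in zstd's dictionary-builder notes rather than from rzstd itself. Only the commented-out `RZstd::Dictionary.train` call is taken from the example above.

```ruby
require "json"

# Hypothetical message shape; substitute payloads captured from your own traffic.
def build_samples(count)
  count.times.map do |i|
    JSON.generate({ "id" => i, "type" => "order", "status" => "open", "qty" => i % 10 })
  end
end

# Rule of thumb from zstd's dictionary-builder notes: the sample corpus should
# total roughly 100x the target dictionary size. Treated here as a soft check.
def enough_samples?(samples, capacity:, factor: 100)
  samples.sum(&:bytesize) >= capacity * factor
end

samples  = build_samples(1000)
capacity = 4 * 1024 # small target size chosen for this toy corpus

puts "samples: #{samples.size}, total bytes: #{samples.sum(&:bytesize)}"
puts "enough for a #{capacity}-byte dictionary? #{enough_samples?(samples, capacity: capacity)}"

# With rzstd loaded, training would then follow the example above:
#   dict_bytes = RZstd::Dictionary.train(samples, capacity: capacity)
```

If the check comes back false, gathering more samples or lowering the target capacity are the usual options; either way the trained dictionary is best judged by measuring compressed sizes on held-out messages.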
Dictionary#id is the first four bytes of sha256(dict_bytes), read as a
little-endian u32. It is intended for out-of-band peer negotiation
(e.g. via a dict:sha256:<hex> profile string in your application
protocol). Raw-content Zstd dictionaries always carry a frame dictID
of 0 by spec, so this id is not embedded in the on-wire frame itself.
Wrong-dict decoding is caught by the content checksum the encoder
enables: decompressing with the wrong dictionary raises
RZstd::DecompressError instead of returning corrupt bytes.
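The id derivation described above can be reproduced with Ruby's standard library alone, which is handy when a peer without rzstd needs to compute or verify the same value. This is a minimal sketch assuming the id is the first four digest bytes read as an unsigned little-endian 32-bit integer; whether it matches rzstd's internal computation byte for byte is an assumption. The `dict_id_from_bytes`, `dict_profile`, and `same_dictionary?` helpers are hypothetical names, and the profile string follows the `dict:sha256:<hex>` shape mentioned above.

```ruby
require "digest"

# Assumed derivation, following the description above: take the first four
# bytes of the SHA-256 digest and read them as a little-endian unsigned u32.
def dict_id_from_bytes(dict_bytes)
  Digest::SHA256.digest(dict_bytes).byteslice(0, 4).unpack1("V")
end

# Hypothetical profile string for out-of-band negotiation; the exact format
# is up to the application protocol.
def dict_profile(dict_bytes)
  "dict:sha256:#{Digest::SHA256.hexdigest(dict_bytes)}"
end

# A receiver might compare the advertised profile with its local dictionary
# before attempting to decode anything.
def same_dictionary?(advertised_profile, local_dict_bytes)
  advertised_profile == dict_profile(local_dict_bytes)
end

dict_bytes = "example dictionary bytes".b # stand-in for File.binread("schema.dict")
puts format("id: 0x%08x", dict_id_from_bytes(dict_bytes))
puts dict_profile(dict_bytes)
```

Comparing the full hex digest during negotiation avoids relying on the 32-bit id alone, which is short enough that two different dictionaries could plausibly collide.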
Ractor safety
The extension is marked Ractor-safe. Dictionary instances are
shareable. Module-level RZstd.compress / RZstd.decompress use a
single global CCtx / DCtx behind a Mutex, which serializes
calls across Ractors. If you need parallel throughput, give each
Ractor its own Dictionary (each instance owns its own
contexts).