rzstd

License: MIT

Ractor-safe Zstandard bindings for Ruby with persistent contexts.

rzstd provides Zstd frame compression and decompression at the module level, plus a stateful Dictionary class for dictionary-bound compression. Internally it keeps ZSTD_CCtx / ZSTD_DCtx state alive across calls instead of allocating a fresh ~256 KB context each time. That reuse is what makes it viable for small-message workloads, where the upstream zstd-ruby gem loses to LZ4 purely on context-allocation overhead.

API mirrors rlz4 0.2.x:

require "rzstd"

# Module-level frame compression
ct = RZstd.compress("the quick brown fox", level: 3)  # level: kwarg, default 3
RZstd.decompress(ct)                                  # => "the quick brown fox"

# Negative levels enable Zstd's fast strategy (trades ratio for speed).
# Supported range: -131072..22. Typical useful range: -7..19.
RZstd.compress(payload, level: -3)                    # fast strategy, low ratio
RZstd.compress(payload, level: 19)                    # high ratio, slow

# Dict-bound compression
dict = RZstd::Dictionary.new(File.binread("schema.dict"), level: -3)
dict.id                                               # => u32: first 4 bytes of sha256(dict), little-endian
dict.size                                             # => byte length
ct = dict.compress("payload that shares the schema")
dict.decompress(ct)                                   # => "payload that shares the schema"

# Dictionary training from sample payloads (wraps ZDICT_trainFromBuffer).
# Gather representative messages, then train a dictionary once and reuse
# it on both peers. Small-message workloads benefit the most.
samples = 1000.times.map { generate_sample_message }
dict_bytes = RZstd::Dictionary.train(samples, capacity: 64 * 1024)
dict = RZstd::Dictionary.new(dict_bytes)

Dictionary#id is the first 4 bytes of sha256(dict_bytes), interpreted as a little-endian u32. It is intended for out-of-band peer negotiation (e.g. via a dict:sha256:<hex> profile string in your application protocol). Raw-content Zstd dictionaries always carry a frame dictID of 0 by spec, so this id is not embedded in the on-wire frame itself. Decoding with the wrong dictionary is caught by the content checksum that the encoder enables: a peer using the wrong dictionary raises RZstd::DecompressError instead of returning corrupt bytes.
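The id derivation described above can be reproduced with Ruby's standard library alone, which is handy when a peer wants to check a dictionary before loading it. The following is a minimal sketch, not rzstd's own code: dict_id_for and dict_profile_for are hypothetical helper names, and the profile helper assumes that <hex> means the full hex-encoded SHA-256 digest, which the text above does not specify.

```ruby
require "digest"

# Hypothetical helper (not part of rzstd): recompute the id from raw
# dictionary bytes, following the derivation described in this README.
def dict_id_for(dict_bytes)
  # Take the first 4 bytes of the SHA-256 digest and read them as an
  # unsigned 32-bit little-endian integer ("V" is the u32 LE directive).
  Digest::SHA256.digest(dict_bytes).byteslice(0, 4).unpack1("V")
end

# Hypothetical helper: build a profile string for out-of-band negotiation.
# Assumption: <hex> is the full hex digest; adjust to your own protocol.
def dict_profile_for(dict_bytes)
  "dict:sha256:#{Digest::SHA256.hexdigest(dict_bytes)}"
end

dict_bytes = "example dictionary bytes".b
id = dict_id_for(dict_bytes)
puts format("id=%d (0x%08x)", id, id)
puts dict_profile_for(dict_bytes)
```

If the derivation matches, dict_id_for(File.binread("schema.dict")) should equal the value returned by Dictionary#id for the same file. Treat a mismatch as a sign that this sketch's assumptions differ from the gem's implementation.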

Ractor safety

The extension is marked Ractor-safe. Dictionary instances are shareable. Module-level RZstd.compress / RZstd.decompress use a single global CCtx / DCtx behind a Mutex, which serializes calls across Ractors. If you need parallel throughput, give each Ractor its own Dictionary, since each instance owns its own contexts.