MultiCompress 🗜️

Status: functional, well-tested, and actively evolving. The current release is suitable for real workloads, but the API and implementation details are still being refined in upcoming releases.

Modern compression technology: zstd, lz4, brotli — unified compression platform with native C performance, fiber-friendly for modern async Ruby stacks.

Bundled library versions in the current release:

zstd 1.5.2
lz4 1.10.0
brotli 1.1.0

📖 Get Started → — Complete technology overview, algorithms, and implementation details

Technology Overview

MultiCompress is a comprehensive compression system that unites three cutting-edge algorithms in a single platform. Modern algorithms are 3–10x faster than traditional zlib while providing superior compression ratios.

Algorithm	Strength	Best for
zstd	Best speed/ratio balance	Cache, logs, backups
lz4	Fastest compress/decompress	IPC, hot cache, real-time
brotli	Best ratio for HTTP	Web assets, API responses

How It Works

MultiCompress packages modern compression algorithms (zstd, lz4, brotli) with their native C libraries, providing a unified interface. The system includes vendored sources of all compression libraries, eliminating external dependencies.

Key Design Principles

Dictionary support: Runtime dictionary use is supported for zstd and brotli; Zstd dictionary training is available in the current release line
Zero external dependencies: All C libraries are vendored and compiled
Unified API: Same interface for all algorithms — just change the algo: parameter
Performance first: Direct bindings to C libraries, minimal overhead
Fiber-friendly: Compression and decompression cooperate with Ruby's fiber scheduler — safe to use under async, falcon, or any Fiber::Scheduler-based runtime without blocking the event loop. See GET_STARTED.md for details and examples.
Memory efficient: Streaming support for large datasets, proper resource cleanup
Operationally focused: Clear errors, comprehensive tests, and streaming support for practical workloads

Algorithm Auto-Detection

The system can automatically detect compression algorithms when decompressing data:

zstd: Detected by magic bytes 28 B5 2F FD (little-endian)
lz4: Detected by internal format header validation (custom internal format, NOT compatible with the standard lz4 CLI; optional standard frame support may be added in a future release)
brotli: Requires explicit algo: :brotli parameter - no auto-detection

Important: Auto-detection only works for ZSTD and LZ4. Brotli data must be decompressed with explicit algorithm specification.

Security: Decompression now enforces a default 256MB output cap, cumulative streaming limits, a default ratio guard of 1000:1, and a 32MB dictionary file size cap.

Security limits

Decompression-facing APIs support conservative defaults intended to protect against decompression bombs and accidental resource spikes:

Default output cap: 256MB
Streaming cumulative cap: enforced across the lifetime of an Inflater/Reader
Default ratio guard: 1000:1
Trusted-input opt-out: pass max_ratio: nil
Dictionary file size cap: 32MB for MultiCompress::Dictionary.load

Examples:

MultiCompress.decompress(blob, algo: :zstd, max_output_size: 64 * 1024 * 1024)
MultiCompress.decompress(blob, algo: :brotli, max_ratio: nil)

MultiCompress::Reader.open("archive.zst", max_output_size: 128 * 1024 * 1024, max_ratio: 500) do |reader|
  puts reader.read
end

max_output_size: nil keeps the native default cap of 256MB. max_ratio: nil disables the ratio guard for trusted input.

Algorithm Comparison

Algorithm	Speed	Ratio	Best Use Case
lz4	Fastest	Good	Real-time processing, IPC, hot cache paths
zstd	Fast	Excellent	General purpose, logs, backups, web APIs
brotli	Slower	Best	Static assets, CDN, long-term storage

Benchmark Results

📝 Note on v0.2.0: Performance numbers below are from the v0.2.0 build with fiber-friendly paths enabled. There is no throughput regression compared to v0.1.2 — the fiber-friendly path is only taken when a Fiber::Scheduler is active, and even then the worker-thread overhead is negligible for payloads large enough to benefit.

Performance comparison against Ruby's built-in zlib compression (200 iterations per test):

🗜️ COMPRESSION RATIO (%, lower is better)

┌─────────────────────────────┬─────────┬─────────┬─────────┬─────────┐
│ Configuration               │  zlib   │   lz4   │  zstd   │ brotli  │
├─────────────────────────────┼─────────┼─────────┼─────────┼─────────┤
│ Small JSON (~10KB, GC)      │    9.4% │   16.1% │    6.9% │    6.1% │
│ Small text (~10KB, GC)      │    3.1% │    4.6% │    3.2% │    2.6% │
│ Small JSON (~10KB, no GC)   │    9.4% │   16.1% │    6.9% │    6.1% │
│ Small text (~10KB, no GC)   │    3.1% │    4.6% │    3.2% │    2.6% │
│ Medium JSON (~370KB, GC)    │    8.5% │   15.7% │    6.7% │    5.5% │
│ Medium logs (~168KB, GC)    │    8.6% │   17.2% │    5.4% │    3.2% │
│ Medium JSON (~370KB, no GC) │    8.5% │   15.7% │    6.7% │    5.5% │
│ Medium logs (~168KB, no GC) │    8.6% │   17.2% │    5.4% │    3.2% │
│ Large JSON (~1.6MB, GC)     │    8.1% │   15.1% │    6.1% │    5.6% │
│ Large logs (~600KB, GC)     │    7.6% │   16.0% │    2.8% │    2.1% │
│ Large JSON (~1.6MB, no GC)  │    8.1% │   15.1% │    6.1% │    5.6% │
│ Large logs (~600KB, no GC)  │    7.6% │   16.0% │    2.8% │    2.1% │
└─────────────────────────────┴─────────┴─────────┴─────────┴─────────┘

⚡ TOTAL TIME (compress + decompress, ms — lower is faster)

┌─────────────────────────────┬─────────┬─────────┬─────────┬─────────┐
│ Configuration               │  zlib   │   lz4   │  zstd   │ brotli  │
├─────────────────────────────┼─────────┼─────────┼─────────┼─────────┤
│ Small JSON (~10KB, GC)      │    0.05 │    0.01 │    0.02 │    0.14 │
│ Small text (~10KB, GC)      │    0.04 │    0.00 │    0.01 │    0.11 │
│ Small JSON (~10KB, no GC)   │    0.06 │    0.01 │    0.02 │    0.13 │
│ Small text (~10KB, no GC)   │    0.04 │    0.00 │    0.01 │    0.11 │
│ Medium JSON (~370KB, GC)    │    2.73 │    0.29 │    0.42 │    2.36 │
│ Medium logs (~168KB, GC)    │    1.23 │    0.14 │    0.18 │    0.92 │
│ Medium JSON (~370KB, no GC) │    2.72 │    0.28 │    0.41 │    2.41 │
│ Medium logs (~168KB, no GC) │    1.26 │    0.13 │    0.18 │    0.96 │
│ Large JSON (~1.6MB, GC)     │   12.44 │    1.38 │    1.96 │   12.44 │
│ Large logs (~600KB, GC)     │    4.29 │    0.46 │    0.49 │    2.85 │
│ Large JSON (~1.6MB, no GC)  │   12.22 │    1.28 │    1.86 │   11.83 │
│ Large logs (~600KB, no GC)  │    4.39 │    0.42 │    0.44 │    2.86 │
└─────────────────────────────┴─────────┴─────────┴─────────┴─────────┘

📊 SPEEDUP vs ZLIB (higher is better)

┌─────────────────────────────┬─────────┬─────────┬─────────┬─────────┐
│ Configuration               │  zlib   │   lz4   │  zstd   │ brotli  │
├─────────────────────────────┼─────────┼─────────┼─────────┼─────────┤
│ Small JSON (~10KB, GC)      │   1.00x │   5.00x │   2.50x │   0.36x │
│ Small text (~10KB, GC)      │   1.00x │     N/A │   4.00x │   0.36x │
│ Small JSON (~10KB, no GC)   │   1.00x │   6.00x │   3.00x │   0.46x │
│ Small text (~10KB, no GC)   │   1.00x │     N/A │   4.00x │   0.36x │
│ Medium JSON (~370KB, GC)    │   1.00x │   9.41x │   6.50x │   1.16x │
│ Medium logs (~168KB, GC)    │   1.00x │   8.79x │   6.83x │   1.34x │
│ Medium JSON (~370KB, no GC) │   1.00x │   9.71x │   6.63x │   1.13x │
│ Medium logs (~168KB, no GC) │   1.00x │   9.69x │   7.00x │   1.31x │
│ Large JSON (~1.6MB, GC)     │   1.00x │   9.01x │   6.35x │   1.00x │
│ Large logs (~600KB, GC)     │   1.00x │   9.33x │   8.76x │   1.51x │
│ Large JSON (~1.6MB, no GC)  │   1.00x │   9.55x │   6.57x │   1.03x │
│ Large logs (~600KB, no GC)  │   1.00x │  10.45x │   9.98x │   1.53x │
└─────────────────────────────┴─────────┴─────────┴─────────┴─────────┘

Dependencies for benchmarking:

memory_profiler — Memory usage analysis
benchmark-ips — Iterations per second benchmarking

Or use the build script:

./build.sh

Requirements

Ruby >= 3.1.0
C compiler (gcc, clang)

Contributing

Fork it
Create your feature branch (git checkout -b feature/my-feature)
Commit your changes (git commit -am 'Add feature')
Push to the branch (git push origin feature/my-feature)
Create a Pull Request

License

MIT — see LICENSE.txt.