blusher
A faster, drop-in, byte-for-byte-compatible alternative backend for Ruby's
rouge, powered by the Rust carmine engine.
require "blusher" routes rouge's lexing — and, for the common case, its HTML
formatting — through carmine, which executes rule tables extracted from rouge's
own lexers. carmine either produces byte-identical output or declines,
in which case blusher falls back to rouge unchanged. Zero code change, zero
divergence; ~1.7× faster highlighting to HTML on a mixed corpus (more on
large files), parity elsewhere.
require "rouge"
require "blusher" # ← that's it
# The hot path: lexing + HTML formatting fused in Rust, one String returned.
html = Rouge.highlight(File.read("data.json"), "json", "html")
Performance
For the HTML-highlighting path — what rouge is overwhelmingly used for —
blusher is ~1.7× faster on a mixed real corpus and 2.5–2.7× on individual
files, with byte-identical output. Measured over rouge's own 126 routable
visual sample files (690 KiB) rendered to HTML with Rouge::Formatters::HTML:
          ms/pass   MB/s   speedup
rouge       187.1    3.8
blusher     110.2    6.4   1.70×
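The MB/s and speedup figures follow directly from the corpus size and the per-pass times. A quick check in Ruby, assuming "MB" here means 10^6 bytes (an assumption on my part; it is the reading under which the published figures line up):

```ruby
# Recompute the derived figures from the raw measurements above.
# Assumption: "MB" = 10**6 bytes; the corpus is 690 KiB as stated.
CORPUS_BYTES = 690 * 1024

# Throughput in MB/s for one pass over the corpus taking `ms_per_pass` ms.
def mb_per_s(ms_per_pass)
  (CORPUS_BYTES / 1e6) / (ms_per_pass / 1000.0)
end

rouge_ms   = 187.1
blusher_ms = 110.2

puts format("rouge   %.1f MB/s", mb_per_s(rouge_ms))
puts format("blusher %.1f MB/s", mb_per_s(blusher_ms))
puts format("speedup %.2fx", rouge_ms / blusher_ms)
```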
The trick is not faster lexing in isolation. A Ruby lexer's cost is
dominated by allocating Ruby objects at the boundary — one String + one Array
per token — which both engines pay identically, so lex alone is only ~1.0×
even though carmine's core is ~4.6× faster Rust-to-Rust. But a token stream is
just an intermediate: the real output is an HTML string. So blusher fuses
lexing and HTML formatting in Rust and returns one String, crossing the Ruby
boundary once (O(1)) instead of once per token (O(n)). That is where carmine's
speed finally shows up end-to-end. (Two supporting wins: a thread-local cache of
the parsed/compiled table so the ~0.5 ms build doesn't repeat per call, and
resolving CSS class names in Rust.)
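The difference between the two pipeline shapes can be sketched in plain Ruby. Everything here (the `Toy` module, its two-class tokenizer, the span markup) is hypothetical and is not rouge's or carmine's code. In pure Ruby the fused version still allocates a String per match, so this only shows the shape of the change; the saving described above comes from running that loop in Rust and handing Ruby a single String.

```ruby
# Hypothetical toy, not rouge or carmine: shows the two pipeline shapes only.
module Toy
  ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

  def self.escape(text)
    text.gsub(/[&<>"]/, ESCAPES)
  end

  # Stage 1 of the two-stage shape: build a [css_class, text] pair per token.
  def self.tokens(src)
    src.scan(/\w+|\W+/).map { |text| [text.match?(/\A\w/) ? "w" : "o", text] }
  end

  # Two-stage shape: materialise the whole token list, then format it.
  def self.lex_then_format(src)
    tokens(src).map { |cls, text| %(<span class="#{cls}">#{escape(text)}</span>) }.join
  end

  # Fused shape: scan and append into one buffer; no token pairs are built
  # and only the final String is returned to the caller.
  def self.fused_format(src)
    out = +""
    src.scan(/\w+|\W+/) do |text|
      cls = text.match?(/\A\w/) ? "w" : "o"
      out << %(<span class="#{cls}">) << escape(text) << "</span>"
    end
    out
  end
end

src = %({"a": [1, "<b>"]})
puts Toy.fused_format(src)
puts Toy.lex_then_format(src) == Toy.fused_format(src)
```

Both methods must return the same bytes for the fusion to be a safe substitution, which is the property the final line checks for this one input.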
Scope and honesty:
- The fast path applies when the whole pipeline is format(lex(src)) with an
  unadorned Rouge::Formatters::HTML (the Rouge.highlight / Jekyll / kramdown
  default). Subclasses (HTMLInline, HTMLTable, Pygments, …), the
  token-streaming block form, and direct token consumers transparently fall
  back to rouge: same output, no speedup.
- It applies to the 126 of rouge's 227 lexers that are callback-free (JSON,
  SQL, YAML, CSS, many config/markup formats). The other 101 use rouge proc
  rules that carmine can't execute; blusher detects this up front and runs
  rouge for them (parity, no wasted work). So a JSON/SQL/config-heavy workload
  wins big; a Ruby/Python-heavy one trends toward parity.
- carmine's raw 4.6× is fully realized only Rust-to-Rust (e.g. embedded in
  rubyrs, no Ruby boundary at all).
How it works
- require "blusher" aliases the original Rouge::RegexLexer#lex to
  __blusher_rouge_lex and replaces it. For a routable lexer (table exists and
  is callback-free), lex without a block returns a deferred
  Blusher::Shim::TokenStream holding (lexer, source); nothing is lexed yet.
- The patched Rouge::Formatters::HTML#format recognises that stream and calls
  the fused Blusher::Engine.format_html, which lexes and formats in Rust and
  returns one HTML String. Any other consumer (a different formatter, the
  block form, .to_a) just iterates the stream, which lexes via carmine on
  demand and yields the same [Token, value] pairs rouge would.
- carmine declines anything it can't reproduce identically (callback rules,
  recursion, …) and blusher falls back to rouge, so the output is always
  exactly rouge's, verified against the full lexer spec suite.
- The native backend is the rb-sys/magnus extension (blusher.{bundle,so}). A
  carmine-ffi + Fiddle path is kept as a dependency-light fallback (it
  marshals tokens through JSON, can't fuse, and is for correctness only).
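The alias, deferred stream, and fallback pattern described above can be sketched with stand-in classes. All names here (`ToyLexer`, `DeferredStream`, `ToyEngine`, `ToyFormatter`, `__original_lex`) and the ASCII-only decline rule are hypothetical; the sketch emits tag-like text without escaping and simplifies routing, so treat it as an illustration of the control flow rather than blusher's implementation.

```ruby
# Hypothetical stand-ins only; none of these are rouge's or blusher's classes.
class ToyLexer
  def lex(src)
    src.split(/(\s+)/).reject(&:empty?).map { |t| [t.match?(/\A\s/) ? :space : :word, t] }
  end
end

# Deferred stream: remembers (lexer, source) and lexes only when iterated.
class DeferredStream
  include Enumerable
  attr_reader :lexer, :source

  def initialize(lexer, source)
    @lexer = lexer
    @source = source
  end

  def each(&block)
    lexer.__original_lex(source).each(&block)
  end
end

# Stand-in for the native engine: returns nil to decline an input.
module ToyEngine
  def self.format_html(src)
    return nil unless src.ascii_only? # hypothetical decline rule
    src.gsub(/\s+|\S+/) { |t| t.match?(/\A\s/) ? "<space>#{t}</space>" : "<word>#{t}</word>" }
  end
end

# The patch: keep the original under an alias, then replace it.
class ToyLexer
  alias_method :__original_lex, :lex

  def lex(src, &block)
    # The block form streams tokens, so it goes straight to the original.
    return __original_lex(src).each(&block) if block
    DeferredStream.new(self, src)
  end
end

class ToyFormatter
  def format(tokens)
    if tokens.is_a?(DeferredStream)
      fused = ToyEngine.format_html(tokens.source)
      return fused if fused # fast path: one String comes back
    end
    # Fallback: iterate the tokens, which lexes on demand for a deferred stream.
    tokens.map { |type, text| "<#{type}>#{text}</#{type}>" }.join
  end
end

lexer = ToyLexer.new
formatter = ToyFormatter.new
puts formatter.format(lexer.lex("select 1"))     # engine accepts
puts formatter.format(lexer.lex("caf\u00e9 1"))  # engine declines, fallback runs
p lexer.lex("a b").to_a                          # other consumers just iterate
```

The caller never sees which path ran: both return the same String for the same input, and a consumer that is not the patched formatter gets ordinary token pairs.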
Correctness
Verified against rouge v5.0.0's full lexer spec suite: 757 runs, 5130
assertions, 0 failures (rake spec). The spec suite is the correctness gate —
any new divergence must be fixed in carmine or the rule forced to decline.
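The idea behind such a gate is a differential check: run a reference and a candidate over the same inputs and report every input where the outputs differ. A minimal sketch, with hypothetical callables standing in for rouge and blusher (this is not the project's spec suite):

```ruby
# Hypothetical differential check; `reference` and `candidate` are any two
# callables that map source text to output text.
def divergences(samples, reference, candidate)
  samples.each_with_object([]) do |(name, src), bad|
    bad << name unless reference.call(src) == candidate.call(src)
  end
end

reference = ->(src) { src.upcase }
candidate = ->(src) { src.tr("a-z", "A-Z") } # diverges on non-ASCII letters

samples = { "ascii" => "hello", "accented" => "h\u00e9llo" }
p divergences(samples, reference, candidate)
```

An empty result means parity on the sampled inputs; any name in the list marks a case where the candidate must either be fixed or made to decline.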
Build (dev, in the rubyrs monorepo)
rake compile # build the magnus ext → lib/blusher.<dlext>
rake compile_ffi # (optional) build the carmine-ffi cdylib fallback → ext/
rake tables # regenerate lib/blusher/tables/<tag>.json from installed rouge
ROUGE_SRC=/path/to/rouge rake spec
Status
Part of momiji-rs — Rust-backed engines for the Ruby ecosystem. Tables are derived from rouge (MIT, © Jeanine Adkisson and contributors).