SmarterJSON
A lenient, fast JSON parser for Ruby. It parses strict JSON, JSON5, HJSON-style config, and the messy JSON-ish input humans actually write, and in benchmarks it matches or beats Oj on nearly every file. SmarterJSON is opinionated: it wants your JSON processing to succeed. Other parsers are strict and stop at the first deviation; SmarterJSON keeps going, because it optimizes for getting your data out, not for policing the JSON spec.
Why SmarterJSON?
Most JSON parsers reject anything that isn't perfectly strict JSON. SmarterJSON is built on the opposite principle: you shouldn't have to care what flavor of JSON you were handed. Give it strict JSON, JSON5, an HJSON-style config file, newline-delimited JSON, or a copy-pasted blob with comments and trailing commas — it just parses it.
Three things set it apart:
- One parser, no modes, no flags. There is no dialect: option and no "strict mode"; SmarterJSON.process(input) accepts the whole superset, and strict JSON is simply the narrowest case. You don't configure the parser to match your input; it adapts to whatever you give it.
- It parses multi-document input automatically, a distinguishing feature. SmarterJSON.process handles NDJSON / JSONL / concatenated JSON with no block and no special method: one document returns its value, several documents return an Array, empty input returns nil. Only SmarterJSON parses multi-document input via plain process; Oj and the stdlib json library raise without a block. For input larger than memory, pass a block to stream one document at a time.
- It's fast. A C extension (with a pure-Ruby fallback that runs everywhere) puts it ahead of Oj on nearly every file we benchmark, and competitive with the stdlib json C parser, the fastest general-purpose Ruby JSON parser.
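For contrast, here is a minimal sketch of what multi-document input looks like with a strict, single-document parser. It uses only the stdlib json library; the parse_ndjson helper is a hypothetical name for illustration and is not part of SmarterJSON.

```ruby
require "json"

ndjson = %({"id":1}\n{"id":2}\n{"id":3})

# A strict single-document parse rejects the input at the second document.
strict_error =
  begin
    JSON.parse(ndjson)
    nil
  rescue JSON::ParserError => e
    e.class
  end

# Hypothetical helper: the manual line-splitting workaround a strict parser needs.
def parse_ndjson(text)
  text.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

docs = parse_ndjson(ndjson)
```

The workaround only covers one document per line; concatenated documents without newlines would need a different approach.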
What it accepts, beyond strict JSON
- //, /* … */, and # comments (a # or // only starts a comment when preceded by whitespace, so url: http://x.com parses as a string, not a truncated value)
- Trailing commas; unquoted keys ({host: localhost}); single-quoted, triple-quoted ('''…'''), and quoteless string values
- Implicit root object: a config file that starts with key: value, no outer {}
- NaN, Infinity, hex (0xFF), leading + / ., underscores in numbers (1_000_000)
- UTF-8 BOM, smart/curly quotes, Python literals (True/False/None), JavaScript undefined
- Mixed CR / LF / CRLF line endings, and any Ruby-supported input encoding (via encoding:)
- Duplicate keys (last value wins by default; configurable)
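The whitespace rule for comment openers in the first item can be sketched roughly as follows. This is an illustration only, not SmarterJSON's implementation: strip_line_comment is a hypothetical helper, it assumes the start of a line also counts as a valid comment position, and it does not track quoted strings.

```ruby
# Hypothetical helper: drop a trailing comment from a single line, treating
# "#" or "//" as a comment opener only at the start of the line or directly
# after whitespace. Simplification: quoted strings are not tracked.
def strip_line_comment(line)
  index = line =~ %r{(?:\A|(?<=\s))(?:\#|//)}
  index ? line[0...index].rstrip : line.rstrip
end

strip_line_comment("url: http://x.com")     # => "url: http://x.com"
strip_line_comment("port: 5432 # default")  # => "port: 5432"
```

Because the // in http://x.com follows a colon rather than whitespace, the URL survives intact.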
It raises only on genuinely unparseable input (unterminated string, mismatched bracket), with line and column in the message — never on valid-but-lenient input.
Installation
# Gemfile
gem "smarter_json"
# Or install it directly from the shell:
gem install smarter_json
The C extension is built on install and used automatically. On platforms where it can't build, the pure-Ruby parser runs instead and produces identical results.
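The general "native if available, pure Ruby otherwise" loading pattern can be sketched as below. Every name here (FallbackDemo, load_backend, the feature path) is hypothetical; this shows the common Ruby idiom, not SmarterJSON's actual load code.

```ruby
# Hypothetical module illustrating the fallback idiom.
module FallbackDemo
  # Try to load a compiled extension; fall back when it is not available.
  def self.load_backend(native_feature)
    require native_feature   # raises LoadError if the extension is missing
    :native
  rescue LoadError
    :pure_ruby
  end
end

# The feature name is deliberately one that does not exist.
backend = FallbackDemo.load_backend("fallback_demo/native_ext_that_does_not_exist")
```

Rescuing LoadError at require time is what lets the same gem install cleanly on platforms without a working compiler.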
Usage
require "smarter_json"
SmarterJSON.process('{"a": 1, "b": [2, 3]}') # => {"a"=>1, "b"=>[2, 3]}
SmarterJSON.process("host: localhost\nport: 5432") # => {"host"=>"localhost", "port"=>5432} (no braces needed)
SmarterJSON.process_file("config.json5") # read a file, then parse
# Multiple documents (NDJSON / JSONL / concatenated) — no block, no special method:
SmarterJSON.process(%({"id":1}\n{"id":2}\n{"id":3})) # => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
SmarterJSON.process('{"id":1}') # => {"id"=>1} (one document → the value itself)
SmarterJSON.process("") # => nil (zero documents)
# For input larger than memory, stream one document at a time with a block
# (process and process_file both forward the block):
SmarterJSON.process_file("events.ndjson") { |event| EventJob.perform_async(event) }
Options
| option | default | meaning |
|---|---|---|
| symbolize_keys | false | return object keys as Symbols instead of Strings |
| duplicate_key | :last_wins | :last_wins / :first_wins / :raise for repeated keys in one object |
| bigdecimal_load | :auto | :auto keeps high-precision decimals as BigDecimal; :float forces Float; :bigdecimal forces BigDecimal |
| acceleration | true | true uses the C extension when compiled and loadable; false forces pure Ruby (identical results) |
| encoding | "UTF-8" | labels the input's encoding (no transcoding pass; see below) |
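To make the symbolize_keys and duplicate_key semantics concrete, here is a small pure-Ruby sketch that applies them to a list of already-parsed key/value pairs. build_object is a hypothetical helper, and the ArgumentError it raises for :raise is an assumption; the error class SmarterJSON actually uses is not stated here.

```ruby
# Hypothetical helper that mimics the documented option semantics; it is not
# SmarterJSON's implementation.
def build_object(pairs, symbolize_keys: false, duplicate_key: :last_wins)
  pairs.each_with_object({}) do |(key, value), object|
    key = key.to_sym if symbolize_keys
    if object.key?(key)
      case duplicate_key
      when :first_wins then next   # keep the value already stored
      when :raise then raise ArgumentError, "duplicate key: #{key.inspect}"
      end
    end
    object[key] = value            # :last_wins overwrites the earlier value
  end
end

pairs = [["a", 1], ["b", 2], ["a", 3]]
build_object(pairs)                              # => {"a"=>3, "b"=>2}
build_object(pairs, duplicate_key: :first_wins)  # => {"a"=>1, "b"=>2}
build_object(pairs, symbolize_keys: true)        # => {a: 3, b: 2}
```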
Performance
Benchmarks: p10 of 40 runs, Apple M1 Max, Ruby 3.4.7, on the standard JSON corpus (canada, citm_catalog, twitter, github_events, …). The apples-to-apples comparisons are SmarterJSON/C vs Oj/strict vs stdlib json, all producing Float (run rake report in json_benchmarks/ for the full table — numbers vary run to run).
- vs Oj: SmarterJSON/C matches or beats Oj on nearly every file — typically 1.1–1.7× faster (e.g. deeply-nested ~1.7×, citm ~1.3×, twitter ~1.3×, usgs/weather ~1.2–1.3×).
- vs stdlib json (C): competitive with the fastest Ruby JSON parser. It matches json on number- and string-heavy files (e.g. big_decimals, string_array) and trails by ~1.2–1.6× on others.
- Numbers: floats are parsed with Ryū (correctly rounded, single-pass), so number-heavy data is fast and bit-exact.
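For readers who want to reproduce a "p10 of 40 runs" style measurement on their own data, the following is one possible sketch using only the stdlib. The nearest-rank percentile definition, the helper names, and the synthetic payload are all assumptions; the project's own rake report may compute its figures differently.

```ruby
require "json"

# Nearest-rank percentile: the smallest sample such that at least `pct`
# percent of all samples are at or below it.
def percentile(samples, pct)
  sorted = samples.sort
  rank = (pct * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Time `runs` executions of the block, in seconds, on a monotonic clock.
def time_runs(runs)
  Array.new(runs) do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

# Synthetic payload, made up for this example.
payload = JSON.generate((1..1_000).map { |i| { "id" => i, "name" => "item #{i}" } })
timings = time_runs(40) { JSON.parse(payload) }
p10 = percentile(timings, 10)
```

A low percentile such as p10 reports a near-best time, which damps noise from garbage collection and other processes better than a mean does.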
Two notes on fair comparison:
- NDJSON: on multi-document files, only SmarterJSON parses the input via plain process; Oj and json raise without a block, so their cells are N/A. That N/A reflects real default behavior, not a measurement gap. Plain process collects every document into an Array at ~270 MB/s; the streaming block form runs faster (~440 MB/s) because it doesn't hold all documents in memory at once. Use it for input larger than RAM.
- High-precision decimals (e.g. canada.json): SmarterJSON's default :auto mode preserves high-precision numbers as BigDecimal (matching Oj's default), which is intrinsically slower than Float. Against Float-producing parsers it looks slower on such files; pass bigdecimal_load: :float to compare like-for-like (it then runs much faster). Against the equivalent BigDecimal-producing Oj mode, SmarterJSON is faster.
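The Float versus BigDecimal trade-off in the second note can be seen with the stdlib json library alone, which accepts a decimal_class option. This sketch does not use SmarterJSON, and the 25-digit value is made up for illustration.

```ruby
require "json"
require "bigdecimal"

# A made-up value with more digits than a Float can hold.
source = '{"value": 0.1234567890123456789012345}'

as_float   = JSON.parse(source)["value"]                             # Float, rounded
as_decimal = JSON.parse(source, decimal_class: BigDecimal)["value"]  # BigDecimal, exact

# Round-tripping the Float through its string form loses the trailing digits.
lossy = BigDecimal(as_float.to_s) != as_decimal
```

Preserving every digit costs an arbitrary-precision allocation per number, which is why a Float-producing run is the like-for-like comparison against Float-producing parsers.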
Encoding
encoding: (default "UTF-8") labels what the input is — it does not trigger a transcoding pass. The parser works on the bytes in their native encoding and emits string values with the same encoding tag, the same way smarter_csv handles encodings. Bytes that are invalid for the claimed encoding raise SmarterJSON::EncodingError (a kind of SmarterJSON::ParseError).
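The difference between labelling and transcoding is a plain-Ruby distinction, shown here with core String methods only (no SmarterJSON calls). The sample bytes are the word "café" in ISO-8859-1.

```ruby
bytes = "caf\xE9".b   # raw bytes of "café" in ISO-8859-1, tagged as binary

# Labelling: same bytes, new encoding tag, no conversion pass.
labelled = bytes.dup.force_encoding(Encoding::ISO_8859_1)

# Transcoding: a conversion pass that rewrites the bytes as UTF-8.
transcoded = labelled.encode(Encoding::UTF_8)

# A wrong label leaves bytes that are invalid for the claimed encoding,
# which Ruby can detect up front.
mislabelled = bytes.dup.force_encoding(Encoding::UTF_8)

labelled.bytes              # => [99, 97, 102, 233]
transcoded.bytes            # => [99, 97, 102, 195, 169]
mislabelled.valid_encoding? # => false
```

The mislabelled case is the kind of input for which the paragraph above says an encoding error is raised.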
Nesting & untrusted input
Both the C extension and the pure-Ruby parser are iterative, not recursive — they track nesting on an explicit, heap-allocated stack rather than the call stack. So deeply nested input cannot overflow the call stack or segfault: nesting is bounded only by available memory, the same posture as Oj (which also ships no nesting limit; the stdlib json caps at 100). The deeply_nested.json benchmark (212 MB of nesting) parses without issue.
The trade-off: there is currently no fixed nesting or input-size limit, so the failure mode for extremely large or adversarially nested untrusted input is memory exhaustion rather than a crash. A hard cap for untrusted input is planned as an opt-in guard; for now, enforce a size limit upstream of the parser.
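One way to enforce a size limit upstream is to bound the read itself, so an oversized payload is rejected before it is fully buffered. The helper and error class below are hypothetical names for illustration, and the 1 KiB limit is arbitrary.

```ruby
require "stringio"

# Hypothetical error class for this sketch.
class InputTooLarge < StandardError; end

# Hypothetical helper: read at most max_bytes from an IO, raising instead of
# buffering an oversized payload.
def read_bounded(io, max_bytes:)
  data = io.read(max_bytes + 1) || ""   # one extra byte reveals an overflow
  raise InputTooLarge, "input exceeds #{max_bytes} bytes" if data.bytesize > max_bytes
  data
end

small = read_bounded(StringIO.new('{"ok": true}'), max_bytes: 1_024)
```

The returned string can then be handed to the parser. Note that IO#read with a length returns a binary-tagged string, so the encoding may need to be labelled before parsing.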
Development
After checking out the repo, run bin/setup to install dependencies, then rake compile to build the C extension and rake spec to run the tests. The test suite runs every example against both the C and pure-Ruby paths, so the two stay behavior-identical.
License
Available as open source under the terms of the MIT License.