Module: SmarterJSON
- Defined in:
- lib/smarter_json/backports.rb,
lib/smarter_json.rb,
lib/smarter_json/errors.rb,
lib/smarter_json/parser.rb,
lib/smarter_json/options.rb,
lib/smarter_json/version.rb,
lib/smarter_json/warning.rb,
lib/smarter_json/generator.rb,
ext/smarter_json/smarter_json.c
Overview
Refinement backport of Array#filter_map for Ruby < 2.7 (the gem supports >= 2.6.0).
filter_map shipped in Ruby 2.7. Rather than monkey-patching core Enumerable globally, this is a refinement scoped to the single file that needs it: parser.rb does `using SmarterJSON::Backports` (guarded to Ruby < 2.7). On 2.7+ the refinement is never activated, so the native (C) filter_map is used and this is a complete no-op.
DELETE this file, its require in lib/smarter_json.rb, and the `using` line in parser.rb once the minimum supported Ruby is >= 2.7.
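The refinement approach described above can be sketched as follows. This is a minimal illustration, not the gem's backports.rb: the module names FilterMapBackport and BackportDemo are hypothetical, and the version guard uses Gem::Version as one plausible way to express "Ruby < 2.7".

```ruby
# Hypothetical module name; a sketch of the technique, not the gem's own file.
module FilterMapBackport
  refine Array do
    # Map each element through the block and keep only the truthy results.
    def filter_map
      each_with_object([]) do |item, kept|
        mapped = yield(item)
        kept << mapped if mapped
      end
    end
  end
end

# Activate the refinement only where the native method is missing.
# On Ruby 2.7+ the condition is false, so the built-in filter_map runs.
using FilterMapBackport if Gem::Version.new(RUBY_VERSION) < Gem::Version.new("2.7")

# Hypothetical caller living in the same file as the `using` line.
module BackportDemo
  def self.doubled_evens(list)
    list.filter_map { |n| n * 2 if n.even? }
  end
end
```

Because a refinement is only visible in the file that calls `using`, other files and other gems never see the backported method.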
Defined Under Namespace
Modules: Backports, Bytes, Framer, Options, Recovery
Classes: EncodingError, Error, GenerateError, Generator, ParseError, Parser, Warning
Constant Summary
- HAS_ACCELERATION = respond_to?(:parse_c)
- VERSION = "1.0.0"
Class Method Summary
- .generate(obj, options = {}) ⇒ Object
  SmarterJSON.generate(obj, options = {}): write a Ruby value as JSON.
- .normalize_default_encoding(input, options) ⇒ Object
  Smart default for the nil :encoding option.
- .parse_c(input, opts) ⇒ Object
- .process(input, options = {}, &block) ⇒ Object
  SmarterJSON.process(input, options = {}): the main entry point.
- .process_file(path, options = {}, &block) ⇒ Object
  SmarterJSON.process_file(path, options = {}): open a file and process it.
- .process_one(input, options = {}) ⇒ Object
  SmarterJSON.process_one(input, options = {}): the single-document accessor.
Class Method Details
.generate(obj, options = {}) ⇒ Object
SmarterJSON.generate(obj, options = {}) — write a Ruby value as JSON.
- :json (default, standard JSON). Hash -> object, Array -> array, scalar -> scalar. Always valid, interoperable JSON.
- :ndjson (newline-delimited JSON). An Array writes one element per line; any other value writes as a single line. The inverse of process reading NDJSON back into an Array.
options: spaces per nesting level for pretty-printing (Integer, default 0 = compact). Empty objects/arrays stay inline. Not allowed with :ndjson (a record must be a single line); combining them raises ArgumentError.
Symbol keys/values are emitted as strings; BigDecimal as a JSON number. Unsupported types (Time, custom objects) and non-finite Floats raise SmarterJSON::GenerateError. Returns a String.
# File 'lib/smarter_json/generator.rb', line 24
    def generate(obj, options = {})
      Generator.new(options).generate(obj)
    end
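To make the two output shapes concrete, here is a rough sketch built on Ruby's standard json library rather than on SmarterJSON's Generator. The helper name sketch_generate and the keywords style: and spaces: are hypothetical, chosen only for illustration; whether the real :ndjson output ends with a trailing newline is not stated above, so this sketch simply joins records with "\n".

```ruby
require "json"

# Hypothetical helper and keyword names; not the gem's API.
def sketch_generate(obj, style: :json, spaces: 0)
  case style
  when :json
    # Compact by default; pretty-print when a positive indent width is given.
    spaces.zero? ? JSON.generate(obj) : JSON.pretty_generate(obj, indent: " " * spaces)
  when :ndjson
    # A record must stay on one line, so pretty-printing is rejected.
    raise ArgumentError, "pretty-printing cannot be combined with :ndjson" unless spaces.zero?

    records = obj.is_a?(Array) ? obj : [obj]
    records.map { |record| JSON.generate(record) }.join("\n")
  else
    raise ArgumentError, "unknown style: #{style.inspect}"
  end
end
```

Calling sketch_generate with an Array and style: :ndjson yields one compact JSON value per line, while a Hash yields a single line.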
.normalize_default_encoding(input, options) ⇒ Object
Smart default for the nil :encoding option. A String tagged ASCII-8BIT (BINARY) is how Net::HTTP and many HTTP libraries hand back a response body even when the bytes are UTF-8. JSON’s interchange encoding is UTF-8, so we relabel such input to UTF-8 when its bytes are valid UTF-8 — otherwise string values would come back tagged ASCII-8BIT and compare unequal to UTF-8 literals (a silent footgun). When the bytes are NOT valid UTF-8 we raise EncodingError rather than guess a legacy encoding — pass an explicit :encoding for that. An explicit (non-nil) :encoding, or any non-BINARY tag, is left untouched (the per-path force_encoding / validation handles it). Only relabels — never transcodes.
# File 'lib/smarter_json/parser.rb', line 121
    def normalize_default_encoding(input, options)
      return input unless options[:encoding].nil?
      return input unless input.encoding == Encoding::ASCII_8BIT
      utf8 = input.dup.force_encoding(Encoding::UTF_8)
      return utf8 if utf8.valid_encoding?
      raise EncodingError, "input is tagged ASCII-8BIT and is not valid UTF-8 — pass encoding: to declare its encoding"
    end
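The relabel-or-raise rule is easy to try in isolation. The sketch below uses a hypothetical standalone helper, relabel_binary_as_utf8, that mirrors the logic without the options argument, and it raises Ruby's core EncodingError rather than the gem's own class.

```ruby
# Hypothetical standalone helper; mirrors the rule described above.
def relabel_binary_as_utf8(input)
  # Anything not tagged ASCII-8BIT is left untouched.
  return input unless input.encoding == Encoding::ASCII_8BIT

  # Relabel a copy; the bytes themselves are never changed.
  utf8 = input.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  raise EncodingError, "bytes are not valid UTF-8"
end

# Hypothetical demo: a binary-tagged body holding UTF-8 bytes, as an HTTP client might return.
def relabel_demo
  body = "caf\u00E9".b
  fixed = relabel_binary_as_utf8(body)
  {
    equal_before: body == "caf\u00E9",   # different tags, non-ASCII bytes
    equal_after: fixed == "caf\u00E9",
    same_bytes: fixed.bytes == body.bytes
  }
end
```

The demo shows why the relabel matters: the same bytes compare unequal to a UTF-8 literal while tagged ASCII-8BIT, and equal once relabelled.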
.parse_c(input, opts) ⇒ Object
# File 'ext/smarter_json/smarter_json.c', line 1548
    static VALUE fj_parse_c(VALUE self, VALUE input, VALUE opts) {
        fj_state st;
        VALUE enc_opt, dk;
        Check_Type(input, T_STRING);
        enc_opt = rb_hash_aref(opts, fj_sym_encoding);
        if (!NIL_P(enc_opt)) {
            input = rb_funcall(rb_str_dup(input), fj_force_encoding_id, 1, enc_opt);
        }
        if (!RTEST(rb_funcall(input, fj_valid_encoding_p_id, 0))) {
            VALUE name = rb_funcall(rb_funcall(input, fj_encoding_id, 0), fj_name_id, 0);
            VALUE msg = rb_sprintf("invalid byte sequence for %" PRIsVALUE, name);
            rb_exc_raise(rb_funcall(cEncodingError, fj_new_id, 3, msg, Qnil, Qnil));
        }
        st.buf = RSTRING_PTR(input);
        st.len = RSTRING_LEN(input);
        st.pos = 0;
        st.enc = rb_enc_get(input);
        st.depth = 0;
    #ifdef HAVE_RB_ENC_INTERNED_STR
        fj_kc_slot kcache[FJ_KCACHE_SIZE];
        memset(kcache, 0, sizeof(kcache));
        st.kcache = kcache;
    #else
        st.kcache = NULL;
    #endif
        st.symbolize_keys = RTEST(rb_hash_aref(opts, fj_sym_symbolize_keys));
        dk = rb_hash_aref(opts, fj_sym_duplicate_key);
        st.dup_first_wins = (dk == fj_sym_first_wins);
        {
            VALUE bd = rb_hash_aref(opts, fj_sym_decimal_precision);
            if (bd == fj_sym_float) st.decimal_precision = 0;
            else if (bd == fj_sym_bigdecimal) st.decimal_precision = 2;
            else st.decimal_precision = 1; /* :auto (default), including nil */
        }
        st.on_warning = rb_hash_aref(opts, fj_sym_on_warning); /* Qnil when absent */
        if (st.len >= 3 && (unsigned char)st.buf[0] == 0xEF &&
            (unsigned char)st.buf[1] == 0xBB && (unsigned char)st.buf[2] == 0xBF) {
            st.pos = 3;
        }
        /* With a block: yield each top-level document until EOF and return the document
         * count (NDJSON / JSONL / concatenated). Same loop as the Ruby each_value path. */
        if (rb_block_given_p()) {
            long count = 0;
            for (;;) {
                VALUE v;
                fj_skip_document_separators(&st);
                if (fj_eof(&st)) break;
                v = fj_parse_iter(&st, fj_implicit_root_ahead(&st));
                fj_enforce_scalar_boundary(&st, v);
                rb_yield(v);
                count++;
            }
            return LONG2NUM(count);
        }
        /* No block: always return an Array of every top-level document (0 -> [], 1 ->
         * [doc], 2+ -> [d1, d2, …]), the always-array contract. Documents are separated by
         * newline / comma / concatenation (self-delimiting values); a space alone never
         * separates, and a bare scalar must be followed by a real separator, so `1 2 3`
         * raises while `1\n2\n3` and `1, 2, 3` are three documents. */
        {
            VALUE arr = rb_ary_new();
            for (;;) {
                VALUE v;
                fj_skip_document_separators(&st);
                if (fj_eof(&st)) break;
                v = fj_parse_iter(&st, fj_implicit_root_ahead(&st));
                fj_enforce_scalar_boundary(&st, v);
                rb_ary_push(arr, v);
            }
            return arr;
        }
    }
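The two return contracts in the C function (document count with a block, always an Array without one) and the BOM skip can be mimicked in a few lines of Ruby. This is a simplified sketch under stated assumptions: the helper sketch_each_document is hypothetical, it only understands newline-separated documents, and it delegates parsing to the standard json library instead of the extension's parser.

```ruby
require "json"

UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper; handles newline-separated documents only.
def sketch_each_document(text)
  bytes = text.b
  # Skip a leading UTF-8 byte order mark, as the C code does with st.pos = 3.
  bytes = bytes.byteslice(3, bytes.bytesize - 3) if bytes.start_with?(UTF8_BOM)
  lines = bytes.force_encoding(Encoding::UTF_8).each_line.map(&:strip).reject(&:empty?)

  if block_given?
    lines.each { |line| yield JSON.parse(line) }
    lines.length                            # with a block: the document count
  else
    lines.map { |line| JSON.parse(line) }   # without a block: always an Array
  end
end
```

With this shape a caller never has to check whether it received a bare value or a list: no documents gives [], one document gives a one-element Array.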
.process(input, options = {}, &block) ⇒ Object
SmarterJSON.process(input, options = {}) — the main entry point.
`input` is either a String of JSON content or an IO to read from. (A String is always content, never a filename; use process_file for paths.) The values in `options` override Parser::DEFAULT_OPTIONS.
Without a block: always returns an Array of the documents found ([] for none, [doc] for one, [d1, d2, …] for several; NDJSON / JSONL / concatenated). A top-level value must be a recognized JSON value (number / literal / quoted string / object / array) or an implicit-root object, else it raises. For the single-document case use SmarterJSON.process_one (returns the bare value).
:acceleration (default true) selects the C extension when compiled and loaded (SmarterJSON::HAS_ACCELERATION); otherwise the pure-Ruby parser.
With a block: yields each top-level document as it is parsed, and returns the document count. For an IO this streams document-by-document in bounded memory: it reads the stream as newline-delimited documents (NDJSON / JSONL), one per line.
# File 'lib/smarter_json/parser.rb', line 31
    def process(input, options = {}, &block)
      options = Options.(options)
      if input.is_a?(String)
        Recovery.process_string(input, options, &block)
      elsif input.respond_to?(:read)
        block ? stream_io(input, options, &block) : process(input.read, options)
      else
        raise ArgumentError, "SmarterJSON.process expects a String or an IO, got #{input.class}"
      end
    end
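How the :acceleration option and HAS_ACCELERATION might combine can be sketched with two stand-in modules. Everything here is hypothetical (WithExtension, WithoutExtension, sketch_backend); it only illustrates the documented rule that the C path is used when the option is true and the extension is present, and the pure-Ruby path otherwise.

```ruby
# Hypothetical stand-in for a build where the C extension loaded.
module WithExtension
  def self.parse_c(input, opts); end        # the extension would define this
  HAS_ACCELERATION = respond_to?(:parse_c)
end

# Hypothetical stand-in for a build without the extension.
module WithoutExtension
  HAS_ACCELERATION = respond_to?(:parse_c)
end

# :acceleration defaults to true, but only matters when the extension is present.
def sketch_backend(mod, options = {})
  options.fetch(:acceleration, true) && mod::HAS_ACCELERATION ? :c_extension : :pure_ruby
end
```

Feature detection through respond_to? means the pure-Ruby fallback needs no configuration: if the extension never loaded, the constant is simply false.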
.process_file(path, options = {}, &block) ⇒ Object
SmarterJSON.process_file(path, options = {}) — open a file and process it.
The :encoding option labels the file’s encoding (default “UTF-8”); it does NOT trigger a transcoding pass — the parser works on the bytes in their native encoding and emits string values with the same encoding tag. With a block, streams document-by-document straight from disk in bounded memory (never loading the whole file); the documents are read as newline-delimited (NDJSON / JSONL), one per line.
# File 'lib/smarter_json/parser.rb', line 50
    def process_file(path, options = {}, &block)
      options = Options.(options)
      encoding = options[:encoding] || "UTF-8"
      if block
        File.open(path, "r:#{encoding}") { |io| stream_io(io, options, &block) }
      else
        process(File.read(path, encoding: encoding), options)
      end
    end
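The "labels, does not transcode" behaviour comes from how Ruby's File.read treats an external encoding when no internal encoding is set. The sketch below demonstrates that with a temporary file; labeled_read_demo is a hypothetical helper, and it assumes Encoding.default_internal is nil (the Ruby default).

```ruby
require "tempfile"

# Hypothetical demo: write Latin-1 bytes, then read them back with an encoding label.
def labeled_read_demo
  Tempfile.create("labeled") do |file|
    file.binmode
    file.write("caf\xE9".b)   # "café" in ISO-8859-1: a single byte for the é
    file.flush

    text = File.read(file.path, encoding: "ISO-8859-1")
    { encoding: text.encoding, bytes: text.bytes }
  end
end
```

The string comes back tagged ISO-8859-1 with its four original bytes intact, which is why values read this way keep the file's encoding tag instead of arriving as UTF-8.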
.process_one(input, options = {}) ⇒ Object
SmarterJSON.process_one(input, options = {}) — the single-document accessor.
Returns the first document’s value (or nil when the input holds no documents). When the input holds MORE than one document it returns the first and warns once; it never raises, since an extra document is valid data. The warning goes to on_warning if set, else Rails.logger.warn when Rails is loaded, else Kernel#warn. For an IO this is bounded memory: it parses just the first document and stops as soon as a second is seen, instead of materialising the whole stream the way process(io).first would. (Taking the first element yourself, as in process(input).first, silently drops documents 2+, a footgun; use process_one instead.)
# File 'lib/smarter_json/parser.rb', line 70
    def process_one(input, options = {})
      options = Options.(options)
      # IO: bounded memory; parse just the first document and stop once a second is
      # seen (peek-to-warn). A String is already in memory, so use the plain no-block
      # path: it returns the full (wrapper-recovered, de-duplicated) Array in one pass,
      # which also avoids the reactive-recovery double-yield the block path would hit.
      unless input.respond_to?(:read)
        docs = process(input, options)
        warn_extra_documents(options) if docs.length > 1
        return docs.first
      end
      first = nil
      count = 0
      catch(:smarter_json_first_document) do
        process(input, options) do |doc|
          count += 1
          first = doc if count == 1
          throw(:smarter_json_first_document) if count > 1
        end
      end
      warn_extra_documents(options) if count > 1
      first
    end
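The catch/throw early exit used for IO input works with any source that yields documents one at a time. The sketch below shows the pattern on a plain Enumerator; sketch_first, its on_warning: callable, and the warning text are hypothetical, and the real method's warning routing (Rails.logger, Kernel#warn) is reduced here to a callable or Kernel#warn.

```ruby
# Hypothetical helper: return the first item and stop as soon as a second appears.
def sketch_first(source, on_warning: nil)
  first = nil
  count = 0
  catch(:first_found) do
    source.each do |doc|
      count += 1
      first = doc if count == 1
      throw(:first_found) if count > 1   # peek at one extra item, then stop
    end
  end
  if count > 1
    message = "input holds more than one document; returning the first"
    on_warning ? on_warning.call(message) : warn(message)
  end
  first
end

# Hypothetical producer that records how many items were actually generated.
def counting_stream(items, produced)
  Enumerator.new do |yielder|
    items.each do |item|
      produced << item
      yielder << item
    end
  end
end
```

With a four-item stream only the first two items are ever produced, which is the bounded-memory property the paragraph above describes.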