Class: Rubino::Compression::LogCompressor
- Inherits: Object
  - Object
  - Rubino::Compression::LogCompressor
- Defined in: lib/rubino/compression/log_compressor.rb
Overview
Deterministic, ML-free compression of COMMAND OUTPUT (test runs, linters, build logs, long shell dumps) for the model-facing channel. Unlike the Ruby SKELETON path (whole-file reads), this is the high-ROI channel: the agent reads command output WHOLE, and the signal (failures plus the summary tally) is a tiny fraction of the bytes. We keep every error/failure and the final tally VERBATIM and drop the passing-test / progress noise.
The FIDELITY INVARIANT is the whole contract: a line that names an ERROR / FAIL / Failure, and the final `N examples, M failures` summary, MUST survive compression. We may drop a passing dot, an INFO line, or a green `describe` header, but NEVER a failure descriptor. The measurement (eval/) and the spec both assert this directly.
Pure regex + counting, no AST, no gem. Small outputs (< min_lines) pass through unchanged — the marker indirection isn’t worth it.
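The fidelity invariant can be expressed as a small check. The sketch below is illustrative only: `MUST_KEEP`, `dropped_protected_lines`, and the `[N lines omitted]` marker text are hypothetical stand-ins, not the class's actual patterns or marker format.

```ruby
# Hypothetical, simplified stand-in for the protected lines (failure keywords
# plus the rspec tally). The class's real patterns are its own constants.
MUST_KEEP = /\b(?:error|fail|failed|failure|failures)\b|\b\d+\s+examples?\b/i

# Returns the protected lines of `original` that are missing from `compressed`.
# An empty array means the invariant held for this pair.
def dropped_protected_lines(original, compressed)
  kept = compressed.lines.map(&:chomp)
  original.lines.map(&:chomp).select do |line|
    line.match?(MUST_KEEP) && !kept.include?(line)
  end
end

original = <<~LOG
  ..........
  INFO booting suite
  Failure/Error: expect(x).to eq(1)
  12 examples, 1 failure
LOG

# Assumed marker text; the real marker format is not shown in this page.
good = <<~LOG
  [2 lines omitted]
  Failure/Error: expect(x).to eq(1)
  12 examples, 1 failure
LOG

bad = <<~LOG
  [3 lines omitted]
  12 examples, 1 failure
LOG

dropped_protected_lines(original, good) # => []
dropped_protected_lines(original, bad)  # => ["Failure/Error: expect(x).to eq(1)"]
```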
Defined Under Namespace
Constant Summary
- ERROR_RE =
  Severity lexicon (word-boundary anchored so `error_count` doesn’t trip on a passing-context line and `passed` doesn’t read as `pass`).
  /\b(?:error|errors|fail|failed|failure|failures|fatal|exception|panic|assert(?:ion)?)\b/i
- WARN_RE =
  /\b(?:warn|warning|warnings|deprecat\w+|pending|skipped|todo)\b/i
- INFO_RE =
  /\b(?:info|debug|pass|passed|passing|ok|success|done|examples?)\b/i
- FAILURE_SHAPE_RE =
  In a STRUCTURED test runner the per-test progress section names tests (“handles an error”, “fails fast”) whose keywords are NOT failures; the real failures live in dedicated report SHAPES. These match those shapes so the 8k-line green progress section can’t masquerade as 364 failures.
    rspec:   `Failure/Error:`, ` N) <desc>`, `rspec ./path:NN` rerun list
    pytest:  `FAILED path::test`, `E <assert>`, `> assert ...`
    jest:    `✕ test`, `● Component › test`
    cargo:   `test name ... FAILED`, `---- name stdout ----`
    rubocop: `path:line:col: C: Offense`
  %r{
    \A\s*\d+\)\s                        # rspec/cargo numbered failure
    | \bFailure/Error:                  # rspec body anchor
    | \A\s*rspec\s+['"]?\.?/?\S+:\d+    # rspec rerun line
    | \A\s*FAILED\b                     # pytest / generic
    | \A\s*E\s{2,}\S                    # pytest assertion line
    | \A\s*[✕✗✘×]\s                     # jest/mocha fail mark
    | \A\s*[●•]\s.*›                    # jest failure header (› )
    | \.{3}\s*FAILED\s*\z               # cargo `test x ... FAILED`
    | \A----\s.*\bstdout\b              # cargo failure capture header
    | \A\S+:\d+:\d+:\s+[A-Z]:\s         # rubocop offense (path:l:c: C:)
    | \A\s*(?:error|panic)\[            # rust/compiler `error[E…]`
  }x
- STACK_RE =
  Stack-trace / backtrace frame: rspec `# ./spec/…:NN`, a bare `from path:line:in`, a `path:line:in` Ruby frame, or a `at File.fn` / pytest `File "x", line N`. Indented continuation lines of a trace.
  %r{
    \A\s*(?:\#\s+)?(?:from\s+)?[^\s:]+\.\w+:\d+(?::in\b)?   # path:line[:in]
    | \A\s*at\s+\S+                                         # JS/Java at frame
    | \A\s*File\s+"[^"]+",\s+line\s+\d+                     # pytest frame
  }x
- SUMMARY_RE =
  The final tally / framing lines that MUST survive: rspec `N examples, M failures`, `Finished in …`, the `Failures:` header, rubocop’s `NN files inspected, MM offenses`, pytest’s `=== N failed ===`.
  /
    \b\d+\s+examples?\b
    | \bFinished\sin\b
    | \A\s*Failures:\s*\z
    | \b\d+\s+files?\s+inspected\b
    | \b\d+\s+offenses?\b
    | ^={3,}.*\b(?:failed|passed|error)\b
    | \b\d+\s+(?:passed|failed|error)\b
  /xi
- STRUCTURED =
  STRUCTURED runners report failures in dedicated shapes; the green progress section’s keyword-bearing test names are NOT failures. Generic logs have no such structure, so keyword severity is all we have.
  %i[rspec pytest jest cargo].freeze
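To see why keyword severity alone is not enough for structured runners, here is a minimal sketch using the three keyword patterns as listed in this page. The `keyword_severity` helper, its error-over-warn-over-info precedence, and the three-alternative `SHAPE_SUBSET_RE` are assumptions for illustration; the class's own classifier is not shown here and may behave differently.

```ruby
ERROR_RE = /\b(?:error|errors|fail|failed|failure|failures|fatal|exception|panic|assert(?:ion)?)\b/i
WARN_RE  = /\b(?:warn|warning|warnings|deprecat\w+|pending|skipped|todo)\b/i
INFO_RE  = /\b(?:info|debug|pass|passed|passing|ok|success|done|examples?)\b/i

# Only three of the FAILURE_SHAPE_RE alternatives, copied for the demo.
SHAPE_SUBSET_RE = %r{
  \A\s*\d+\)\s          # rspec/cargo numbered failure
  | \bFailure/Error:    # rspec body anchor
  | \A\s*FAILED\b       # pytest / generic
}x

# Hypothetical helper. Assumed precedence: error, then warn, then info.
def keyword_severity(line)
  return :error if line.match?(ERROR_RE)
  return :warn  if line.match?(WARN_RE)
  return :info  if line.match?(INFO_RE)
  :other
end

keyword_severity("error_count=0 after retry")    # => :other (no word boundary inside error_count)
keyword_severity("FATAL: connection refused")    # => :error
keyword_severity("DEPRECATION WARNING: old API") # => :warn
keyword_severity("all checks passed")            # => :info

# A green test NAME trips the keyword lexicon but not the failure shapes.
keyword_severity("  handles an error")              # => :error
"  handles an error".match?(SHAPE_SUBSET_RE)        # => false
"  1) Widget handles an error".match?(SHAPE_SUBSET_RE) # => true
```

The last three lines show the case the FAILURE_SHAPE_RE description warns about: a passing test named "handles an error" looks like an error by keyword, and only the report shape distinguishes it from a real failure entry.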
Instance Method Summary
- #compress(text) ⇒ Object
  Returns a CompressionResult.
- #initialize(config) ⇒ LogCompressor (constructor)
  A new instance of LogCompressor.
Constructor Details
#initialize(config) ⇒ LogCompressor
Returns a new instance of LogCompressor.
# File 'lib/rubino/compression/log_compressor.rb', line 99
def initialize(config)
  @cfg = config.is_a?(Config) ? config : Config.from(config)
end
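The constructor accepts either a ready-made Config or something Config.from can convert. A minimal sketch of that coercion idiom follows; `DemoConfig`, `DemoCompressor`, the hash input, and the default of 40 lines are all hypothetical, since only `Config.from` and `min_lines` appear in the source.

```ruby
# Hypothetical config object; the real Config class is not shown in this page.
DemoConfig = Struct.new(:min_lines, keyword_init: true) do
  # Assumed behaviour: build a config from a plain hash, with an invented default.
  def self.from(hash)
    new(min_lines: hash.fetch(:min_lines, 40))
  end
end

class DemoCompressor
  attr_reader :cfg

  # Same shape as the constructor above: pass a config through, coerce anything else.
  def initialize(config)
    @cfg = config.is_a?(DemoConfig) ? config : DemoConfig.from(config)
  end
end

DemoCompressor.new({ min_lines: 10 }).cfg.min_lines              # => 10
DemoCompressor.new(DemoConfig.new(min_lines: 5)).cfg.min_lines   # => 5
```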
Instance Method Details
#compress(text) ⇒ Object
Returns a CompressionResult. applied? == false means “send the original”.
# File 'lib/rubino/compression/log_compressor.rb', line 104
def compress(text)
  original_bytes = text.bytesize
  raw = text.split("\n", -1)
  # split("\n", -1) leaves a trailing "" for a final newline; drop it so
  # the line count and the rebuilt output match the input.
  raw.pop if raw.last == "" && text.end_with?("\n")
  return CompressionResult.noop(strategy: :too_small) if raw.length < @cfg.min_lines

  @format = detect_format(raw)
  lines = classify(raw)
  select!(lines)
  kept = lines.select(&:kept)
  return CompressionResult.noop(strategy: :insufficient_saving) if kept.length >= raw.length

  out = render(lines)
  build_result(out, original_bytes)
end
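The line-splitting step and the two early exits in #compress can be tried in isolation. In the sketch below, `split_lines` copies the two splitting statements from the method, while `gate` is a hypothetical helper that only mirrors the order of the two noop checks and returns a symbol instead of a CompressionResult.

```ruby
# Split on "\n" keeping trailing empty strings, then drop the single ""
# that a final newline produces.
def split_lines(text)
  raw = text.split("\n", -1)
  raw.pop if raw.last == "" && text.end_with?("\n")
  raw
end

split_lines("a\nb\n")  # => ["a", "b"]
split_lines("a\nb")    # => ["a", "b"]
split_lines("a\n\n")   # => ["a", ""]  (the blank last line is preserved)
"a\n\n".split("\n")    # => ["a"]      (without the -1 limit it would be lost)

# Hypothetical helper mirroring the two guards, in the same order.
def gate(raw_count, kept_count, min_lines)
  return :too_small if raw_count < min_lines
  return :insufficient_saving if kept_count >= raw_count
  :compress
end

gate(10, 3, 40)     # => :too_small
gate(100, 100, 40)  # => :insufficient_saving
gate(100, 12, 40)   # => :compress
```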