Class: Rubino::Compression::Compressor

Inherits:
Object
Defined in:
lib/rubino/compression/compressor.rb

Overview

Entry point that routes a piece of tool-read content to a compression STRATEGY by content type and returns a CompressionResult. Phase 1 handles exactly one type: Ruby source (`:code`) via the Prism skeletoner. Every other content type is a deterministic no-op (`applied? == false`), so the caller sends the original.

Conservative GUARDS (all must hold to attempt a skeleton):

- full_file:  only a WHOLE-file read is compressible — a targeted
              offset/limit read is a drill-in and stays VERBATIM.
- content_type == :code AND the source parses as Ruby.
- total lines >= min_lines (small files aren't worth the indirection).
- the skeleton actually saves >= MIN_SAVING_RATIO of the bytes; below
  that the pointers cost more than they save, so we send the original.

On any guard miss (or a Prism parse failure) we return a no-op result whose `strategy` records the reason, never a misleading or partial skeleton.

Constant Summary

MIN_SAVING_RATIO =

Below this fractional byte saving the skeleton isn’t worth the drill-in round-trips it forces; send the original instead.

0.25
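
The threshold check can be illustrated with a little arithmetic. This is a minimal sketch, not the library's code: the `SavingSketch` module and its `saving_ratio` and `worth_compressing?` helpers are hypothetical names, and the `>=` comparison is taken from the guard list in the Overview.

```ruby
# Hypothetical stand-in module; only the 0.25 value comes from the docs.
module SavingSketch
  MIN_SAVING_RATIO = 0.25

  # Fraction of the original bytes that the skeleton saves.
  def self.saving_ratio(original_bytes, skeleton_bytes)
    return 0.0 if original_bytes.zero?

    (original_bytes - skeleton_bytes).to_f / original_bytes
  end

  # True when the saving meets the documented ">= MIN_SAVING_RATIO" guard.
  def self.worth_compressing?(original_bytes, skeleton_bytes)
    saving_ratio(original_bytes, skeleton_bytes) >= MIN_SAVING_RATIO
  end
end

SavingSketch.worth_compressing?(4000, 3000) # saves 25% of the bytes
SavingSketch.worth_compressing?(4000, 3200) # saves 20% of the bytes
```

Under these assumptions a 4000-byte file whose skeleton is 3000 bytes just clears the threshold, while a 3200-byte skeleton does not and the original would be sent.
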
STRATEGIES =

Per-language skeleton strategies, keyed by language symbol, with one LineSkeleton subclass per language: Ruby (Prism, built in), Python (shells out to the python3 `ast` stdlib module; no-op when python3 is absent), and JS/TS/TSX (the optional `tree_sitter_language_pack` gem’s `process` API; no-op when the gem or grammar is absent). The rest of the pipeline is language-agnostic.

{
  ruby: RubyCodeSkeleton,
  python: PythonCodeSkeleton,
  javascript: JavascriptCodeSkeleton,
  typescript: TypescriptCodeSkeleton,
  tsx: TsxCodeSkeleton
}.freeze
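
A frozen hash keyed by symbol makes the "unregistered language is a no-op" behaviour a plain lookup miss. The sketch below assumes this is how the table is consulted; the `StrategySketch` module, the empty placeholder classes, and `strategy_for` are hypothetical and stand in for the real LineSkeleton subclasses.

```ruby
# Hypothetical stand-in; the real strategies are LineSkeleton subclasses.
module StrategySketch
  RubyCodeSkeleton = Class.new
  PythonCodeSkeleton = Class.new

  STRATEGIES = {
    ruby: RubyCodeSkeleton,
    python: PythonCodeSkeleton
  }.freeze

  # Returns the strategy class, or nil when the language is not registered,
  # in which case a caller would fall back to a no-op result.
  def self.strategy_for(language)
    STRATEGIES[language]
  end
end

StrategySketch.strategy_for(:ruby) # the Ruby strategy class
StrategySketch.strategy_for(:go)   # nil, so the content passes through
```
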

Constructor Details

#initialize(min_lines:, keep_method_body_max_lines:) ⇒ Compressor

Returns a new instance of Compressor.



# File 'lib/rubino/compression/compressor.rb', line 44

def initialize(min_lines:, keep_method_body_max_lines:)
  @min_lines = min_lines.to_i
  @keep_method_body_max_lines = keep_method_body_max_lines.to_i
  @elided_ranges = []
end

Instance Attribute Details

#elided_ranges ⇒ Object (readonly)

The exact (1-based first line, line count) ranges the skeleton elided. Carried OUT of #compress via an attr so the caller can record them for drill-in detection without threading another return value.



# File 'lib/rubino/compression/compressor.rb', line 42

def elided_ranges
  @elided_ranges
end
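
One way a caller might use these ranges for drill-in detection is an interval-overlap test against a later targeted read. This is a sketch under assumptions: it supposes each range is a `[first_line, line_count]` pair and that the read's offset is a 1-based first line with a line-count limit. The `DrillInSketch` module and `overlaps_elided?` are hypothetical names, not part of the library.

```ruby
# Hypothetical helper; the pair shape [first_line, line_count] is assumed.
module DrillInSketch
  # True when the read of `limit` lines starting at 1-based line `offset`
  # touches any elided range.
  def self.overlaps_elided?(elided, offset, limit)
    read_last = offset + limit - 1
    elided.any? do |first, count|
      last = first + count - 1
      offset <= last && first <= read_last
    end
  end
end

elided = [[10, 5], [40, 12]] # lines 10..14 and 40..51 were elided

DrillInSketch.overlaps_elided?(elided, 12, 3)  # reads 12..14, inside an elided range
DrillInSketch.overlaps_elided?(elided, 15, 10) # reads 15..24, touches nothing elided
```
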

Instance Method Details

#compress(content, source_path:, content_type:, full_file:, language: :ruby) ⇒ Object

- content:       the raw file text (NOT line-numbered).
- source_path:   display path embedded in pointer lines (`read <path> …`).
- content_type:  :code is the only compressible type in Phase 1.
- full_file:     true only for a whole-file read (no offset/limit).
- language:      which per-language skeleton strategy to use (default :ruby,
                 so existing direct callers are unaffected); an
                 unregistered language is a no-op passthrough.


# File 'lib/rubino/compression/compressor.rb', line 57

def compress(content, source_path:, content_type:, full_file:, language: :ruby)
  original_bytes = content.bytesize

  return CompressionResult.noop(strategy: :not_full_file) unless full_file
  return CompressionResult.noop(strategy: :not_code) unless content_type == :code

  line_count = content.count("\n") + (content.end_with?("\n") ? 0 : 1)
  return CompressionResult.noop(strategy: :too_small) if line_count < @min_lines

  skeletonise(content, source_path, original_bytes, language)
end
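
The line-count expression in the method above counts newline characters and adds one when the text does not end with a newline, so a trailing newline does not create a phantom extra line. The sketch below lifts that expression into a standalone helper; the `LineCountSketch` module and `line_count` are hypothetical names used only for illustration.

```ruby
# Hypothetical wrapper around the expression used in #compress.
module LineCountSketch
  def self.line_count(content)
    content.count("\n") + (content.end_with?("\n") ? 0 : 1)
  end
end

LineCountSketch.line_count("a\nb\n") # two lines, trailing newline
LineCountSketch.line_count("a\nb")   # two lines, no trailing newline
LineCountSketch.line_count("")       # counted as one line
```

Note that an empty string is counted as one line by this expression, which only matters when `min_lines` is set to 1 or lower.
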