Class: Rubino::Tools::ReadTool

Inherits:
Base
  • Object
Defined in:
lib/rubino/tools/read_tool.rb

Overview

Reads a file with `cat -n` style line numbers, offset/limit windowing, and a hard cap on per-line length. Line numbers let the LLM cite or edit exact lines instead of “the second occurrence of X”; offset/limit let it page through files that would otherwise overflow the context window.
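The windowing described above can be illustrated with a minimal sketch. `numbered_window` is a hypothetical helper written for this example, not the tool's actual `render` method, and the six-column number format and plain cut-off truncation are assumptions.

```ruby
# Hypothetical helper, not part of Rubino: numbers lines cat -n style,
# applies a 1-based offset and a line limit, and truncates long lines.
def numbered_window(text, offset: 1, limit: 2000, max_width: 2000)
  lines  = text.lines(chomp: true)
  start  = [offset, 1].max - 1          # 1-based offset -> 0-based index
  window = lines[start, limit] || []    # nil when the offset is past EOF
  window.each_with_index.map do |line, i|
    shown = line.length > max_width ? line[0, max_width] : line
    format("%6d\t%s", start + i + 1, shown)
  end.join("\n")
end

sample = "alpha\nbeta\ngamma\ndelta\n"
puts numbered_window(sample, offset: 2, limit: 2)
```

Because the numbers come from the file rather than from the window, a follow-up read with a different offset still refers to the same line numbers.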

Constant Summary

DEFAULT_LIMIT =
2000
MAX_LINE_WIDTH =
2000
MAX_OUTPUT_BYTES =

Hard cap on the bytes a single read returns (~25k tokens at 4 bytes/token, matching Claude Code’s read gate). A window of 2000 lines × 2000 chars could otherwise build about 4 MB in memory and inflate prefill time and time to first token (TTFT); past this cap the read stops and tells the model to narrow the range or use grep.

100_000
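The arithmetic behind the cap can be checked with a short sketch. `cap_output` is a hypothetical helper, and charging one extra byte per line for the newline is an assumption; the actual tool may count bytes differently.

```ruby
MAX_OUTPUT_BYTES = 100_000 # value taken from the constant documented above

# Hypothetical helper: keep lines until the next one would exceed the budget.
# Returns the kept lines and a flag saying whether anything was dropped.
def cap_output(lines, budget: MAX_OUTPUT_BYTES)
  kept = []
  used = 0
  lines.each do |line|
    cost = line.bytesize + 1 # assumption: +1 for the newline that joins lines
    break if used + cost > budget

    kept << line
    used += cost
  end
  [kept, kept.length < lines.length]
end

# Worst-case window allowed by DEFAULT_LIMIT and MAX_LINE_WIDTH.
worst_case = Array.new(2000) { "x" * 2000 }
kept, truncated = cap_output(worst_case)
puts "uncapped: #{worst_case.sum(&:bytesize)} bytes"
puts "kept #{kept.length} lines, truncated=#{truncated}"
```

Under these assumptions the uncapped window is 4,000,000 bytes, and the cap keeps only the first 49 full-width lines.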

Instance Attribute Summary

Attributes inherited from Base

#cancel_token, #read_tracker, #stream_chunk, #stream_kind

Instance Method Summary

Methods inherited from Base

#cancellation_requested?, #config_key, #display_name, #emit_chunk, #mcp?, #risky?, #to_tool_definition, workspace_root, workspace_roots

Instance Method Details

#call(arguments) ⇒ Object



# File 'lib/rubino/tools/read_tool.rb', line 68

def call(arguments)
  file_path  = arguments["file_path"] || arguments[:file_path]
  raw_offset = arguments["offset"] || arguments[:offset]
  raw_limit  = arguments["limit"]  || arguments[:limit]
  offset     = (raw_offset || 1).to_i
  limit      = (raw_limit  || DEFAULT_LIMIT).to_i
  # A WHOLE-file read (no offset AND no limit supplied) is exploration and
  # the ONLY thing compression touches. A read carrying EITHER is a
  # targeted window — the drill-in path — which always returns verbatim.
  full_file = raw_offset.nil? && raw_limit.nil?

  return "Error: file_path is required" if file_path.nil? || file_path.to_s.empty?

  expanded = expand_workspace_path(file_path)
  # Secret-file READ block, ported 1:1 from Hermes' get_read_block_error:
  # the project-local .env family anywhere on disk, plus the agent-home
  # credential stores, are blocked-with-message (no content). Checked
  # BEFORE existence so we don't leak whether the secret file is present.
  # Defense-in-depth, not a boundary — the shell can still `cat .env`,
  # where the value is REDACTED (Security::Redactor).
  if (block = Security::SecretPath.read_block_error(expanded))
    return { output: block, error_code: :secret_read_blocked }
  end
  return "Error: File not found: #{file_path}" unless File.exist?(expanded)
  return "Error: Not a regular file: #{file_path}" unless File.file?(expanded)

  if binary?(expanded)
    size = File.size(expanded)
    return { output: "Error: #{file_path} appears to be a binary file (#{size} bytes). " \
                     "Reading it as text would corrupt the buffer. " \
                     "Use the shell tool with xxd/file/strings for inspection.",
             error_code: :binary_file }
  end

  offset = 1 if offset < 1
  limit  = DEFAULT_LIMIT if limit <= 0

  # Stash mtime + content hash BEFORE rendering so a slow render on a huge
  # file doesn't race with a concurrent writer — we want the state the
  # model "saw", not the one at end-of-render. The hash is the single
  # source of truth the edit-gate and dedup both consult.
  mtime  = File.mtime(expanded)
  digest = Digest::SHA256.hexdigest(File.binread(expanded))
  @read_tracker&.register(expanded, mtime, digest)

  # Re-reading the exact same window of UNCHANGED bytes just re-injects
  # content already in context. Skip the work with a nudge — but only when
  # the file still hashes the same, the TTL holds, and no edit-failure
  # recovery is pending (those serve fresh content). See ReadTracker.
  if @read_tracker&.duplicate_read?(expanded, offset, limit, digest)
    return { output: "[DUPLICATE READ] Exact repeat of an earlier read of #{file_path} " \
                     "(lines #{offset}-#{offset + limit - 1}) — reuse that result " \
                     "instead of re-reading.",
             metrics: "duplicate" }
  end

  # A TARGETED read of a file we previously skeletonised that lands inside
  # an elided range is a DRILL-IN: the model needed a body the skeleton
  # hid. Log it (the "did the skeleton hide what was needed" signal) — the
  # verbatim windowed bytes are then served unchanged below.
  if !full_file && @read_tracker&.drill_in?(expanded, offset, limit)
    Rubino.logger&.info(event: "compression.drill_in", path: file_path,
                        offset: offset, limit: limit)
  end

  render(expanded, file_path, offset, limit, full_file: full_file)
rescue StandardError => e
  "Error reading #{file_path}: #{e.message}"
end
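The duplicate-read check in `#call` keys on the path, the window, and the content hash. The sketch below shows that idea in isolation. `SketchTracker` is a hypothetical stand-in written for this example; the real `ReadTracker` is not shown on this page and, per the comments above, also honors a TTL and pending edit-failure recovery, which are left out here.

```ruby
require "digest"
require "set"
require "tempfile"

# Hypothetical stand-in for ReadTracker (assumption, not the real class).
class SketchTracker
  def initialize
    @seen = Set.new
  end

  # True when this exact window of these exact bytes was served before;
  # otherwise records the window and returns false.
  def duplicate_read?(path, offset, limit, digest)
    key = [path, offset, limit, digest]
    return true if @seen.include?(key)

    @seen << key
    false
  end
end

tracker = SketchTracker.new
Tempfile.create("read_sketch") do |f|
  f.write("one\ntwo\n")
  f.flush
  digest = Digest::SHA256.hexdigest(File.binread(f.path))
  first  = tracker.duplicate_read?(f.path, 1, 2000, digest)
  second = tracker.duplicate_read?(f.path, 1, 2000, digest)

  f.write("three\n") # changing the bytes changes the hash
  f.flush
  changed = Digest::SHA256.hexdigest(File.binread(f.path))
  third = tracker.duplicate_read?(f.path, 1, 2000, changed)
  puts [first, second, third].inspect # prints [false, true, false]
end
```

Hashing the content rather than relying on mtime alone means a rewrite that leaves the bytes identical still counts as a duplicate, while any real change serves fresh content.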

#compress_param ⇒ Object



# File 'lib/rubino/tools/read_tool.rb', line 53

def compress_param
  { type: "boolean",
    description: "Set false to skip compression and read the verbatim file (default true)." }
end

#compression_enabled? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/rubino/tools/read_tool.rb', line 58

def compression_enabled?
  Rubino.configuration.tool_output_compression_enabled?
rescue StandardError
  false
end

#compression_note ⇒ Object

Advertised only when the feature is on: a one-line note explaining that a whole-file Ruby read may be skeletonised, how to opt out, and that the full file is always retrievable.



# File 'lib/rubino/tools/read_tool.rb', line 45

def compression_note
  return "" unless compression_enabled?

  " A whole-file Ruby read may be returned as a SKELETON (signatures kept, " \
    "large bodies elided behind a pointer) to save tokens; the original is always " \
    "retrievable via the read pointer. Pass compress:false to force the verbatim file."
end

#description ⇒ Object



# File 'lib/rubino/tools/read_tool.rb', line 24

def description
  base = "Read a text file from the filesystem with line numbers (cat -n style). " \
         "Supports offset (1-based start line) and limit (max lines returned). " \
         "Long lines are truncated at #{MAX_LINE_WIDTH} chars. " \
         "Default window: first #{DEFAULT_LIMIT} lines."
  base + compression_note
end

#input_schema ⇒ Object



# File 'lib/rubino/tools/read_tool.rb', line 32

def input_schema
  props = {
    file_path: { type: "string", description: "Absolute or relative file path" },
    offset: { type: "integer", description: "1-based line to start at (default 1)" },
    limit: { type: "integer", description: "Max lines to return (default #{DEFAULT_LIMIT})" }
  }
  props[:compress] = compress_param if compression_enabled?
  { type: "object", properties: props, required: %w[file_path] }
end
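To show how payloads matching this schema are interpreted, the sketch below mirrors the argument handling at the top of `#call`. `normalize` is a hypothetical helper written for this example; its defaults and clamping follow the `#call` source on this page, but it is not a method of the class.

```ruby
DEFAULT_LIMIT = 2000 # same value as the documented constant

# Hypothetical helper mirroring #call: accepts string or symbol keys,
# applies defaults, clamps bad values, and flags whole-file reads.
def normalize(arguments)
  raw_offset = arguments["offset"] || arguments[:offset]
  raw_limit  = arguments["limit"]  || arguments[:limit]
  offset = (raw_offset || 1).to_i
  limit  = (raw_limit || DEFAULT_LIMIT).to_i
  offset = 1 if offset < 1
  limit  = DEFAULT_LIMIT if limit <= 0
  { offset: offset, limit: limit,
    full_file: raw_offset.nil? && raw_limit.nil? }
end

p normalize({ "file_path" => "lib/a.rb" })             # whole-file read
p normalize({ file_path: "lib/a.rb", offset: "40", limit: 0 })
```

Only the first payload counts as a whole-file read. Supplying either `offset` or `limit`, even with an out-of-range value that gets clamped, marks the read as a targeted window.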

#name ⇒ Object



# File 'lib/rubino/tools/read_tool.rb', line 20

def name
  "read"
end

#risk_level ⇒ Object



# File 'lib/rubino/tools/read_tool.rb', line 64

def risk_level
  :low
end