Class: Rubino::Tools::ReadAttachmentTool

Inherits:
Base
  • Object
Defined in:
lib/rubino/tools/read_attachment_tool.rb

Overview

Gated, on-demand attachment reader (#6). Instead of inlining every attachment’s bytes into the prompt by default, the model calls this tool only when it actually needs a document’s content. This is the single biggest reduction in prompt-injection surface from the attachment work.

Pipeline (reuses the audited primitives; invents nothing new):

1. Attachments::Classify.call (fail-closed: lstat -> size cap ->
   magic-bytes-wins MIME), then workspace confinement via
   Base#within_workspace? (all allowed roots, symlinks resolved) and a
   Policy.allow_kind? check. Only a safe, policy-allowed document/text
   proceeds.
2. Documents.to_markdown -- in-process conversion (pdf/docx/xlsx/pptx/
   html/csv/json/xml/plain). Returns nil when no in-process converter can
   handle the format (e.g. the optional gem isn't installed).
3. On nil: return the existing actionable shell-extraction hint
   (Preamble.document_shell_hint) -- NEVER raise, so a missing optional
   gem can't break a turn.
4. Oversized Markdown is routed through the existing map-reduce
   `summarize` aux (SummarizeFileTool) rather than dumped into context.
5. Inline-sized Markdown is wrapped in Preamble's nonce-framed untrusted
   envelope (converted document = untrusted user data).
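
The five steps above can be sketched as a single early-return chain. This is a minimal illustration, not Rubino's code: `Classification`, `classify`, `to_markdown`, `INLINE_LIMIT`, and the returned strings are all hypothetical stand-ins for the real primitives, and the real size threshold and envelope format are not shown in this page.

```ruby
# Hypothetical stand-ins; none of these are Rubino's real classes or limits.
Classification = Struct.new(:path, :safe, :reason, keyword_init: true)

INLINE_LIMIT = 40 # assumed threshold in characters, for illustration only

# Stand-in for step 1: fail closed on anything that looks unsafe.
def classify(path)
  if path.include?("..")
    return Classification.new(path: path, safe: false, reason: "outside workspace")
  end

  Classification.new(path: path, safe: true, reason: nil)
end

# Stand-in for step 2: nil means "no in-process converter available".
def to_markdown(path)
  return nil unless path.end_with?(".txt", ".md")

  path.include?("big") ? "lorem ipsum " * 10 : "# Hello"
end

def read_attachment_sketch(path)
  cls = classify(path)
  return "Error: cannot read #{path}: #{cls.reason}" unless cls.safe # step 1

  markdown = to_markdown(cls.path)                                   # step 2
  return "Hint: extract #{path} via the shell" if markdown.nil?      # step 3
  if markdown.length > INLINE_LIMIT                                  # step 4
    return "Summary of #{markdown.length} chars"
  end

  "<untrusted>\n#{markdown}\n</untrusted>"                           # step 5
end
```

Every branch returns a string rather than raising, which mirrors the "NEVER raise" rule in step 3.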

Instance Attribute Summary

Attributes inherited from Base

#cancel_token, #read_tracker, #stream_chunk

Instance Method Summary

Methods inherited from Base

#cancellation_requested?, #emit_chunk, #risky?, #to_tool_definition, workspace_root, workspace_roots

Instance Attribute Details

#summarizer=(value) ⇒ Object

Test seam: inject a stub summarizer (a SummarizeFileTool-like object responding to #call). Production lazily builds the real tool.



# File 'lib/rubino/tools/read_attachment_tool.rb', line 74

def summarizer=(value)
  @summarizer = value
end
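
A test might use this seam roughly as follows. The sketch uses a hypothetical `SeamSketch` class rather than the real tool, and the argument shape passed to the summarizer (`"content" => ...`) is an assumption; the source only states that the injected object must respond to #call.

```ruby
# Hypothetical stand-in showing the seam; not the real ReadAttachmentTool.
class SeamSketch
  attr_writer :summarizer

  def summarize(text)
    # The "content" key is an assumed argument shape, for illustration only.
    summarizer.call("content" => text)
  end

  private

  # Lazily build a default only when nothing was injected.
  def summarizer
    @summarizer ||= ->(args) { "real summary of #{args["content"].length} chars" }
  end
end

tool = SeamSketch.new
tool.summarizer = ->(_args) { "stub summary" } # injected stub wins over the lazy default
tool.summarize("long text")
```

Because the default is built with `||=`, a stub assigned before the first call means the default is never constructed.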

Instance Method Details

#call(arguments) ⇒ Object



# File 'lib/rubino/tools/read_attachment_tool.rb', line 76

def call(arguments)
  file_path = (arguments["file_path"] || arguments[:file_path]).to_s
  return "Error: file_path is required" if file_path.empty?

  # Classify runs the fail-closed safety pipeline (lstat rejects symlink/
  # FIFO/device, size cap, magic-bytes-wins MIME). We then confine to the
  # workspace via Base#within_workspace?, which checks ALL allowed roots
  # (primary + every --add-dir) and resolves symlinks -- a single
  # confine_dir can't express the multi-root sandbox the agent uses.
  cls = Attachments::Classify.call(file_path)
  unless cls.safe
    return "Error: cannot read #{file_path}: #{cls.reason}. " \
           "Attachments must be regular files inside the workspace, under the size cap."
  end
  return workspace_violation_message(file_path) unless within_workspace?(cls.path)
  unless Attachments::Policy.allow_kind?(cls.kind)
    return "Error: #{file_path} is a #{cls.kind} (#{cls.mime}); read_attachment only " \
           "reads documents and text. Inspect other kinds via the shell."
  end

  markdown = Rubino::Documents.to_markdown(cls.path, mime: cls.mime)
  # No in-process converter (unknown format / optional gem absent): degrade
  # with the actionable shell-extraction hint, exactly like the preamble.
  # NEVER raise -- a missing gem must not break the turn.
  return Attachments::Preamble.document_shell_hint(cls) if markdown.nil?

  force = truthy?(arguments["summarize"] || arguments[:summarize])
  focus = (arguments["focus"] || arguments[:focus]).to_s

  if force || oversized?(markdown)
    summarize(cls, markdown, focus)
  else
    frame(cls, markdown)
  end
rescue Rubino::Interrupted
  raise
rescue StandardError => e
  # Total failure still degrades gracefully -- the model gets the
  # shell-hint and the turn survives.
  Rubino.logger&.warn(event: "read_attachment.failed", path: file_path, error: e.class.to_s)
  begin
    Attachments::Preamble.document_shell_hint(
      Attachments::Classification.new(path: file_path, kind: :document,
                                      mime: nil, size_bytes: nil, safe: true, reason: nil)
    )
  rescue StandardError
    "Error: could not read #{file_path}: #{e.class}."
  end
end
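
Two patterns in #call are worth isolating: the string-or-symbol argument lookup and the rescue ordering that re-raises cancellation while degrading every other failure to a message. The sketch below is a simplified illustration; `Interrupted` is a stand-in for Rubino::Interrupted (assumed here to descend from StandardError), and `fetch_arg` and `guarded` are hypothetical helper names.

```ruby
# Stand-in for Rubino::Interrupted; assumed to be a StandardError subclass.
class Interrupted < StandardError; end

# Accept string or symbol keys; nil becomes "" so a missing key is easy to detect.
def fetch_arg(arguments, key)
  (arguments[key.to_s] || arguments[key.to_sym]).to_s
end

# Re-raise cancellation; turn any other StandardError into a message.
def guarded(path)
  yield
rescue Interrupted
  raise
rescue StandardError => e
  "Error: could not read #{path}: #{e.class}."
end

fetch_arg({ file_path: "a.pdf" }, :file_path)  # symbol key is found
fetch_arg({}, :file_path).empty?               # missing key yields ""
guarded("a.pdf") { raise ArgumentError, "boom" } # returns an error string
```

The `rescue Interrupted` clause must come first: if the class does descend from StandardError, the broader clause would otherwise swallow the cancellation.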

#config_key ⇒ Object



# File 'lib/rubino/tools/read_attachment_tool.rb', line 31

def config_key
  "read_attachment"
end

#description ⇒ Object



# File 'lib/rubino/tools/read_attachment_tool.rb', line 35

def description
  "Read an attached document on demand, converting it to Markdown IN-PROCESS " \
    "(PDF, DOCX, XLSX, PPTX, HTML, CSV, JSON, XML, plain/code) and returning the " \
    "text framed as untrusted user data. Prefer this over shelling out to " \
    "`markitdown`/`pdftotext`. Pass the path the attachment was staged at. Large " \
    "documents are automatically summarized via a separate model instead of " \
    "flooding this conversation. If the format has no in-process converter, you " \
    "get an actionable shell-extraction hint instead."
end

#input_schema ⇒ Object



# File 'lib/rubino/tools/read_attachment_tool.rb', line 45

def input_schema
  {
    type: "object",
    properties: {
      file_path: {
        type: "string",
        description: "Path to the attachment to read (absolute or workspace-relative)."
      },
      summarize: {
        type: "boolean",
        description: "Force routing through the summarization model even if the " \
                     "document fits inline. Optional; oversized documents are " \
                     "summarized automatically regardless."
      },
      focus: {
        type: "string",
        description: "When summarizing, what the summary must preserve. Optional."
      }
    },
    required: %w[file_path]
  }
end
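
An arguments hash that satisfies this schema might look like the one below. The path and focus text are made-up examples, and `REQUIRED_KEYS` and `missing_required` are hypothetical helpers that only mirror the schema's `required` list; the page does not show how Rubino itself validates arguments.

```ruby
# Mirrors `required: %w[file_path]` from the schema; helper names are hypothetical.
REQUIRED_KEYS = %w[file_path].freeze

# Return the required keys that are absent, accepting string or symbol keys.
def missing_required(arguments)
  REQUIRED_KEYS.reject { |k| arguments.key?(k) || arguments.key?(k.to_sym) }
end

arguments = {
  "file_path" => "docs/report.pdf",           # hypothetical staged path
  "summarize" => true,                        # optional: force summarization
  "focus"     => "keep all dates and amounts" # optional: what the summary must preserve
}

missing_required(arguments) # empty when file_path is present
```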

#name ⇒ Object



# File 'lib/rubino/tools/read_attachment_tool.rb', line 27

def name
  "read_attachment"
end

#risk_level ⇒ Object



# File 'lib/rubino/tools/read_attachment_tool.rb', line 68

def risk_level
  :low
end