Class: Rubino::Tools::WebFetchTool

Inherits:
Base
  • Object
Defined in:
lib/rubino/tools/webfetch_tool.rb

Overview

Tool for fetching web page content and converting to text/markdown.

Constant Summary

MAX_BODY_SIZE =
100_000
TIMEOUT =
30
READABILITY_MIN_RATIO =

Safety-fallback thresholds for readability extraction. If the main-content extraction yields suspiciously little text relative to the whole document (READABILITY_MIN_RATIO) or falls below an absolute floor (READABILITY_MIN_CHARS), we assume the heuristic over-trimmed and fall back to the full-page legacy strip, so we never hand back a near-empty page.

0.30
READABILITY_MIN_CHARS =
200
BOILERPLATE_TAGS =

Elements that are never main content. Dropped wholesale before extraction.

%w[
  script style noscript nav header footer aside form svg iframe button template
].freeze
BOILERPLATE_ROLES =

ARIA landmark roles that mark page chrome rather than content.

%w[navigation banner contentinfo search complementary].freeze
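
The thresholds and boilerplate lists above could be applied roughly as follows. This is a minimal sketch, not the tool's actual implementation: the module name `ReadabilitySketch` and the helpers `over_trimmed?` and `boilerplate?` are hypothetical, and measuring both sides of the ratio in characters of text is an assumption. Only the constant values are taken from this page.

```ruby
# Hypothetical sketch; only the constant values come from the documentation.
module ReadabilitySketch
  READABILITY_MIN_RATIO = 0.30
  READABILITY_MIN_CHARS = 200
  BOILERPLATE_TAGS = %w[
    script style noscript nav header footer aside form svg iframe button template
  ].freeze
  BOILERPLATE_ROLES = %w[navigation banner contentinfo search complementary].freeze

  module_function

  # True when the extracted text looks over-trimmed and the caller should
  # fall back to stripping the full page instead.
  # Assumption: both arguments are plain-text strings measured in characters.
  def over_trimmed?(extracted, full_text)
    return true if extracted.length < READABILITY_MIN_CHARS # absolute floor
    return false if full_text.empty?                        # avoid dividing by zero

    extracted.length.to_f / full_text.length < READABILITY_MIN_RATIO
  end

  # True when an element should be dropped before extraction, judged by its
  # tag name or its ARIA role attribute (role may be nil).
  def boilerplate?(tag_name, role = nil)
    BOILERPLATE_TAGS.include?(tag_name.to_s.downcase) ||
      BOILERPLATE_ROLES.include?(role.to_s.downcase)
  end
end
```

With these definitions, a 250-character extraction from a 1,000-character page passes the floor but fails the ratio check (0.25 < 0.30), so it would trigger the fallback.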

Instance Attribute Summary

Attributes inherited from Base

#cancel_token, #read_tracker, #stream_chunk, #stream_kind

Instance Method Summary

Methods inherited from Base

#cancellation_requested?, #display_name, #emit_chunk, #mcp?, #risky?, #to_tool_definition, workspace_root, workspace_roots

Instance Method Details

#call(arguments) ⇒ Object



# File 'lib/rubino/tools/webfetch_tool.rb', line 66

def call(arguments)
  url = arguments["url"] || arguments[:url]
  format = arguments["format"] || arguments[:format] || "text"

  fetch_url(url, format: format)
end
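
To illustrate the argument lookup in `#call`, here is a standalone sketch. `webfetch_arguments` is a hypothetical stand-in that copies only the two lookup lines from the source above; it performs no fetch and is not part of the class.

```ruby
# Hypothetical stand-in mirroring only the argument lookup in #call.
def webfetch_arguments(arguments)
  # String keys are checked first, then symbol keys.
  url = arguments["url"] || arguments[:url]
  # The format falls back to "text" when neither key is present.
  format = arguments["format"] || arguments[:format] || "text"

  { url: url, format: format }
end

webfetch_arguments({ "url" => "https://example.com" })
# returns {url: "https://example.com", format: "text"}

webfetch_arguments({ url: "https://example.com", format: "html" })
# returns {url: "https://example.com", format: "html"}
```

Note that `#call` as listed does not itself check that `url` is present; a missing URL is passed to `fetch_url` as `nil`.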

#config_key ⇒ Object

Gated by `tools.web` (shared with websearch), not `tools.webfetch`.



# File 'lib/rubino/tools/webfetch_tool.rb', line 35

def config_key
  "web"
end
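
As a rough illustration of what sharing a gate means, the sketch below checks a tool's `config_key` against a configuration hash. Everything here is an assumption: the nested `"tools"` hash shape, the boolean value, and the `tool_enabled?` helper are hypothetical and are not documented on this page.

```ruby
# Hypothetical gate check. The config shape ({"tools" => {"web" => true}})
# is an assumption made for illustration only.
def tool_enabled?(config, config_key)
  config.dig("tools", config_key) == true
end

config = { "tools" => { "web" => true } }

# Both webfetch and websearch would consult the same "web" entry.
tool_enabled?(config, "web")      # returns true
tool_enabled?(config, "webfetch") # returns false, since no such key is set
```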

#description ⇒ Object



# File 'lib/rubino/tools/webfetch_tool.rb', line 39

def description
  "Fetch content from a URL and return it as text. " \
    "Useful for reading documentation, API references, and web pages."
end

#input_schema ⇒ Object



# File 'lib/rubino/tools/webfetch_tool.rb', line 44

def input_schema
  {
    type: "object",
    properties: {
      url: {
        type: "string",
        description: "The URL to fetch content from"
      },
      format: {
        type: "string",
        enum: %w[text html],
        description: "Output format: 'text' (default, strips HTML) or 'html' (raw)"
      }
    },
    required: %w[url]
  }
end
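
A caller could check arguments against this schema before invoking the tool. The sketch below is a hand-rolled check covering only the features the schema uses (`required`, string types, and `enum`); the `SchemaSketch` module and `validation_errors` helper are hypothetical, and whether the tool or its host validates arguments this way is not stated on this page.

```ruby
# Hypothetical validator for the subset of JSON Schema used by #input_schema.
module SchemaSketch
  # Copy of the schema shown above, with descriptions omitted.
  SCHEMA = {
    type: "object",
    properties: {
      url: { type: "string" },
      format: { type: "string", enum: %w[text html] }
    },
    required: %w[url]
  }.freeze

  module_function

  # Returns an array of human-readable problems; empty when arguments are valid.
  def validation_errors(arguments, schema = SCHEMA)
    args = arguments.transform_keys(&:to_s) # accept string or symbol keys
    errors = []

    schema[:required].each do |key|
      errors << "missing required argument: #{key}" if args[key].nil?
    end

    schema[:properties].each do |key, spec|
      value = args[key.to_s]
      next if value.nil?

      errors << "#{key} must be a string" unless value.is_a?(String)
      allowed = spec[:enum]
      if allowed && !allowed.include?(value)
        errors << "#{key} must be one of: #{allowed.join(', ')}"
      end
    end

    errors
  end
end

SchemaSketch.validation_errors({ "url" => "https://example.com" })
# returns []
```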

#name ⇒ Object



# File 'lib/rubino/tools/webfetch_tool.rb', line 30

def name
  "webfetch"
end

#risk_level ⇒ Object



# File 'lib/rubino/tools/webfetch_tool.rb', line 62

def risk_level
  :low
end