Class: Legate::Tools::ReadWebpage
- Inherits:
- Legate::Tool (hierarchy: Object > Legate::Tool > Legate::Tools::ReadWebpage)
- Includes:
- Base::HttpClient
- Defined in:
- lib/legate/tools/read_webpage_tool.rb
Overview
Fetches a web page and returns its readable text content with markup removed.
This is the backbone of research/RAG agents: give it a URL and it returns the page title and plain text (script/style stripped, entities decoded, whitespace collapsed), capped to a sane size. SSRF-safe via Base::SafeUrl.
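The cleanup steps named above (drop script/style, strip tags, decode entities, collapse whitespace, cap the length) can be sketched in plain Ruby. This is an illustrative sketch only, not the tool's actual implementation: `readable_text`, `ENTITY_MAP`, and the regex-based stripping are assumptions made for this example, and a real implementation may well use an HTML parser instead.

```ruby
# Hypothetical entity table for this sketch (not Legate's ENTITIES constant).
ENTITY_MAP = {
  '&amp;' => '&', '&lt;' => '<', '&gt;' => '>',
  '&quot;' => '"', '&#39;' => "'", '&nbsp;' => ' '
}.freeze

# Hypothetical helper: turn an HTML string into a title plus capped plain text.
def readable_text(html, max_chars: 20_000)
  # Capture the title before any tags are removed (empty string if absent).
  title = html[%r{<title[^>]*>(.*?)</title>}mi, 1].to_s

  # Drop script, style, and title elements together with their contents.
  body = html.gsub(%r{<(script|style|title)\b[^>]*>.*?</\1>}mi, ' ')
  # Replace every remaining tag with a space so adjacent words do not fuse.
  body = body.gsub(/<[^>]+>/, ' ')

  clean = lambda do |s|
    # Decode known entities in a single pass, then collapse whitespace runs.
    s.gsub(/&(?:amp|lt|gt|quot|nbsp|#39);/, ENTITY_MAP).gsub(/\s+/, ' ').strip
  end

  { title: clean.call(title), text: clean.call(body)[0, max_chars] }
end

page = '<html><head><title>Hi &amp; bye</title><style>p{color:red}</style></head>' \
       '<body><p>Hello, <b>world</b></p><script>alert(1)</script></body></html>'
result = readable_text(page)
puts result[:title]
puts result[:text]
```

Decoding entities in one `gsub` pass avoids double-decoding input such as `&amp;lt;`, which should come out as the literal text `&lt;` rather than `<`.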
Constant Summary
- DEFAULT_MAX_CHARS =
20_000
- HARD_MAX_CHARS =
200_000
- ENTITIES =
{ '&amp;' => '&', '&lt;' => '<', '&gt;' => '>', '&quot;' => '"', '&#39;' => "'", '&apos;' => "'", '&nbsp;' => ' ' }.freeze
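This page does not show how the two size constants are applied. One plausible reading is that DEFAULT_MAX_CHARS is used when the caller gives no limit and HARD_MAX_CHARS is a ceiling that no request can exceed. The sketch below illustrates that reading; `effective_max_chars` is a hypothetical helper, not part of Legate.

```ruby
DEFAULT_MAX_CHARS = 20_000
HARD_MAX_CHARS = 200_000

# Hypothetical helper: resolve a caller-supplied limit to a usable cap.
# Assumes nil means "use the default" and that limits below 1 are raised to 1.
def effective_max_chars(requested = nil)
  limit = requested.nil? ? DEFAULT_MAX_CHARS : Integer(requested)
  limit.clamp(1, HARD_MAX_CHARS)
end

puts effective_max_chars            # default applies
puts effective_max_chars(1_000_000) # clamped to the hard ceiling
```

`Integer(...)` is used rather than `to_i` so that a non-numeric value raises instead of silently becoming 0.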
Instance Attribute Summary
Attributes included from Base::HttpClient
Attributes inherited from Legate::Tool
#description, #name, #parameters
Instance Method Summary
- #initialize(**options) ⇒ ReadWebpage (constructor)
A new instance of ReadWebpage.
Methods included from Base::HttpClient
#http_delete, #http_get, #http_head, #http_post, #http_put
Methods inherited from Legate::Tool
define_metadata, #execute, inherited, #validate_and_coerce_params, #validate_params
Methods included from Legate::Tool::MetadataDsl
Constructor Details
#initialize(**options) ⇒ ReadWebpage
Returns a new instance of ReadWebpage.
# File 'lib/legate/tools/read_webpage_tool.rb', line 32
def initialize(**)
  super(**)
  setup_http_client(base_url: 'https://placeholder.invalid')
end