Module: AgentSandbox::BrowserTools

Defined in:
lib/agent_sandbox/browser_tools.rb

Overview

RubyLLM tool adapters for Vercel’s `agent-browser` CLI running inside a sandbox. The sandbox image must have agent-browser and a Chromium-compatible browser installed (see docker/browser.Dockerfile).

sandbox = AgentSandbox.new(backend: :docker, image: "agent-sandbox-browser",
                           hardened: false, memory: "2g")
chat = RubyLLM.chat(model: "gpt-4o-mini")
chat.with_tools(*AgentSandbox.browser_tools(sandbox))

Pass `vision_model:` to override the model used by `screenshot` and `read_image` (those tools make a second LLM call to extract text from the image). The default is ENV["AGENT_SANDBOX_VISION_MODEL"] if set, otherwise "gpt-5".
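
The fallback order can be sketched as a standalone helper. It mirrors the `||` chain in the `build` source listed under Class Method Details; the name `resolve_vision_model` and its `env` parameter are hypothetical, introduced only for illustration, and are not part of AgentSandbox.

```ruby
# Hypothetical helper (not part of AgentSandbox). Illustrates the precedence:
# explicit argument, then environment variable, then the built-in default.
def resolve_vision_model(explicit = nil, env = ENV)
  explicit || env["AGENT_SANDBOX_VISION_MODEL"] || "gpt-5"
end

resolve_vision_model("gpt-4o", {})                                      # explicit argument wins
resolve_vision_model(nil, { "AGENT_SANDBOX_VISION_MODEL" => "my-vlm" }) # env var is used next
resolve_vision_model(nil, {})                                           # falls back to "gpt-5"
```

Note that an empty string is truthy in Ruby, so an environment variable set to "" would be used as-is rather than falling through to the default.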

The `agent-browser` daemon persists browser state (tabs, cookies) across invocations, so each tool call reuses the same Chrome session.

Defined Under Namespace

Modules: VisionSupport
Classes: Back, Base, Click, Eval, Fill, GetText, Open, ReadImage, Reload, Screenshot, Snapshot, Wait

Class Method Details

.build(sandbox, vision_model: nil) ⇒ Object



# File 'lib/agent_sandbox/browser_tools.rb', line 24

def self.build(sandbox, vision_model: nil)
  vm = vision_model || ENV["AGENT_SANDBOX_VISION_MODEL"] || "gpt-5"
  [
    Open.new(sandbox),
    Snapshot.new(sandbox),
    Click.new(sandbox),
    Fill.new(sandbox),
    GetText.new(sandbox),
    Wait.new(sandbox),
    Back.new(sandbox),
    Reload.new(sandbox),
    Screenshot.new(sandbox, vision_model: vm),
    Eval.new(sandbox),
    ReadImage.new(sandbox, vision_model: vm)
  ]
end
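
Because `build` returns a plain Array, a caller can filter it before passing it to `with_tools`. The sketch below is one way to drop the two vision-backed tools, for example when no vision model is available. The stand-in classes are hypothetical and exist only so the snippet runs on its own; in real use the array would come from `AgentSandbox::BrowserTools.build(sandbox)` and the classes would be `Screenshot` and `ReadImage`.

```ruby
# Stand-ins for three of the tool classes (hypothetical, for illustration only).
open_tool       = Class.new
screenshot_tool = Class.new
read_image_tool = Class.new

# Stands in for the array returned by BrowserTools.build(sandbox).
tools = [open_tool.new, screenshot_tool.new, read_image_tool.new]

# Keep only the tools that do not need the second (vision) LLM call.
vision_classes = [screenshot_tool, read_image_tool]
text_only = tools.reject { |tool| vision_classes.include?(tool.class) }
```

The filtered array can then be splatted into `with_tools` in the same way as the full list in the Overview example.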