Class: Clacky::Tools::Browser
Overview
Browser tool — controls the user’s real Chromium-based browser (Chrome 146+) via the Chrome DevTools MCP server (chrome-devtools-mcp).
Architecture: uses the existing-session driver (Chrome MCP).
chrome-devtools-mcp --autoConnect --experimentalStructuredContent --experimental-page-id-routing
Communication: MCP stdio JSON-RPC 2.0 over a persistent (daemon) process. The MCP server process is started once, kept alive across all tool calls, and only restarted when the process dies unexpectedly.
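As a rough illustration of the framing described above, the following sketch builds a JSON-RPC 2.0 request for an MCP tool call and decodes a response line. MCP's stdio transport exchanges newline-delimited JSON messages, and tool invocations use the `tools/call` method. The helper names (`build_tool_call`, `encode_message`, `decode_response`) and the `take_snapshot` tool name are assumptions made for this example and are not taken from `browser.rb`.

```ruby
require "json"

# Build a JSON-RPC 2.0 request for an MCP tool call.
# `tool_name` and `arguments` are illustrative; the real tool names are
# whatever the MCP server advertises.
def build_tool_call(id, tool_name, arguments = {})
  {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: tool_name, arguments: arguments }
  }
end

# The stdio transport carries one JSON message per line.
def encode_message(message)
  JSON.generate(message) + "\n"
end

# Parse one response line; return the result or raise on a JSON-RPC error.
def decode_response(line)
  response = JSON.parse(line)
  raise "MCP error: #{response['error']['message']}" if response["error"]

  response["result"]
end

# Note that no pageId is included in the arguments here.
request_line = encode_message(build_tool_call(1, "take_snapshot"))
```

In a persistent setup, `request_line` would be written to the long-lived server process's stdin and the matching response read back from its stdout.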
pageId is intentionally NOT passed to most MCP calls — the MCP server maintains its own selected page state. Only focus/close actions need pageId. When the selected page has been closed, mcp_call automatically retries once.
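The retry-once behaviour can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `mcp_call` implementation: `SelectedPageClosedError`, `call_with_page_retry`, and the `reselect` callable are hypothetical names introduced for this example.

```ruby
# Hypothetical error raised when the server's selected page no longer exists.
class SelectedPageClosedError < StandardError; end

# Run the block; if the selected page was closed, call `reselect` and
# retry exactly once. A second failure is re-raised to the caller.
def call_with_page_retry(reselect)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue SelectedPageClosedError
    raise if attempts > 1

    reselect.call
    retry
  end
end
```

Capping the retry at one attempt keeps a permanently broken session from looping, while still hiding the common case of a tab that was closed between two actions.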
Constant Summary
- MIN_CHROME_MAJOR =
146
- MCP_HANDSHAKE_TIMEOUT =
10
- MCP_CALL_TIMEOUT =
60
- MIN_NODE_MAJOR =
20
- MAX_SNAPSHOT_CHARS =
4000
- MAX_LLM_OUTPUT_CHARS =
6000
- BROWSER_CONFIG_PATH =
File.expand_path("~/.clacky/browser.yml").freeze
- BROWSER_DIAGNOSIS_HINT =
<<~HINT.strip.freeze
  Inform the user and ask if they'd like to run a diagnosis.
  If yes, invoke the browser-setup skill with subcommand "doctor".
HINT
- BROWSER_NOT_CONNECTED_HINT =
Cause 1+2: Chrome not running, or Remote Debugging disabled (MCP can’t distinguish them)
<<~HINT.strip.freeze
  Chrome is not reachable. Possible causes:
  1. Chrome is not running — ask the user to open Chrome.
  2. Remote Debugging is disabled — enable via chrome://inspect/#remote-debugging.
HINT
- BROWSER_DAEMON_HINT =
Cause 3: MCP daemon crashed or failed to start
<<~HINT.strip.freeze
  The browser MCP daemon crashed or failed to start.
  It may recover automatically on the next action.
  If it keeps failing, ask the user to restart Clacky.
HINT
- BROWSER_RESTART_HINT =
Cause 4: Chrome long-session unresponsiveness
<<~HINT.strip.freeze
  Chrome has become unresponsive.
  This often happens after Chrome has been running for a long time.
  Ask the user to restart Chrome, then retry the action.
HINT
- SCREENSHOT_MAX_WIDTH =
800
- SCREENSHOT_MAX_BASE64_BYTES =
150_000
- INTERACTIVE_ROLES =
Snapshot rendering
%w[ button link textbox checkbox radio select combobox menuitem option tab switch searchbox spinbutton slider menuitemcheckbox menuitemradio ].freeze
- STRUCTURAL_ROLES =
%w[ generic none presentation group region section ].freeze
- CONTENT_ROLES =
%w[ heading paragraph text statictext image img listitem term definition ].freeze
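The page does not show how these role lists are consumed during snapshot rendering. One plausible use is to bucket each accessibility role so that interactive and content nodes are kept while structural wrappers are collapsed. The sketch below only illustrates that bucketing; `classify_role` is a hypothetical helper and the downcasing step is an assumption.

```ruby
INTERACTIVE_ROLES = %w[
  button link textbox checkbox radio select combobox menuitem option tab
  switch searchbox spinbutton slider menuitemcheckbox menuitemradio
].freeze
STRUCTURAL_ROLES = %w[generic none presentation group region section].freeze
CONTENT_ROLES = %w[heading paragraph text statictext image img listitem term definition].freeze

# Hypothetical helper: map an accessibility role to one of four buckets.
# Roles are downcased first so that "StaticText" matches "statictext".
def classify_role(role)
  normalized = role.to_s.downcase
  return :interactive if INTERACTIVE_ROLES.include?(normalized)
  return :structural if STRUCTURAL_ROLES.include?(normalized)
  return :content if CONTENT_ROLES.include?(normalized)

  :other
end

classify_role("button")     # => :interactive
classify_role("StaticText") # => :content
classify_role("generic")    # => :structural
```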
Instance Method Summary
- #execute(action:, profile: nil, working_dir: nil, **opts) ⇒ Object
- #format_call(args) ⇒ Object
- #format_result(result) ⇒ Object
- #format_result_for_llm(result) ⇒ Object
Methods inherited from Base
#category, #description, #name, #parameters, #to_function_definition
Instance Method Details
#execute(action:, profile: nil, working_dir: nil, **opts) ⇒ Object
# File 'lib/clacky/tools/browser.rb', line 128

def execute(action:, profile: nil, working_dir: nil, **opts)
  bypass = action.to_s == "status" ||
           (action.to_s == "act" && (opts[:kind] || opts["kind"]).to_s == "evaluate")
  unless bypass
    return browser_not_setup_error unless File.exist?(BROWSER_CONFIG_PATH)
    return browser_disabled_error unless browser_enabled?
  end
  execute_user_browser(action, opts)
rescue StandardError => e
  { error: classify_browser_error(e) }
end
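The bypass rule in #execute means that `status` calls, and `act` calls whose kind is `evaluate`, skip the config-file and enabled checks. The condition is extracted here into a standalone predicate so it can be tried in isolation. `bypass_setup_checks?` is a hypothetical name for this sketch and is not a method of the class.

```ruby
# Mirrors the `bypass` expression in #execute: true when the setup and
# enabled checks would be skipped for this action.
def bypass_setup_checks?(action, opts = {})
  kind = (opts[:kind] || opts["kind"]).to_s
  action.to_s == "status" || (action.to_s == "act" && kind == "evaluate")
end

bypass_setup_checks?(:status)                        # => true
bypass_setup_checks?("act", kind: "evaluate")        # => true
bypass_setup_checks?("act", { "kind" => "evaluate" }) # => true
bypass_setup_checks?("act", kind: "click")           # => false
bypass_setup_checks?("snapshot")                     # => false
```

Both symbol and string keys are accepted for `kind`, matching the lookup in the original method.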
#format_call(args) ⇒ Object
# File 'lib/clacky/tools/browser.rb', line 140

def format_call(args)
  action = args[:action] || args["action"] || "browser"
  "browser(#{action})"
end
#format_result(result) ⇒ Object
# File 'lib/clacky/tools/browser.rb', line 145

def format_result(result)
  return "[Error] #{result[:error].to_s[0..80]}" if result[:error]
  return "[OK] #{result[:output].to_s.lines.size} lines" if result[:output]
  "[OK] Done"
end
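To show what each branch of #format_result produces, the sketch below copies the method body into a standalone script and calls it with three sample hashes. The sample inputs are made up for illustration.

```ruby
# Standalone copy of #format_result for experimentation.
def format_result(result)
  return "[Error] #{result[:error].to_s[0..80]}" if result[:error]
  return "[OK] #{result[:output].to_s.lines.size} lines" if result[:output]

  "[OK] Done"
end

format_result({ error: "Chrome is not reachable" }) # => "[Error] Chrome is not reachable"
format_result({ output: "a\nb\n" })                 # => "[OK] 2 lines"
format_result({})                                   # => "[OK] Done"
```

The inclusive range `[0..80]` keeps up to 81 characters of the error message.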
#format_result_for_llm(result) ⇒ Object
# File 'lib/clacky/tools/browser.rb', line 151

def format_result_for_llm(result)
  return result if result[:error]

  action = result[:action].to_s

  if action == "screenshot" && result[:image_data]
    mime_type = result[:mime_type] || "image/png"
    image_data = result[:image_data]
    data_url = "data:#{mime_type};base64,#{image_data}"

    original_path = result[:original_path]
    compressed_path = result[:compressed_path]
    text = "Screenshot captured."
    if original_path || compressed_path
      text += "\n- Original (full resolution): #{original_path || 'unavailable'}" \
              "\n- Compressed (800px, sent to AI): #{compressed_path || 'unavailable'}"
    end

    return [
      { type: "text", text: text },
      { type: "image_url", image_url: { url: data_url } }
    ]
  end

  output = result[:output].to_s
  output = compress_snapshot(output) if action == "snapshot"
  max_chars = action == "snapshot" ? MAX_SNAPSHOT_CHARS : MAX_LLM_OUTPUT_CHARS

  {
    action: action,
    success: result[:success],
    stdout: truncate_output(output, max_chars),
    profile: result[:profile]
  }.compact
end
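The helper `truncate_output` is called above but its body is not shown on this page. The following sketch shows one way such a helper could cap output at the per-action limit. The truncation marker format and the `limit_for` helper are assumptions made for this example and may differ from the real implementation.

```ruby
MAX_SNAPSHOT_CHARS = 4000
MAX_LLM_OUTPUT_CHARS = 6000

# Same limit selection as in #format_result_for_llm.
def limit_for(action)
  action.to_s == "snapshot" ? MAX_SNAPSHOT_CHARS : MAX_LLM_OUTPUT_CHARS
end

# Hypothetical truncation: keep the first `max_chars` characters and
# append a note stating how many characters were dropped.
def truncate_output(text, max_chars)
  return text if text.length <= max_chars

  omitted = text.length - max_chars
  "#{text[0, max_chars]}\n... [truncated #{omitted} chars]"
end

short = truncate_output("hello", limit_for("navigate"))     # unchanged
long  = truncate_output("a" * 5000, limit_for("snapshot"))  # capped at 4000 chars plus marker
```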