Class: Clacky::Tools::Browser
Overview
Browser tool — controls the user’s real Chromium-based browser (Chrome 146+) via the Chrome DevTools MCP server (chrome-devtools-mcp).
Architecture: uses the existing-session driver (Chrome MCP).
    chrome-devtools-mcp --autoConnect --experimentalStructuredContent --experimental-page-id-routing
Communication: MCP stdio JSON-RPC 2.0 over a persistent (daemon) process. The MCP server process is started once, kept alive across all tool calls, and only restarted when the process dies unexpectedly.
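The "start once, reuse, restart only if dead" behaviour can be sketched roughly as below. This is an illustration under stated assumptions, not Clacky's implementation: the `StdioDaemon` class and its method names are hypothetical, the MCP initialize handshake and the timeouts are omitted, and the usage section talks to a stand-in child process that simply echoes each line back rather than to chrome-devtools-mcp.

```ruby
require "json"
require "open3"
require "rbconfig"

# Hypothetical wrapper (not Clacky's class): keeps one child process alive
# across calls and spawns a new one only if the previous one has exited.
class StdioDaemon
  def initialize(command)
    @command = command # argv array, e.g. ["chrome-devtools-mcp", "--autoConnect"]
    @next_id = 0
  end

  # Writes one JSON-RPC 2.0 request as a single line, then reads one line back.
  def request(method, params = {})
    ensure_running
    @next_id += 1
    payload = { jsonrpc: "2.0", id: @next_id, method: method, params: params }
    @stdin.puts(JSON.generate(payload))
    @stdin.flush
    line = @stdout.gets
    raise IOError, "daemon closed its output" if line.nil?
    JSON.parse(line)
  end

  def pid
    @wait_thr&.pid
  end

  def stop
    @stdin&.close
    @stdout&.close
    @wait_thr&.join
  end

  private

  # Spawn only when there is no live child process.
  def ensure_running
    return if @wait_thr&.alive?
    @stdin, @stdout, @wait_thr = Open3.popen2(*@command)
  end
end

# Usage with a stand-in child that echoes each request line (not a real MCP server).
echo_child = [RbConfig.ruby, "-e", "STDOUT.sync = true; STDIN.each_line { |l| puts l }"]
daemon = StdioDaemon.new(echo_child)
first = daemon.request("tools/list")
pid_after_first = daemon.pid
second = daemon.request("tools/list")
pid_after_second = daemon.pid
daemon.stop
```

Both requests go to the same child process, so the startup cost is paid once rather than on every tool call.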
pageId is intentionally NOT passed to most MCP calls — the MCP server maintains its own selected page state. Only focus/close actions need pageId. When the selected page has been closed, mcp_call automatically retries once.
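The retry-once behaviour described above can be sketched as follows. This is a minimal illustration, not the actual mcp_call: `PageClosedError` and `with_page_retry` are hypothetical names, and how the real code detects a closed page is not shown in this documentation.

```ruby
# Hypothetical error raised when the MCP server's selected page no longer exists.
class PageClosedError < StandardError; end

# Runs the block and retries exactly once if the selected page was closed.
# A second failure, or any other error, propagates to the caller.
def with_page_retry
  retried = false
  begin
    yield
  rescue PageClosedError
    raise if retried
    retried = true
    retry
  end
end

# Usage: the first attempt fails, the single retry succeeds.
attempts = 0
result = with_page_retry do
  attempts += 1
  raise PageClosedError, "selected page closed" if attempts == 1
  "ok after #{attempts} attempts"
end
```

Limiting the retry to one attempt keeps a permanently broken session from looping.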
Constant Summary
- MIN_CHROME_MAJOR =
  146
- MCP_HANDSHAKE_TIMEOUT =
  10
- MCP_CALL_TIMEOUT =
  60
- MIN_NODE_MAJOR =
  20
- MAX_SNAPSHOT_CHARS =
  4000
- MAX_LLM_OUTPUT_CHARS =
  6000
- BROWSER_CONFIG_PATH =
  File.("~/.clacky/browser.yml").freeze
- BROWSER_DIAGNOSIS_HINT =
  <<~HINT.strip.freeze
    Inform the user and ask if they'd like to run a diagnosis.
    If yes, invoke the browser-setup skill with subcommand "doctor".
  HINT
- BROWSER_NOT_CONNECTED_HINT =
  Cause 1+2: Chrome not running, or Remote Debugging disabled (MCP can’t distinguish them)
  <<~HINT.strip.freeze
    Chrome is not reachable. Possible causes:
    1. Chrome is not running — ask the user to open Chrome.
    2. Remote Debugging is disabled — enable via chrome://inspect/#remote-debugging.
  HINT
- BROWSER_DAEMON_HINT =
  Cause 3: MCP daemon crashed or failed to start
  <<~HINT.strip.freeze
    The browser MCP daemon crashed or failed to start.
    It may recover automatically on the next action.
    If it keeps failing, ask the user to restart Clacky.
  HINT
- BROWSER_RESTART_HINT =
  Cause 4: Chrome long-session unresponsiveness
  <<~HINT.strip.freeze
    Chrome has become unresponsive.
    This often happens after Chrome has been running for a long time.
    Ask the user to restart Chrome, then retry the action.
  HINT
- SCREENSHOT_MAX_WIDTH =
  800
- SCREENSHOT_MAX_BASE64_BYTES =
  150_000
- INTERACTIVE_ROLES =
  Snapshot rendering
  %w[
    button link textbox checkbox radio select combobox menuitem option tab
    switch searchbox spinbutton slider menuitemcheckbox menuitemradio
  ].freeze
- STRUCTURAL_ROLES =
  %w[generic none presentation group region section].freeze
- CONTENT_ROLES =
  %w[heading paragraph text statictext image img listitem term definition].freeze
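The three role lists suggest that snapshot rendering sorts accessibility nodes into categories. How compress_snapshot actually uses them is not shown here, so the following is only a sketch under assumptions: `role_category` and `drop_structural` are hypothetical helpers, and downcasing the role before lookup is a guess based on entries such as `statictext`.

```ruby
INTERACTIVE_ROLES = %w[
  button link textbox checkbox radio select combobox menuitem option tab
  switch searchbox spinbutton slider menuitemcheckbox menuitemradio
].freeze
STRUCTURAL_ROLES = %w[generic none presentation group region section].freeze
CONTENT_ROLES = %w[heading paragraph text statictext image img listitem term definition].freeze

# Hypothetical helper: map an accessibility role to a coarse category.
def role_category(role)
  name = role.to_s.downcase # assumption: roles are matched case-insensitively
  return :interactive if INTERACTIVE_ROLES.include?(name)
  return :structural if STRUCTURAL_ROLES.include?(name)
  return :content if CONTENT_ROLES.include?(name)
  :other
end

# Hypothetical filter: drop purely structural wrappers from [role, label] pairs.
def drop_structural(nodes)
  nodes.reject { |role, _label| role_category(role) == :structural }
end

nodes = [["generic", ""], ["button", "Submit"], ["heading", "Sign in"], ["banner", ""]]
kept = drop_structural(nodes)
```

In this sketch the `generic` wrapper is dropped, while the button, the heading, and the unclassified `banner` node are kept.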
Instance Method Summary
- #execute(action:, profile: nil, working_dir: nil, **opts) ⇒ Object
- #format_call(args) ⇒ Object
- #format_result(result) ⇒ Object
- #format_result_for_llm(result) ⇒ Object
Methods inherited from Base
#category, #description, #name, #parameters, #to_function_definition
Instance Method Details
#execute(action:, profile: nil, working_dir: nil, **opts) ⇒ Object
    # File 'lib/clacky/tools/browser.rb', line 80

    def execute(action:, profile: nil, working_dir: nil, **opts)
      bypass = action.to_s == "status" ||
               (action.to_s == "act" && (opts[:kind] || opts["kind"]).to_s == "evaluate")
      unless bypass
        return browser_not_setup_error unless File.exist?(BROWSER_CONFIG_PATH)
        return browser_disabled_error unless browser_enabled?
      end
      execute_user_browser(action, opts)
    rescue StandardError => e
      { error: classify_browser_error(e) }
    end
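To make the bypass rule easier to see in isolation, the condition from the listing above can be pulled into a standalone predicate. The helper name `bypass_setup_check?` is hypothetical and does not exist in the class; the logic is copied from `#execute`.

```ruby
# Hypothetical helper: true when the setup/enabled checks are skipped.
# Only "status", and "act" with kind "evaluate", bypass the checks.
def bypass_setup_check?(action, opts = {})
  kind = (opts[:kind] || opts["kind"]).to_s
  action.to_s == "status" || (action.to_s == "act" && kind == "evaluate")
end

status_bypass   = bypass_setup_check?(:status)
evaluate_bypass = bypass_setup_check?("act", { "kind" => "evaluate" })
click_bypass    = bypass_setup_check?("act", { kind: "click" })
snapshot_bypass = bypass_setup_check?("snapshot")
```

Here the first two calls return `true` and the last two return `false`, so every other action requires the config file at BROWSER_CONFIG_PATH to exist and the browser to be enabled. The `kind` option is accepted under either a symbol or a string key.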
#format_call(args) ⇒ Object
    # File 'lib/clacky/tools/browser.rb', line 92

    def format_call(args)
      action = args[:action] || args["action"] || "browser"
      "browser(#{action})"
    end
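A short usage sketch, with the method copied out of the class so it runs standalone (in the library it is an instance method of Clacky::Tools::Browser):

```ruby
# Standalone copy of the method above, for illustration only.
def format_call(args)
  action = args[:action] || args["action"] || "browser"
  "browser(#{action})"
end

with_symbol_key = format_call({ action: "snapshot" })  # => "browser(snapshot)"
with_string_key = format_call({ "action" => "open" })  # => "browser(open)"
without_action  = format_call({})                      # => "browser(browser)"
```

When no action is given, the fallback label is the literal string "browser".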
#format_result(result) ⇒ Object
    # File 'lib/clacky/tools/browser.rb', line 97

    def format_result(result)
      return "[Error] #{result[:error].to_s[0..80]}" if result[:error]
      return "[OK] #{result[:output].to_s.lines.size} lines" if result[:output]
      "[OK] Done"
    end
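The three possible summaries can be seen with a standalone copy of the method. The sample inputs are made up for illustration.

```ruby
# Standalone copy of the method above, for illustration only.
def format_result(result)
  return "[Error] #{result[:error].to_s[0..80]}" if result[:error]
  return "[OK] #{result[:output].to_s.lines.size} lines" if result[:output]
  "[OK] Done"
end

error_summary  = format_result({ error: "Chrome is not reachable." })  # => "[Error] Chrome is not reachable."
output_summary = format_result({ output: "a\nb\nc" })                  # => "[OK] 3 lines"
empty_summary  = format_result({})                                     # => "[OK] Done"
long_error     = format_result({ error: "x" * 200 })
```

The range `0..80` is inclusive, so a long error message is cut to 81 characters after the `[Error] ` prefix.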
#format_result_for_llm(result) ⇒ Object
    # File 'lib/clacky/tools/browser.rb', line 103

    def format_result_for_llm(result)
      return result if result[:error]

      action = result[:action].to_s

      if action == "screenshot" && result[:image_data]
        mime_type  = result[:mime_type] || "image/png"
        image_data = result[:image_data]
        data_url   = "data:#{mime_type};base64,#{image_data}"

        original_path   = result[:original_path]
        compressed_path = result[:compressed_path]

        text = "Screenshot captured."
        if original_path || compressed_path
          text += "\n- Original (full resolution): #{original_path || 'unavailable'}" \
                  "\n- Compressed (800px, sent to AI): #{compressed_path || 'unavailable'}"
        end

        return [
          { type: "text", text: text },
          { type: "image_url", image_url: { url: data_url } }
        ]
      end

      output = result[:output].to_s
      output = compress_snapshot(output) if action == "snapshot"
      max_chars = action == "snapshot" ? MAX_SNAPSHOT_CHARS : MAX_LLM_OUTPUT_CHARS

      {
        action: action,
        success: result[:success],
        stdout: truncate_output(output, max_chars),
        profile: result[:profile]
      }.compact
    end
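The two output shapes above (an image content array for screenshots, a size-capped hash for everything else) can be sketched in isolation. The private helper `truncate_output` is not shown in this documentation, so the version below is a hypothetical stand-in whose truncation marker is invented; `screenshot_content` is also a hypothetical name.

```ruby
MAX_SNAPSHOT_CHARS = 4000
MAX_LLM_OUTPUT_CHARS = 6000

# Hypothetical stand-in for the private truncate_output helper.
def truncate_output(text, max_chars)
  return text if text.length <= max_chars
  "#{text[0, max_chars]}\n[truncated #{text.length - max_chars} chars]"
end

# Hypothetical helper: build the text + image content pair from raw image bytes.
def screenshot_content(raw_bytes, mime_type: "image/png")
  encoded = [raw_bytes].pack("m0") # base64 without line breaks
  [
    { type: "text", text: "Screenshot captured." },
    { type: "image_url", image_url: { url: "data:#{mime_type};base64,#{encoded}" } }
  ]
end

short_text = truncate_output("hello", MAX_LLM_OUTPUT_CHARS)
long_text  = truncate_output("a" * 5000, MAX_SNAPSHOT_CHARS)
content    = screenshot_content("hi")
image_url  = content[1][:image_url][:url] # => "data:image/png;base64,aGk="
```

Snapshots get the tighter 4000-character cap because they are the largest outputs in typical use; other actions are allowed up to 6000 characters.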