Class: Clacky::Tools::Browser

Inherits:
Base
  • Object
Defined in:
lib/clacky/tools/browser.rb

Overview

Browser tool — controls the user’s real Chromium-based browser (Chrome 146+) via the Chrome DevTools MCP server (chrome-devtools-mcp).

Architecture: uses the existing-session driver (Chrome MCP).

chrome-devtools-mcp --autoConnect --experimentalStructuredContent
    --experimental-page-id-routing

Communication: MCP stdio JSON-RPC 2.0 over a persistent (daemon) process. The MCP server process is started once, kept alive across all tool calls, and only restarted when the process dies unexpectedly.
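
As a rough illustration of the framing described above, the sketch below builds one newline-delimited JSON-RPC 2.0 request of the kind an MCP client writes to the server's stdin. The helper name build_mcp_request and the tool name "take_snapshot" are assumptions made for illustration and are not taken from browser.rb; "tools/call" is the method the MCP specification defines for invoking a tool.

```ruby
require "json"

# Hypothetical helper (not part of browser.rb): serialize one JSON-RPC 2.0
# request as a single line, the framing used by MCP's stdio transport.
def build_mcp_request(id, tool_name, arguments = {})
  payload = {
    jsonrpc: "2.0",
    id:      id,
    method:  "tools/call",
    params:  { name: tool_name, arguments: arguments }
  }
  # JSON.generate emits no embedded newlines, so one message == one line.
  JSON.generate(payload) + "\n"
end

# "take_snapshot" is an illustrative tool name, assumed here.
puts build_mcp_request(1, "take_snapshot")
```

A persistent client would keep the child process's stdin and stdout open and match each response to its request by id, rather than spawning a new process per call.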

pageId is intentionally NOT passed to most MCP calls — the MCP server maintains its own selected page state. Only focus/close actions need pageId. When the selected page has been closed, mcp_call automatically retries once.

Constant Summary

MIN_CHROME_MAJOR =
146
MCP_HANDSHAKE_TIMEOUT =
10
MCP_CALL_TIMEOUT =
60
MIN_NODE_MAJOR =
20
MAX_SNAPSHOT_CHARS =
8000
MAX_LLM_OUTPUT_CHARS =
6000
SNAPSHOT_QUERY_WINDOW =

Number of lines around a query hit.

60
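
A minimal sketch of what a "window around a query hit" could look like. The helper lines_around_hit is hypothetical and not taken from browser.rb; in particular, whether the real window of 60 counts lines on each side or in total is not stated on this page, so the sketch assumes "on each side".

```ruby
# Hypothetical helper: return the lines within `window` lines on either side
# of the first line that contains `query` (case-insensitive match).
def lines_around_hit(text, query, window)
  lines = text.lines
  hit = lines.index { |line| line.downcase.include?(query.downcase) }
  return "" if hit.nil?

  first = [hit - window, 0].max
  last  = [hit + window, lines.size - 1].min
  lines[first..last].join
end

sample = (1..10).map { |i| "row #{i}\n" }.join
puts lines_around_hit(sample, "row 5", 1)
```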
RETRYABLE_PAGE_ERRORS =

Errors that mean “the selected/active page is gone” — auto-retry once.

[
  "selected page has been closed",
  "No page found",
  "no active page",
  "Target closed",
  "page is detached"
].freeze
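
The retry-once behavior described for these errors might be sketched as follows. This is an assumption-laden illustration, not the actual mcp_call: the helper names retryable_page_error? and call_with_page_retry are hypothetical, and matching by case-sensitive substring is assumed.

```ruby
# Copied from the constant above so the sketch is self-contained.
RETRYABLE_PAGE_ERRORS = [
  "selected page has been closed",
  "No page found",
  "no active page",
  "Target closed",
  "page is detached"
].freeze

# Hypothetical: true when the message contains one of the known fragments.
def retryable_page_error?(message)
  RETRYABLE_PAGE_ERRORS.any? { |fragment| message.include?(fragment) }
end

# Hypothetical: run the block, retrying exactly once on a "page gone" error.
def call_with_page_retry
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError => e
    retry if attempts < 2 && retryable_page_error?(e.message)
    raise
  end
end
```

Any other error, or a second "page gone" failure, propagates to the caller unchanged.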
BROWSER_CONFIG_PATH =
File.expand_path("~/.clacky/browser.yml").freeze
BROWSER_DIAGNOSIS_HINT =
<<~HINT.strip.freeze
  Inform the user and ask if they'd like to run a diagnosis.
  If yes, invoke the browser-setup skill with subcommand "doctor".
HINT
BROWSER_NOT_CONNECTED_HINT =

Causes 1 and 2: Chrome is not running, or Remote Debugging is disabled (the MCP server cannot distinguish between them).

<<~HINT.strip.freeze
  Chrome is not reachable. Possible causes:
  1. Chrome is not running — ask the user to open Chrome.
  2. Remote Debugging is disabled — enable via chrome://inspect/#remote-debugging.
HINT
BROWSER_DAEMON_HINT =

Cause 3: MCP daemon crashed or failed to start

<<~HINT.strip.freeze
  The browser MCP daemon crashed or failed to start. It may recover automatically on the next action.
  If it keeps failing, ask the user to restart Clacky.
HINT
BROWSER_RESTART_HINT =

Cause 4: Chrome becomes unresponsive after a long-running session.

<<~HINT.strip.freeze
  Chrome has become unresponsive. This often happens after Chrome has been running for a long time.
  Ask the user to restart Chrome, then retry the action.
HINT
SCREENSHOT_MAX_WIDTH =
800
SCREENSHOT_MAX_BASE64_BYTES =
150_000
INTERACTIVE_ROLES =

Snapshot rendering


%w[
  button link textbox checkbox radio select combobox
  menuitem option tab switch searchbox spinbutton
  slider menuitemcheckbox menuitemradio
].freeze
STRUCTURAL_ROLES =
%w[
  generic none presentation group region section
].freeze
CONTENT_ROLES =
%w[
  heading paragraph text statictext image img
  listitem term definition
].freeze
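
One way the three role lists could be used together is a simple classifier. This page does not show how browser.rb consumes these constants during snapshot rendering, so the helper role_category and its return symbols below are purely illustrative assumptions.

```ruby
# Role lists copied from the constants above so the sketch is self-contained.
INTERACTIVE_ROLES = %w[
  button link textbox checkbox radio select combobox
  menuitem option tab switch searchbox spinbutton
  slider menuitemcheckbox menuitemradio
].freeze
STRUCTURAL_ROLES = %w[generic none presentation group region section].freeze
CONTENT_ROLES = %w[
  heading paragraph text statictext image img
  listitem term definition
].freeze

# Hypothetical helper: map an accessibility role to a coarse category.
def role_category(role)
  name = role.to_s.downcase
  return :interactive if INTERACTIVE_ROLES.include?(name)
  return :structural  if STRUCTURAL_ROLES.include?(name)
  return :content     if CONTENT_ROLES.include?(name)

  :other
end

puts role_category("button")
```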

Instance Method Summary

Methods inherited from Base

#category, #description, #name, #parameters, #to_function_definition

Instance Method Details

#execute(action:, profile: nil, working_dir: nil, **opts) ⇒ Object



# File 'lib/clacky/tools/browser.rb', line 94

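def execute(action:, profile: nil, working_dir: nil, **opts)
  # "status" and "act" with kind "evaluate" skip the setup checks below.
  bypass = action.to_s == "status" ||
           (action.to_s == "act" && (opts[:kind] || opts["kind"]).to_s == "evaluate")
  unless bypass
    # Return early if the config file is missing or the browser is disabled.
    return browser_not_setup_error unless File.exist?(BROWSER_CONFIG_PATH)
    return browser_disabled_error  unless browser_enabled?
  end
  execute_user_browser(action, opts)
rescue StandardError => e
  # Wrap any StandardError in an error hash rather than letting it propagate.
  { error: classify_browser_error(e) }
end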
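
The bypass condition in #execute can be read in isolation as a small predicate. The sketch below restates it as a standalone method; the name bypass_setup_checks? is hypothetical and does not appear in browser.rb.

```ruby
# Hypothetical standalone restatement of the bypass condition in #execute.
# Accepts the :kind option under either a symbol or a string key.
def bypass_setup_checks?(action, opts = {})
  kind = (opts[:kind] || opts["kind"]).to_s
  action.to_s == "status" || (action.to_s == "act" && kind == "evaluate")
end

puts bypass_setup_checks?("status")
puts bypass_setup_checks?("act", kind: "click")
```

Only these two cases skip the config-file and enabled checks; every other action goes through both.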

#format_call(args) ⇒ Object



# File 'lib/clacky/tools/browser.rb', line 106

def format_call(args)
  action = args[:action] || args["action"] || "browser"
  "browser(#{action})"
end

#format_result(result) ⇒ Object



# File 'lib/clacky/tools/browser.rb', line 111

def format_result(result)
  return "[Error] #{result[:error].to_s[0..80]}" if result[:error]
  return "[OK] #{result[:output].to_s.lines.size} lines" if result[:output]
  "[OK] Done"
end

#format_result_for_llm(result) ⇒ Object



# File 'lib/clacky/tools/browser.rb', line 117

def format_result_for_llm(result)
  return result if result[:error]

  action = result[:action].to_s

  if action == "screenshot" && result[:image_data]
    mime_type       = result[:mime_type] || "image/png"
    image_data      = result[:image_data]
    original_path   = result[:original_path]
    compressed_path = result[:compressed_path]

    text = "Screenshot captured."
    if original_path || compressed_path
      text += "\n- Original (full resolution): #{original_path || 'unavailable'}" \
              "\n- Compressed (800px, sent to AI): #{compressed_path || 'unavailable'}"
    end

    return {
      content_string: text,
      image_inject: {
        mime_type: mime_type,
        base64_data: image_data,
        path: compressed_path || original_path
      }
    }
  end

  output = result[:output].to_s
  output = compress_snapshot(output) if action == "snapshot"
  max_chars = action == "snapshot" ? MAX_SNAPSHOT_CHARS : MAX_LLM_OUTPUT_CHARS

  {
    action:  action,
    success: result[:success],
    stdout:  truncate_output(output, max_chars),
    profile: result[:profile]
  }.compact
end
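
The truncate_output helper called above is not shown on this page. As a rough sketch of what a character-budget truncation could look like, the hypothetical truncate_for_llm below cuts the text at max_chars and appends a note; the note's wording is invented for illustration and is not the format browser.rb uses.

```ruby
# Hypothetical stand-in for the unseen truncate_output helper: keep at most
# `max_chars` characters and append a note saying how much was dropped.
def truncate_for_llm(text, max_chars)
  return text if text.length <= max_chars

  omitted = text.length - max_chars
  "#{text[0, max_chars]}\n... [truncated #{omitted} chars]"
end

puts truncate_for_llm("abcdefgh", 3)
```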