browserctl logo

browserctl

CI Gem Version Downloads

A persistent browser automation daemon and CLI, purpose-built for AI agents and developer workflows.

Unlike tools that restart the browser on every script run, browserctl keeps a named browser session alive — preserving cookies, localStorage, open tabs, and page state across discrete commands.

browserd &                                           # start the daemon (headless)
browserctl open login --url https://example.com/login
browserctl snap login                                # AI-friendly JSON snapshot with ref IDs
browserctl fill login --ref e1 --value me@example.com   # interact by ref
browserctl click login --ref e2
browserctl shutdown

browserctl capturing a login flow

Login flow captured with browserctl shot


Why browserctl?

Most automation tools are stateless — every script spins up a fresh browser and tears it down. browserctl doesn't.

browserctl Playwright / Selenium
Session persists across commands ✗ (per-script lifecycle)
Named page handles
AI-friendly DOM snapshot
Lightweight CLI interface
Full browser automation API
Parallel multi-browser testing

Use browserctl when you need a browser that stays alive and remembers state — for AI agents, iterative dev workflows, or lightweight smoke tests.

Use Playwright/Selenium when you need parallel test suites, multi-browser support, or a full programmatic API.


Requirements

  • Ruby >= 3.2
  • Chrome or Chromium installed and on PATH

Installation

gem install browserctl

Or in your Gemfile:

gem "browserctl"

Quick Start

1. Start the daemon

browserd           # headless (default)
browserd --headed  # visible browser window

2. Open a named page

browserctl open login --url https://app.example.com/login

3. Snapshot the page to discover refs

browserctl snap login              # AI-friendly JSON with ref IDs (default)
browserctl snap login --format html

4. Interact using refs or selectors

browserctl fill  login --ref e1 --value user@example.com
browserctl fill  login --ref e2 --value s3cr3t
browserctl click login --ref e3

# or using explicit CSS selectors
browserctl fill  login "input[name=email]"    user@example.com
browserctl click login "button[type=submit]"

5. Observe the result

browserctl snap login --diff       # only changed elements since last snap
browserctl shot login --out /tmp/after-login.png --full
browserctl url  login

6. Manage pages and daemon

browserctl pages
browserctl close login
browserctl ping
browserctl shutdown

All Commands

Browser commands (require browserd running)

Command Description
open <page> [--url URL] Open or focus a named page
close <page> Close a named page
pages List open pages
goto <page> <url> Navigate a page to a URL
fill <page> <selector> <value> Fill an input field by CSS selector
fill <page> --ref <id> --value <v> Fill an input field by snapshot ref
click <page> <selector> Click an element by CSS selector
click <page> --ref <id> Click an element by snapshot ref
`snap [--format ai\ html] [--diff]`
watch <page> <selector> [--timeout N] Poll until selector appears (default timeout: 30s)
shot <page> [--out PATH] [--full] Take a screenshot
url <page> Print current URL
eval <page> <expression> Evaluate a JS expression
pause <page> Pause automation — browser stays live for manual interaction
resume <page> Resume automation after manual action
inspect <page> Open Chrome DevTools for a named page
cookies <page> List all cookies as JSON
set_cookie <page> <name> <value> <domain> Set a cookie (path defaults to /)
clear_cookies <page> Clear all cookies for a page
record start <name> Begin recording commands as a replayable workflow
record stop [--out path] End recording; saves to .browserctl/workflows/ or custom path
record status Show whether a recording is active

Daemon commands

Command Description
ping Check if browserd is alive
shutdown Stop browserd

Workflow commands

Command Description
`run <name\ file.rb> [--key value ...]`
workflows List available workflows
describe <name> Show workflow params and steps

AI Snapshot Format

browserctl snap <page> returns a compact JSON array of interactable elements — designed to be token-efficient for AI agents:

[
  {
    "ref": "e1",
    "tag": "input",
    "text": "",
    "selector": "form > input[name=email]",
    "attrs": {
      "type": "email",
      "name": "email",
      "placeholder": "Enter email"
    }
  },
  {
    "ref": "e2",
    "tag": "button",
    "text": "Sign in",
    "selector": "form > button",
    "attrs": {
      "type": "submit"
    }
  }
]

Use ref values directly with --ref for zero-fragility interactions, or use selector values with fill and click.

Ref-based interaction

After a snap, use ref IDs instead of CSS selectors — no selector knowledge required:

browserctl fill  login --ref e1 --value user@example.com
browserctl click login --ref e2

Diff snapshots

Track only what changed since the last snapshot — useful for AI agents monitoring async updates:

browserctl snap login --diff

Workflows

Workflows are Ruby files using the Browserctl.workflow DSL. Place them in any of:

  • ./.browserctl/workflows/
  • ~/.browserctl/workflows/

Example

# .browserctl/workflows/smoke_login.rb
Browserctl.workflow "smoke_login" do
  desc "Log in and confirm the dashboard loads"

  param :email,    required: true
  param :password, required: true, secret: true
  param :base_url, default: "https://app.example.com"

  step "open login page" do
    page(:login).goto("#{base_url}/login")
  end

  step "submit credentials" do
    page(:login).fill("input[name=email]",    email)
    page(:login).fill("input[name=password]", password)
    page(:login).click("button[type=submit]")
  end

  step "verify dashboard" do
    page(:login).wait_for("[data-test=dashboard]", timeout: 10)
    assert page(:login).url.include?("/dashboard")
  end
end
browserctl run smoke_login --email me@example.com --password s3cr3t

Workflow DSL reference

Method Description
desc "text" Human-readable description
param :name, required:, secret:, default: Declare a parameter
step "label" { } Add a step (runs in order, halts on failure)
step "label", retry_count: N, timeout: S { } Step with retry and/or timeout
page(:name) Returns a PageProxy for the named page
invoke "other_workflow", **overrides Call another workflow
assert condition, "message" Raise WorkflowError if condition is false

PageProxy methods

goto(url) · fill(selector, value) · click(selector) · snapshot(**opts) · screenshot(**opts) · wait_for(selector, timeout: 10) · url · evaluate(expression) · pause · resume · inspect_page · cookies · set_cookie(name, value, domain, path: "/") · clear_cookies


Examples

Ready-to-run smoke tests against the-internet.herokuapp.com are included in examples/the_internet/. See docs/smoke-testing-the-internet.md for annotated output and auto-generated screenshots of each scenario.

For a full guide on building your own workflows, see docs/writing-workflows.md.


How it works

browserd runs as a background process, listening on a Unix socket at ~/.browserctl/browserd.sock. Start multiple named instances for agent isolation:

browserd --name agent-a &
browserd --name agent-b &
browserctl --daemon agent-a open main --url https://app.example.com

It manages a Ferrum (Chrome DevTools Protocol) browser instance with named page handles.

browserctl sends JSON-RPC commands over the socket and prints the result. Workflows run in-process through the same client.

The daemon shuts itself down after 30 minutes of inactivity.


Development

git clone https://github.com/patrick204nqh/browserctl
cd browserctl
bin/setup              # install deps + check for Chrome

bundle exec rspec      # run tests
bundle exec rubocop    # lint

Contributing

See CONTRIBUTING.md · SECURITY.md

License

MIT