browserctl

A persistent browser automation daemon and CLI, purpose-built for AI agents and developer workflows.

Unlike tools that restart the browser on every script run, browserctl keeps a named browser session alive — preserving cookies, localStorage, open tabs, and page state across discrete commands.

browserd &                                           # start the daemon (headless)
browserctl open login --url https://example.com/login
browserctl fill login "input[name=email]" me@example.com
browserctl click login "button[type=submit]"
browserctl snap login                                # AI-friendly JSON snapshot
browserctl shutdown

[Screenshot: login flow captured with browserctl shot]


Why browserctl?

Most automation tools are stateless — every script spins up a fresh browser and tears it down. browserctl doesn't.

Feature                            browserctl   Playwright / Selenium
Session persists across commands   ✓            ✗ (per-script lifecycle)
Named page handles                 ✓            ✗
AI-friendly DOM snapshot           ✓            ✗
Lightweight CLI interface          ✓            ✗
Full browser automation API        ✗            ✓
Parallel multi-browser testing     ✗            ✓

Use browserctl when you need a browser that stays alive and remembers state — for AI agents, iterative dev workflows, or lightweight smoke tests.

Use Playwright/Selenium when you need parallel test suites, multi-browser support, or a full programmatic API.


Requirements

  • Ruby >= 3.2
  • Chrome or Chromium installed and on PATH

Installation

gem install browserctl

Or in your Gemfile:

gem "browserctl"

Quick Start

1. Start the daemon

browserd           # headless (default)
browserd --headed  # visible browser window

2. Open a named page

browserctl open login --url https://app.example.com/login

3. Interact with the page

browserctl fill  login "input[name=email]"    user@example.com
browserctl fill  login "input[name=password]" s3cr3t
browserctl click login "button[type=submit]"

4. Observe the result

browserctl snap login              # AI-friendly JSON (default)
browserctl snap login --format html
browserctl shot login --out /tmp/after-login.png --full
browserctl url  login

5. Manage pages and daemon

browserctl pages
browserctl close login
browserctl ping
browserctl shutdown

All Commands

Browser commands (require browserd running)

Command                              Description
open <page> [--url URL]              Open or focus a named page
close <page>                         Close a named page
pages                                List open pages
goto <page> <url>                    Navigate a page to a URL
fill <page> <selector> <value>       Fill an input field
click <page> <selector>              Click an element
snap <page> [--format ai|html]       Print a snapshot of the page (AI-friendly JSON by default, or HTML)
shot <page> [--out PATH] [--full]    Take a screenshot
url <page>                           Print current URL
eval <page> <expression>             Evaluate a JS expression

Daemon commands

Command     Description
ping        Check if browserd is alive
shutdown    Stop browserd

Workflow commands

Command                                 Description
run <name|file.rb> [--key value ...]    Run a workflow by name or from a Ruby file, passing params as --key value
workflows                               List available workflows
describe <name>                         Show workflow params and steps

AI Snapshot Format

browserctl snap <page> returns a compact JSON array of interactable elements — designed to be token-efficient for AI agents:

[
  {
    "ref": "e1",
    "tag": "input",
    "text": "",
    "selector": "form > input[name=email]",
    "attrs": {
      "type": "email",
      "name": "email",
      "placeholder": "Enter email"
    }
  },
  {
    "ref": "e2",
    "tag": "button",
    "text": "Sign in",
    "selector": "form > button",
    "attrs": {
      "type": "submit"
    }
  }
]

Use selector values directly with fill and click.
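As a rough illustration of how an agent or script might consume this output, the sketch below parses a snapshot in the shape shown above and looks up a selector by tag and attributes. The `find_selector` helper is hypothetical and not part of browserctl; the sample JSON is copied from the example rather than captured from a live page.

```ruby
require "json"

# Sample data in the documented shape. In practice this string would be the
# output of `browserctl snap <page>`.
SNAPSHOT_JSON = <<~SNAPSHOT
  [
    {"ref": "e1", "tag": "input", "text": "", "selector": "form > input[name=email]",
     "attrs": {"type": "email", "name": "email", "placeholder": "Enter email"}},
    {"ref": "e2", "tag": "button", "text": "Sign in", "selector": "form > button",
     "attrs": {"type": "submit"}}
  ]
SNAPSHOT

# Hypothetical helper (not a browserctl API): return the selector of the first
# element whose tag and attributes all match, or nil when nothing matches.
def find_selector(elements, tag:, attrs: {})
  match = elements.find do |el|
    el["tag"] == tag &&
      attrs.all? { |key, value| el.fetch("attrs", {})[key.to_s] == value }
  end
  match && match["selector"]
end

elements = JSON.parse(SNAPSHOT_JSON)
puts find_selector(elements, tag: "input",  attrs: { name: "email" })   # form > input[name=email]
puts find_selector(elements, tag: "button", attrs: { type: "submit" })  # form > button
```

The returned strings can then be passed as the `<selector>` argument to `browserctl fill` or `browserctl click`.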


Workflows

Workflows are Ruby files using the Browserctl.workflow DSL. Place them in either of these directories:

  • ./.browserctl/workflows/
  • ~/.browserctl/workflows/

Example

# .browserctl/workflows/smoke_login.rb
Browserctl.workflow "smoke_login" do
  desc "Log in and confirm the dashboard loads"

  param :email,    required: true
  param :password, required: true, secret: true
  param :base_url, default: "https://app.example.com"

  step "open login page" do
    page(:login).goto("#{base_url}/login")
  end

  step "submit credentials" do
    page(:login).fill("input[name=email]",    email)
    page(:login).fill("input[name=password]", password)
    page(:login).click("button[type=submit]")
  end

  step "verify dashboard" do
    page(:login).wait_for("[data-test=dashboard]", timeout: 10)
    assert page(:login).url.include?("/dashboard")
  end
end

browserctl run smoke_login --email me@example.com --password s3cr3t

Workflow DSL reference

Method                                       Description
desc "text"                                  Human-readable description
param :name, required:, secret:, default:    Declare a parameter
step "label" { }                             Add a step (runs in order, halts on failure)
page(:name)                                  Returns a PageProxy for the named page
invoke "other_workflow", **overrides         Call another workflow
assert condition, "message"                  Raise WorkflowError if condition is false

PageProxy methods

goto(url) · fill(selector, value) · click(selector) · snapshot(**opts) · screenshot(**opts) · wait_for(selector, timeout: 10) · url · evaluate(expression)
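To make the "runs in order, halts on failure" behaviour of `step` and `assert` concrete, here is a small stand-in runner. It is only a sketch of the semantics described in the reference above, not the gem's implementation: `MiniWorkflow` and its `log` are hypothetical names, and the `WorkflowError` class is redefined locally so the example runs without browserctl installed.

```ruby
# Local stand-ins for illustration only; the real classes live in the gem.
class WorkflowError < StandardError; end

class MiniWorkflow
  attr_reader :log

  def initialize
    @steps = []
    @log = []
  end

  # Register a labelled step; steps run in the order they were added.
  def step(label, &block)
    @steps << [label, block]
  end

  # Raise WorkflowError when the condition is false.
  def assert(condition, message = "assertion failed")
    raise WorkflowError, message unless condition
  end

  # Run each step in order and stop at the first failure.
  def run
    @steps.each do |label, block|
      instance_exec(&block)
      @log << "ok: #{label}"
    rescue WorkflowError => e
      @log << "failed: #{label} (#{e.message})"
      return false
    end
    true
  end
end

wf = MiniWorkflow.new
wf.step("first")  { assert true }
wf.step("second") { assert 1 + 1 == 3, "math is broken" }
wf.step("third")  { assert true }
wf.run
puts wf.log
# ok: first
# failed: second (math is broken)
```

The third step never runs, which is the behaviour to expect from a failed assertion partway through a workflow.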


Examples

Ready-to-run smoke tests against the-internet.herokuapp.com are included in examples/the_internet/. See docs/smoke-testing-the-internet.md for annotated output and auto-generated screenshots of each scenario.

For a full guide on building your own workflows, see docs/writing-workflows.md.


How it works

browserd runs as a background process, listening on a Unix socket at ~/.browserctl/browserd.sock. It manages a Ferrum (Chrome DevTools Protocol) browser instance with named page handles.

browserctl sends JSON-RPC commands over the socket and prints the result. Workflows run in-process through the same client.
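The request/response round trip can be sketched as follows. This README does not document the wire details, so everything specific here is an assumption: the JSON-RPC 2.0 envelope, newline-delimited framing, and the `"ping"` method name are illustrative guesses, and `build_request` and `rpc_call` are hypothetical helpers. An in-process socket pair stands in for the daemon so the example runs on its own.

```ruby
require "json"
require "socket"

# Build a JSON-RPC 2.0 style request (envelope shape is an assumption).
def build_request(id, method_name, params = {})
  JSON.generate({ jsonrpc: "2.0", id: id, method: method_name, params: params })
end

# Send one newline-delimited request and read one newline-delimited reply.
# Newline framing is assumed for this sketch.
def rpc_call(socket, id, method_name, params = {})
  socket.puts(build_request(id, method_name, params))
  JSON.parse(socket.gets)
end

# A socket pair stands in for browserd. A real client would instead connect
# with UNIXSocket.new(File.expand_path("~/.browserctl/browserd.sock")).
client, server = UNIXSocket.pair
responder = Thread.new do
  request = JSON.parse(server.gets)
  server.puts(JSON.generate({ jsonrpc: "2.0", id: request["id"], result: { ok: true } }))
end

reply = rpc_call(client, 1, "ping") # "ping" is a hypothetical method name
responder.join
puts reply["result"]["ok"] # true
```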

The daemon shuts itself down after 30 minutes of inactivity.
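One simple way to implement an inactivity timeout like this is to record the time of the last command and compare it against a monotonic clock. The sketch below is an assumption about the general technique, not browserd's actual code; `IdleTimer` is a hypothetical class, and the clock is injectable so the logic can be exercised without waiting 30 minutes.

```ruby
# Hypothetical idle tracker (not part of browserctl).
class IdleTimer
  def initialize(timeout_seconds, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @timeout = timeout_seconds
    @clock = clock
    @last_activity = @clock.call
  end

  # Call whenever a command is handled to reset the idle window.
  def touch
    @last_activity = @clock.call
  end

  # True once the timeout has elapsed with no activity.
  def expired?
    @clock.call - @last_activity >= @timeout
  end
end

# Drive the timer with a fake clock measured in seconds.
now = 0.0
timer = IdleTimer.new(30 * 60, clock: -> { now })

now = 29 * 60
puts timer.expired? # false

timer.touch         # a command arrives at minute 29
now += 30 * 60
puts timer.expired? # true
```

A daemon loop could check `expired?` periodically and begin its shutdown sequence once it returns true.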


Development

git clone https://github.com/patrick204nqh/browserctl
cd browserctl
bin/setup              # install deps + check for Chrome

bundle exec rspec      # run tests
bundle exec rubocop    # lint

Contributing

See CONTRIBUTING.md · SECURITY.md

License

MIT