rubino

A coding & automation agent — small, self-contained, and built to run where the work is: directly on your machine or inside a VM. You drop it onto a box and it works there, reachable over a CLI and an HTTP API. It is not a heavy framework; it's a lightweight agent with persistent memory, sessions, and context compaction. Built on ruby_llm.

Why rubino

Runs where the work is — a single gem on the machine (or VM) that holds the code, not a remote service you pipe files to.
Persistent memory — a tiny SQLite fact store that learns about you and the project across sessions.
Context compaction — automatic compression with session lineage when the conversation outgrows the window.
CLI and HTTP API — an interactive terminal session for humans, a bearer-protected JSON + SSE API for programs.
Real tools, gated — read/write/edit, shell, ruby, grep/glob, apply_patch, vision, and more (git, GitHub, and tests run through the hardened shell), behind an approval model with a non-bypassable hardline floor.
Built on ruby_llm — provider-agnostic: MiniMax, OpenAI, Anthropic, Gemini, or an OpenAI-compatible gateway.

Cache-friendly compaction (measured)

A long agent session only stays cheap if the cached prompt prefix survives compaction. rubino is built so that when the conversation is compressed into a summary, the summary lands after the cached head (system + tools + stable history) — so the provider's prompt cache keeps hitting the head instead of re-encoding it cold every time the session is compacted.
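
The placement idea above can be sketched in a few lines of Ruby. This is only an illustration of why appending the summary after the head preserves a shared prefix. `Message`, `assemble_prompt`, and `shared_prefix_length` are hypothetical names, not rubino's internals, and a message-level comparison is only a rough stand-in for what a provider's prompt cache actually matches.

```ruby
# Illustrative sketch only: these names are hypothetical, not rubino's internals.
Message = Struct.new(:role, :content)

# Build the prompt so the stable head is unchanged by a compaction.
# head:    system prompt + tool definitions + stable history (the cached part)
# summary: compressed form of the turns being dropped
# recent:  the most recent turns, kept verbatim
def assemble_prompt(head, summary, recent)
  head + [Message.new("user", "[summary of earlier turns]\n#{summary}")] + recent
end

# Count the leading messages two prompts share: a rough proxy for the
# prefix a provider-side prompt cache could reuse.
def shared_prefix_length(before, after)
  before.zip(after).take_while { |a, b| !b.nil? && a == b }.length
end

head = [
  Message.new("system", "You are a coding agent."),
  Message.new("system", "tools: read, write, edit, shell"),
  Message.new("user", "Fix the failing spec in lib/parser.rb")
]
old_turns = [Message.new("assistant", "Reading the file..."), Message.new("user", "ok")]
recent = [Message.new("assistant", "Patch applied; rerunning tests.")]

before = head + old_turns + recent
after  = assemble_prompt(head, "Read lib/parser.rb; user confirmed the plan.", recent)

puts shared_prefix_length(before, after) # the three head messages are still shared
```

If the summary were instead placed before the head, the shared prefix would drop to zero, which corresponds to the cold re-encode the design is meant to avoid.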

Measured with the model held fixed (local oMLX Qwen3.6-35B-A3B, Anthropic-style cache_control) on a 25-turn coding session that triggers compaction 9 times:

metric	rubino
cached prefix retained right after each compaction	44–94% (survives — never resets to 0)
cumulative cache-read over the whole session	88%
prefix byte-stability across turns	0.95
task solved through all 9 compactions	10/10 hidden tests, 0 wasted work

Holding the model fixed isolates the engine — any difference is the scaffolding (prompt assembly, where the compaction summary is placed, cache breakpoints), not the model. This is a single model and a single scenario: indicative of the design, not a leaderboard. The harness lives in a separate benchmark project.

Tool-output compression (measured)

Test logs, diffs and large command dumps are mostly noise. rubino can route each tool output through a deterministic (no-ML) compressor that keeps the signal and drops the rest — opt-in (tool_output_compression), with a byte-identical passthrough for anything already small and a retrieve_output pointer back to the full text. Token-honest: counts are the exact prompt_tokens reported by the server (local oMLX Qwen3.6-35B-A3B), not chars/4 estimates.

tool output	reduction	fidelity (verified)
rspec full suite (21 failures, ~8k lines)	97%	all 21 failures + the tally kept
`git log --stat` / `ls -R`	94%	boundary/keyword lines kept
large source diff (9 files)	42%	all 575 ± lines, 13 hunks, 9 headers
`package-lock.json` diff (60 bumps)	99%	file header + summary (body elided)
whole-file Ruby read → skeleton	27%	signatures + structure kept
JSON (kubectl / docker / gh, uniform rows)	40–88%	error rows + outliers always kept
rubocop (already signal-dense)	11%	floor — every offense kept

End-to-end A/B on real edit tasks: 12/12 tasks passed with compression ON and OFF — it never broke a task, and every forced-failure run still recovered the single failing line out of a long log. Routing is verified (each output goes to the right strategy) and small inputs pass through byte-identical.
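
As a rough illustration of the passthrough-plus-filter behavior described in this section, here is a minimal sketch of a deterministic compressor. The size threshold, the signal pattern, and the method name are assumptions made for the example. rubino's real strategies are per-tool and more involved.

```ruby
# Hypothetical sketch: threshold, pattern, and method name are assumptions.
SMALL_OUTPUT_BYTES = 2_000
SIGNAL = /\b(?:error|fail(?:ed|ure|ures)?|exception)\b|^\d+ examples?,/i

# Returns [text, compressed?]. Small inputs come back untouched.
def compress_tool_output(text)
  return [text, false] if text.bytesize <= SMALL_OUTPUT_BYTES

  lines = text.lines
  kept = lines.select { |line| line.match?(SIGNAL) }
  dropped = lines.length - kept.length
  note = "[#{dropped} lines elided; full text available via retrieve_output]\n"
  [kept.join + note, true]
end

# A long, mostly passing test log with one failure and the final tally.
noise = (1..500).map { |i| "  passed example #{i}\n" }.join
log = noise + "Failure: expected 1, got 2\n" + "501 examples, 1 failure\n"

compressed, _changed = compress_tool_output(log)
puts compressed
```

The filter is pure pattern matching, so the same input always produces the same output, and anything under the threshold is returned as the very same string.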

Install

One line, Linux and macOS (x86_64 / arm64). Installs a compatible Ruby, then the gem — all in user space, no sudo:

curl -fsSL https://raw.githubusercontent.com/Jhonnyr97/rubino-agent/main/install.sh | bash

The installer supports three methods for getting a compatible Ruby + the gem:

rv (rv) — fetches a precompiled Ruby into user space.
Homebrew (brew install ruby) — offered on macOS when Homebrew is present.
mise (mise) — a polyglot tool manager; installs rubino via its gem: backend and pins the latest published gem version.

On macOS (interactive) you're asked to pick Homebrew / rv / mise; on Linux (interactive) you pick rv / mise (Homebrew is offered only if brew is already on PATH). Skip the prompt with RUBINO_INSTALL_METHOD=brew, =rv, or =mise. For the mise method, RUBINO_INSTALL_SCOPE=global (default, user-wide ~/.config/mise/config.toml) or =local (this directory only, ./mise.toml) chooses the scope.

On Debian 12 and other old-glibc systems, rv would install a musl Ruby that a glibc machine cannot execute. The installer detects this and steers you from rv to mise (precompiled, glibc-correct), so you don't end up with a broken rubino.

Review before you pipe. Piping a script into your shell runs whatever it contains. Read it first:
curl -fsSL https://raw.githubusercontent.com/Jhonnyr97/rubino-agent/main/install.sh -o install.sh
less install.sh && bash install.sh

The installer is idempotent — safe to re-run. It persists the activation / PATH line to your shell rc (.zshrc / .bashrc / .profile) and then runs a fresh-shell verification gate — it opens a clean login shell and fails loudly if rubino isn't on PATH there, instead of merely printing a hint you might miss. Opt out of any rc modification with RUBINO_NO_MODIFY_RC=1 (the installer then prints the line for you to add yourself).

Manual install (if you'd rather not pipe, or already manage Ruby yourself):

# With rv (https://rv.dev):
curl -LsSf https://rv.dev/install | sh
rv ruby install 3.3.3
rv run --ruby 3.3.3 gem install rubino-agent

# Or with any Ruby >= 3.1 already on your PATH:
gem install rubino-agent

Quick Start

rubino setup        # guided first-run: pick a provider, paste a key
rubino chat         # start chatting; ask "what does this project do?"

rubino setup runs an interactive wizard that picks a provider/model and stores your API key — no hand-editing of YAML to get a first answer. If you skip the wizard, a bare rubino chat from a fresh home launches it for you before the first message.

New here? Read docs/getting-started.md — install → setup → first working message.

In development:

git clone https://github.com/Jhonnyr97/rubino-agent.git
cd rubino-agent
bundle install
bundle exec rubino setup
bundle exec rubino chat

Requirements

Ruby >= 3.1
SQLite3
An LLM provider API key (MiniMax, OpenAI, Anthropic, or Google) — or any OpenAI-compatible gateway.

Essential commands

Command	What it does
`rubino setup`	Guided first-run: provider/model/key, config + database
`rubino chat`	Interactive session (bare `chat` auto-resumes your last session)
`rubino chat --new`	Start a fresh session instead of resuming
`rubino prompt "..."`	One-shot, non-interactive (alias for `chat -q`)
`rubino server`	Start the JSON API + SSE server
`rubino doctor`	Check config, credentials, and database health
`rubino tools`	List tools and their enabled/disabled state
`rubino memory list`	Inspect stored memories (uses the active backend)
`rubino version`	Print the version
`rubino update`	Update to the latest published version via RubyGems

rubino update runs gem update rubino-agent under the active interpreter (multi-Ruby safe); for a source/dev checkout it points you back at the installer instead. On interactive boot rubino shows a single dim line when a newer version is available (▸ rubino vX.Y available — run `rubino update`). The check is cached and refreshed out-of-band (once / 24h, short timeout), so it never slows startup and is silent offline. Set RUBINO_NO_UPDATE_CHECK=1 to disable it (it is also off when not a TTY or under CI). It prints nothing until the gem is actually published. See docs/commands.md.

Inside a chat, type /help for the slash commands (/status, /sessions, /memory, /agents, /skills, /mode, /commands, /new, …). The full reference is docs/commands.md.

Configuration

Configuration lives in ~/.rubino/config.yml (created by rubino setup); secrets go in ~/.rubino/.env. Both follow RUBINO_HOME if set. A representative slice (defaults shown):

model:
  default: "openai/gpt-4.1"   # the shipped default — see the note below
  provider: "auto"            # auto | openai | anthropic | bedrock | gemini | minimax | gateway
  temperature: 0.3

agent:
  max_turns: 90
  max_tool_iterations: 25

memory:
  enabled: true
  backend: "sqlite"           # SQLite FTS5 + graph-lite recall (default)
  auto_extract: true

compression:
  enabled: true
  threshold: 0.50

jobs:
  mode: "inline"              # inline | manual | worker

tools:
  workspace_strict: true      # sandbox write/edit/delete to the workspace
  git: true
  shell: true                 # ON by default; every command is still approval-gated
  ruby: true
  web: true                   # ON by default (keyless DuckDuckGo backend); gates BOTH webfetch and websearch
  memory: true

Heads-up on the default model. The shipped model.default is openai/gpt-4.1, which ruby_llm's registry resolves to OpenRouter — so a first run with no OpenAI/OpenRouter key fails fast with guidance instead of hanging. Run rubino setup (the wizard defaults to OpenAI gpt-4.1) or set your provider/key explicitly. See docs/models-and-keys.md.

Full reference (every key, env vars, precedence): docs/configuration.md.
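
For scripts that need to inspect the same file, a small Ruby sketch of locating and reading it might look like the following. It assumes RUBINO_HOME simply replaces the ~/.rubino directory. The helper names are hypothetical, and rubino's own loader (with env-var precedence) is described in docs/configuration.md, not reproduced here.

```ruby
require "yaml"

# Assumption: RUBINO_HOME replaces the ~/.rubino directory wholesale.
def rubino_home(env = ENV)
  env.fetch("RUBINO_HOME") { File.join(Dir.home, ".rubino") }
end

def config_path(env = ENV)
  File.join(rubino_home(env), "config.yml")
end

# Read a nested key such as "tools", "shell"; returns nil when absent.
def read_setting(yaml_text, *keys)
  (YAML.safe_load(yaml_text) || {}).dig(*keys)
end

sample = <<~CONFIG
  compression:
    enabled: true
    threshold: 0.50
  tools:
    shell: true
CONFIG

puts config_path("RUBINO_HOME" => "/srv/rubino")
puts read_setting(sample, "compression", "threshold")
# Against a real install: read_setting(File.read(config_path), "model", "default")
```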

Documentation

Getting started — install → setup → first message
Models & keys — which provider/model/key, per-provider setup blocks
Commands — CLI subcommands + slash-command reference
Configuration — full config + env vars + precedence
Tools — the built-in tool set and approval behavior
Skills — reusable instruction packs, the 3-level disclosure, and SKILL_LOADED observability
Memory — the SQLite memory backend
Security — approval model, hardline floor, TLS
Troubleshooting — keyed on the exact error strings
HTTP API · Jobs & cron · OAuth providers · Architecture
Contributing · Changelog

Built-in tools

The agent ships 27 built-in tools (the set rubino tools lists): read, read_attachment, summarize_file, write, edit, multi_edit, apply_patch, grep, glob, git, github, shell, shell_output, shell_tail, shell_input, shell_kill, ruby, run_tests, web, question, todowrite, memory, session_search, attach_file, vision, skill, task. A single web tool gates both fetching a URL and searching (config key tools.web, on by default via the keyless DuckDuckGo backend; it degrades gracefully when no search backend is reachable). Each tool is gated by a tools.<key> config flag (opt-out) and the approval model. See docs/tools.md.

Skills

Skills are reusable instruction packs (a SKILL.md plus optional bundled reference files) that the agent pulls into context only when relevant — it sees a short index of available skills up front (Level 1), loads a skill's full body on demand (Level 2, which emits the SKILL_LOADED signal), and reads bundled references when needed (Level 3). They live in .rubino/skills / ~/.rubino/skills, are gated by tools.skill, and expose usage/creation metrics. See docs/skills.md.

Fake LLM provider

rubino ships a built-in fake LLM provider for tests, demos, and integration harnesses. Unlike a mocked adapter, the fake provider plugs into the real Agent::Loop, the real ToolExecutor, the real approvals/clarifications pipeline, and the real SSE stream — scenarios fake only what an LLM would produce (content / thinking chunks and tool_call requests). Downstream consumers hit the same surface they would against OpenAI/Anthropic, at zero token cost.

model:
  default: "fake/happy-path"

providers:
  fake:
    scenarios_dir: "~/.rubino/scenarios"  # optional; defaults to built-in

Any model_id starting with fake is auto-routed to the fake provider. Because it can short-circuit tool decisions, it is disabled by default in server and chat — set RUBINO_ALLOW_FAKE=1 to opt in. Production deployments must never set this.
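
A test harness can mirror that rule with a guard of its own before booting a server. The sketch below restates the two conditions from the paragraph above (the fake prefix and the RUBINO_ALLOW_FAKE=1 opt-in). The helper names are hypothetical and rubino's actual routing code may differ.

```ruby
# Hypothetical helpers; rubino's real routing code may differ.
def fake_model?(model_id)
  model_id.to_s.start_with?("fake")
end

# True only when the model is a fake one AND the opt-in is present.
def fake_provider_allowed?(model_id, env = ENV)
  fake_model?(model_id) && env["RUBINO_ALLOW_FAKE"] == "1"
end

puts fake_provider_allowed?("fake/happy-path", "RUBINO_ALLOW_FAKE" => "1")
puts fake_provider_allowed?("fake/happy-path", {})
```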

HTTP API

Start the bearer-protected JSON API server:

export RUBINO_API_KEY="$(openssl rand -hex 32)"
export RUBINO_ENCRYPTION_KEY="$(openssl rand -base64 32)"   # required for OAuth routes
rubino server --port 4820

Every request carries Authorization: Bearer <RUBINO_API_KEY> except GET /v1/health and GET /v1/metrics. The server binds 127.0.0.1 by default — pass --host 0.0.0.0 (or set RUBINO_API_HOST) to expose it, and only do so behind TLS or a trusted segment. For a remote HTTP client the API can serve over a self-signed cert the client pins (RUBINO_TLS=1; read it with rubino tls-cert).
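
A client can encode the header rule above in one place. The following is a minimal sketch using Ruby's standard Net::HTTP. Only /v1/health and /v1/metrics come from this README. The path /v1/example is a placeholder for illustration, and the real routes are listed in docs/api/v1.md.

```ruby
require "net/http"
require "uri"

# The two endpoints this README documents as not requiring a bearer token.
PUBLIC_PATHS = ["/v1/health", "/v1/metrics"].freeze

# Build a GET request, attaching the bearer token unless the path is public.
def build_get(base_url, path, api_key: ENV["RUBINO_API_KEY"])
  uri = URI.join(base_url, path)
  request = Net::HTTP::Get.new(uri)
  request["Authorization"] = "Bearer #{api_key}" unless PUBLIC_PATHS.include?(uri.path)
  request
end

# Send a prepared request; enables TLS when the URL scheme is https.
def send_request(request)
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
end

health = build_get("http://127.0.0.1:4820", "/v1/health")
puts health.path
# With a server running: puts send_request(health).code
```

Pinning the self-signed certificate mentioned above would need extra TLS setup on the Net::HTTP connection, which this sketch leaves out.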

Full request/response shapes, the error envelope, and SSE replay are in docs/api/v1.md.

Planned / on the roadmap

These are designed-in but not fully wired yet — don't depend on them in production:

MCP Support — connect to Model Context Protocol servers via ruby_llm-mcp (docs/mcp.md).
Multi-Agent — Build / Plan / Explore agents with @mention routing (docs/agents.md).

Development

bundle install
bundle exec rspec               # run tests (sequential, with coverage)
bundle exec rake parallel:spec  # run tests across all CPU cores
bundle exec rubino doctor       # verify setup

See CONTRIBUTING.md for the full dev/test/release flow.

License

MIT.