Module: Rubino::Config::Defaults
- Defined in: lib/rubino/config/defaults.rb
Overview
Default configuration values for the entire system. These mirror the Rich config structure adapted for Ruby.
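As a rough illustration of how defaults like these are typically consumed, the sketch below merges a user override over a small stand-in defaults hash and resolves a database-path sentinel against a home directory. It is a minimal sketch under assumptions: `SAMPLE_DEFAULTS`, `deep_merge`, and `resolve_database_path` are hypothetical names, the recursive merge is an assumed semantics, and none of this is the actual Rubino `Configuration` code.

```ruby
# Hypothetical miniature of a defaults hash; NOT the real MODULE_DEFAULTS.
SENTINEL_DB_PATH = "<RUBINO_HOME>/rubino.sqlite3" # same string as DEFAULT_DATABASE_PATH

SAMPLE_DEFAULTS = {
  "agent" => { "max_turns" => 90, "max_tool_iterations" => 25 },
  "database" => { "path" => SENTINEL_DB_PATH }
}.freeze

# Recursively merge `overrides` into `defaults`: nested hashes are merged
# key by key; any other override value replaces the default.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, default_value, override_value|
    if default_value.is_a?(Hash) && override_value.is_a?(Hash)
      deep_merge(default_value, override_value)
    else
      override_value
    end
  end
end

# Resolve the sentinel against an already-resolved home directory.
# An explicit (non-sentinel) path is returned verbatim.
def resolve_database_path(configured_path, home)
  return File.join(home, "rubino.sqlite3") if configured_path == SENTINEL_DB_PATH

  configured_path
end

user_config = { "agent" => { "max_tool_iterations" => 40 } }
effective = deep_merge(SAMPLE_DEFAULTS, user_config)

puts effective["agent"]["max_tool_iterations"]
puts effective["agent"]["max_turns"]
puts resolve_database_path(effective["database"]["path"], "/srv/rubino")
```

Passing the home directory in as an argument keeps the sketch independent of how RUBINO_HOME is actually resolved, which this page does not show.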
Constant Summary
- DEFAULT_DATABASE_PATH = "<RUBINO_HOME>/rubino.sqlite3"
Sentinel for the default database path. When config still carries this value, Configuration#database_path resolves it against the resolved home (RUBINO_HOME) instead of a literal ~/.rubino (issue #96).
- HOME_COMMANDS_PATH = "<RUBINO_HOME>/commands"
Sentinel for the user-home commands directory. Resolved at read time (Commands::Loader/Executor) against the resolved home (RUBINO_HOME) instead of a literal ~/.rubino so commands in a custom home are actually discovered (issue #38).
- MODULE_DEFAULTS =
{
  "model" => {
    # Public-gem default is OpenAI gpt-4.1 (maintainer directive): it is
    # the most broadly available provider and needs no special provider
    # block to route — just OPENAI_API_KEY — so a defaults-only config is
    # coherent and the first turn works. MiniMax stays an AVAILABLE wizard
    # choice but is NOT the seeded/recommended default. The onboarding
    # wizard's recommended (first) entry mirrors this exact default.
    # provider "auto" derives the concrete provider from the model id
    # (openai/* → openai); the wizard/auto-detect write an explicit
    # provider when the user/env picks a non-OpenAI backend.
    "default" => "openai/gpt-4.1",
    "provider" => "auto",
    "context_length" => nil,
    # nil = inherit the provider default (Hermes injects no temperature).
    # 0.3 used to be hardcoded but is inert under thinking-on (forced to 1)
    # and only surfaced when thinking was disabled (#414).
    "temperature" => nil,
    # Max output tokens for the anthropic-family path (anthropic_compatible
    # MiniMax, native anthropic, bedrock). ruby_llm defaults the Anthropic
    # max_tokens to 4096, which a reasoning model can exhaust on thinking
    # tokens alone → empty visible text. nil = use the adapter default
    # (16384). providers.<name>.max_tokens overrides per-backend.
    "max_tokens" => nil,
    # Thinking/reasoning token budget for the anthropic-family path. nil =
    # adapter default (8000, the reference "medium"). 0 disables thinking.
    # providers.<name>.thinking_budget overrides per-backend.
    "thinking_budget" => nil,
    # Visible-output headroom (tokens) reserved on top of the thinking
    # budget so the model can think AND answer. Mirrors the reference +4096.
    "max_tokens_text_headroom" => 4096,
    # nil = auto-detect from model_id via LLM::ContentBuilder.supports_vision?.
    # Set to true/false to override (e.g. when running behind a gateway that
    # hides the real upstream model name, like the gateway provider's `auto`).
    "supports_vision" => nil
  },
  "providers" => {
    "openai" => {
      "base_url" => nil,
      # Per-READ socket inactivity (resets on every streamed chunk), NOT a
      # total — this is the agent's first-token + inter-token idle bound,
      # same as the OpenAI/Anthropic SDK default. A silent socket fails
      # within this window and is retried pre-first-token. Raise it for a
      # large local Ollama that cold-loads for minutes before token #1.
      "request_timeout_seconds" => 600,
      "stale_timeout_seconds" => 300,
      # Free-form hash merged verbatim into the OpenAI-style
      # /v1/chat/completions request body (top level). Only honored on the
      # OpenAI-compatible request path; ignored on the anthropic-family
      # path. Empty ⇒ byte-identical request to before. See the gateway
      # block below and docs/configuration.md for the canonical example.
      "extra_body" => {}
    },
    "anthropic" => { "base_url" => nil, "request_timeout_seconds" => 600 },
    "bedrock" => { "region" => "us-east-1", "request_timeout_seconds" => 600 },
    "gemini" => { "request_timeout_seconds" => 600 },
    # Opt-in provider for an OpenAI-compatible gateway. Point it at any
    # gateway that exposes an OpenAI-style /v1/* API: set base_url and
    # api_key and the agent routes everything here regardless of model id.
    # The gateway decides which upstream (OpenAI/Anthropic/…) and model
    # to call. Set model.provider: "gateway" to enable.
    "gateway" => {
      "openai_compatible" => true,
      "assume_model_exists" => true,
      "base_url" => nil,
      "request_timeout_seconds" => 600,
      # Free-form hash merged verbatim into the OpenAI-style
      # /v1/chat/completions request body (deep-merged at the top level via
      # ruby_llm's with_params). Use it to pass provider-specific knobs the
      # adapter does not model natively.
      # The canonical case is suppressing
      # chain-of-thought leakage on oMLX / Qwen-style backends that emit
      # <think> text instead of native tool_calls unless the request
      # carries:
      #
      #   providers:
      #     gateway:
      #       extra_body:
      #         chat_template_kwargs:
      #           enable_thinking: false
      #
      # Only applied on the OpenAI-compatible path; never on the
      # anthropic-family path, and never touches the thinking-budget logic.
      # Empty ⇒ byte-identical request to before.
      "extra_body" => {}
    }
  },
  "auxiliary" => {
    "compression" => {
      "provider" => "main",
      "model" => "",
      "base_url" => nil,
      "timeout" => 120
    },
    "approval" => {
      "provider" => "main",
      "model" => "",
      "base_url" => nil,
      "timeout" => 30
    },
    # Multimodal aux. When set, the `vision` tool delegates here so a
    # text-only primary can still "see" an image. `provider: "main"`
    # reuses the primary's provider/base_url; otherwise both can be
    # overridden. Set `model: "auto-vision"` to let the gateway proxy
    # pick a vision model from the model catalog.
    "vision" => {
      "provider" => "main",
      "model" => "",
      "base_url" => nil,
      "timeout" => 120
    },
    # Document summarization. The `summarize_file` tool delegates here so
    # the raw bytes of a huge file are map-reduced in these aux calls and
    # never enter the main agent context (only the final summary returns).
    # `provider: "main"` reuses the primary's provider/model.
    "summarize" => {
      "provider" => "main",
      "model" => "",
      "base_url" => nil,
      "timeout" => 300
    },
    # Session titling. Deterministic by default (#103): a session is
    # titled by Session::Repository.derive_title with NO model call. When
    # this block names a CONCRETE aux backend distinct from the primary
    # (a non-"main" provider OR a non-empty model), the first user message
    # is instead summarized into a short title via the aux LLM (#45),
    # falling back to the deterministic title on any error/empty result.
    # At the defaults below (provider:"main", model:"") nothing is
    # "configured", so titling stays deterministic.
    "title" => {
      "provider" => "main",
      "model" => "",
      "base_url" => nil,
      "timeout" => 30
    }
  },
  "agent" => {
    # OUTER rail on tool iterations, enforced in IterationBudget alongside
    # max_tool_iterations (#414): the budget caps at min(max_tool_iterations,
    # max_turns). Previously DEAD config (assigned, never read); now wired as
    # a real ceiling. `--max-turns N` overrides max_tool_iterations directly.
    "max_turns" => 90,
    # Per-turn model↔tool round-trip cap. Raised 8→25 (#399): 8 was a
    # rubino-only outlier (the Hermes reference uses 90; peer tools cluster
    # 10–25 for "stop-and-ask"). 25 matches Cursor's tuned interactive cap —
    # high enough that real multi-file tasks finish, low enough to still
    # catch runaways. Kept at 25 (a deliberate prior decision, #414).
    "max_tool_iterations" => 25,
    # At the iteration cap, in INTERACTIVE mode, prompt the user to
    # continue/summarize/abort instead of silently force-summarizing (#399).
    # false forces the old always-summarize behaviour; headless/non-TTY
    # runs ALWAYS force-summarize regardless of this flag (no human to ask).
    "budget_extension_prompt" => true,
    # The "+N" granted by one budget extension at the cap. nil ⇒ use
    # max_tool_iterations (so one extension doubles the runway). Capped by
    # the outer max_turns rail, which extensions do NOT raise — repeated
    # extensions can never bypass the iteration/turn ceiling.
    "budget_extension_step" => nil,
    # Pure SAFETY-NET wall clock on a single turn, NOT a working-time cap
    # (#408). Hermes' IterationBudget has no clock at all; the old 120s
    # KILLED slow-but-legitimate test/build turns mid-work (and was the
    # root that made the #403 budget-extension loop possible). Raised to a
    # backstop only a genuinely runaway turn should ever hit. nil disables.
    "max_turn_seconds" => 600,
    # 5 retries with exponential backoff = 1+2+4+8+16 = 31s total wait.
    # Sized to absorb common provider blips (MiniMax intl in particular
    # has been observed returning "API server error - please try again"
    # for ~15-25 seconds before recovering) without timing out the user.
    "api_max_retries" => 5,
    # Hard ceiling (seconds) on a single full-jitter backoff draw between
    # retries on the ERROR path: delay = min(base*2^(n-1), cap) + jitter.
    # Caps worst-case per-retry wait so a flapping backend can't stall a
    # turn on one sleep. 16 keeps the worst single wait to ~24s (16 +
    # 0.5*16 jitter) instead of the 60s ERROR_PATH ceiling. (Previously
    # declared but NEVER read — the error path hardcoded the 60s cap.)
    "api_retry_backoff_cap_seconds" => 16,
    # Hard TOTAL wall-time budget (seconds) across all error-path retries
    # for one model call. A permanently-unreachable host (resolves but the
    # port is dead → retryable connection timeout) used to burn ~75-110s
    # across 5 retries before giving up. This is a Codex-style "total
    # elapsed" cap: keep retrying genuinely-transient errors, but once the
    # cumulative backoff already spent PLUS the next planned wait would
    # cross this budget, fail fast with a clear "gave up after ~Ns" message
    # instead of stalling the user. Does NOT shorten legitimate recovery
    # inside the window. nil ⇒ no total cap (count-based only).
    "api_retry_total_timeout_seconds" => 30,
    # Higher ceiling used ONLY for overload (529/503) and MiniMax "unknown
    # error" blips: those backends stay overloaded for tens of seconds, so
    # the 16s cap retries too eagerly back into a still-hot endpoint. 60s
    # lets the backoff ride out the overload window (the reference uses 120s).
    "api_retry_backoff_overload_cap_seconds" => 60,
    # In-turn retries for a 200-OK-but-EMPTY model response (no text, no
    # tool calls). After this many re-issues of the same turn the Loop
    # raises EmptyModelResponseError → run marked failed (never a silent
    # "completed but empty"). Mirrors the reference treating an empty/invalid
    # response as retryable-then-terminal.
    "empty_response_max_retries" => 2,
    # Provider/model fallback chain (Slice 7 — Agent::FallbackChain). An
    # ORDERED list of backends to rotate to when the primary keeps failing
    # (invalid/empty responses, rate-limit, overload, exhausted retries).
    # The primary is implicit (index 0); these are the fallbacks tried in
    # order. EMPTY by default → no fallback, behaviour byte-identical to a
    # single-provider setup. Each entry:
    #   { "provider" => "anthropic", "model" => "claude-...",
    #     "base_url" => nil, "api_key" => nil }
    # provider + model are required; base_url/api_key override the
    # providers.<name> config for that entry (custom endpoints). An entry
    # that resolves to the current provider/model/base_url is skipped
    # (dedup) so we never fall back to the backend that just failed.
    "fallback_models" => [],
    "disabled_toolsets" => [],
    "tool_use_enforcement" => "auto"
  },
  "run" => {
    # SSE watchdog: when a run is "running" but no new event has been
    # written for this many seconds, EventsOperation marks it failed and
    # emits a synthetic run.failed frame. Covers cases the executor's
    # rescue can't (model in infinite tool loop, provider stream hung,
    # OS-level thread death). Set to nil to disable.
    "idle_event_timeout" => 300
  },
  "database" => {
    # Sentinel: resolved at read time (Configuration#database_path) to
    # "<resolved home>/rubino.sqlite3" so the DB follows
    # RUBINO_HOME like config/.env/skills do. An explicit override
    # in config.yml replaces this and is used verbatim (issue #96).
    "path" => DEFAULT_DATABASE_PATH
  },
  "paths" => {
    "home" => "~/.rubino",
    "memory" => "~/.rubino/memories",
    "skills" => "~/.rubino/skills",
    "cron" => "~/.rubino/cron",
    "sessions" => "~/.rubino/sessions",
    "logs" => "~/.rubino/logs"
  },
  "ui" => { "adapter" => "cli", "theme" => "default", "verbose" => false },
  "display" => {
    "streaming" => true,
    # Tri-state reasoning render (display.reasoning): "hidden" suppresses
    # thinking entirely, "collapsed" buffers it and commits a one-liner cue
    # ("thought for Ns"), "full" renders the whole reasoning as a dim aside
    # above the answer. Deliberately NOT seeded here (#132): defaults
    # injecting it made the documented legacy display.show_reasoning
    # mapping (true→full, false→hidden, applied only when
    # display.reasoning is unset) unreachable for every config loaded
    # normally. Config::ReasoningPrefs supplies the "collapsed" default
    # when neither key is set.
    "language" => "en",
    "runtime_footer" => { "enabled" => false },
    "interim_assistant_messages" => false,
    # The dim status bar pinned UNDER the chat input (model id + context
    # saturation), refreshed at turn boundaries. Omitted automatically
    # off a TTY or on terminals narrower than 40 columns.
    "statusbar" => true,
    # Head lines of each tool's output shown in the transcript before a
    # dim "… +N lines (full output → context)" marker. DISPLAY-ONLY —
    # the model always receives the full (truncation-capped) output.
    # 0 disables the collapse (old full dump).
    "tool_output_preview_lines" => 3,
    # Cap on the chat input's visual rows: a long/multi-line prompt
    # wraps and grows the input downward up to this many rows, then
    # scrolls vertically (caret kept in view).
    "input_max_rows" => 8
  },
  "paste" => {
    # File-backed paste pipeline (UI::PasteStore). A paste with MORE
    # than collapse_lines lines collapses to a "[Pasted text #N +M
    # lines]" placeholder in the chat input, expanded to the full body
    # when the message is sent (the transcript echo keeps the
    # placeholder).
    # A paste estimated above file_threshold_tokens
    # (chars/4) is written to <home>/sessions/<id>/paste_N.txt instead
    # and the sent message carries a read-tool pointer to it.
    "collapse_lines" => 5,
    # A paste longer than this many CHARS also collapses to the chip, even
    # on a single line — a big one-line paste (long URL/token/minified
    # JSON) would otherwise flood the composer.
    "collapse_chars" => 400,
    "file_threshold_tokens" => 8000
  },
  "notifications" => {
    # Attention signals (UI::Notifier) for the moments the agent needs
    # human eyes: a long turn finishing, an approval prompt, a blocked
    # subagent. CLI-only; never emitted into a pipe.
    "enabled" => true,
    # Ring the terminal bell (BEL). On iTerm2 an OSC 9 escape is also
    # sent so it surfaces as a native macOS notification.
    "bell" => true,
    # Optional shell command spawned non-blocking per event with
    # RUBINO_EVENT (turn_finished|needs_approval|blocked) and
    # RUBINO_MESSAGE in its env — e.g. osascript / notify-send.
    "command" => nil,
    # A turn must run at least this many seconds before its completion
    # notifies; quick turns stay silent.
    "min_turn_seconds" => 10
  },
  "thinking" => {
    # Reasoning effort: off | low | medium | high. Mapped to an Anthropic
    # thinking-token budget (off→0, low→4000, medium→8000, high→16000) on
    # the anthropic-family path. "off" disables thinking. When SET it wins
    # over the model/provider thinking_budget chain; left nil (the default)
    # the budget falls through that chain, whose own default is 8000 — i.e.
    # the effective default effort is already "medium". /think reports
    # "medium" for the nil case.
    "effort" => nil
  },
  "streaming" => {
    "enabled" => true,
    "transport" => "off",
    "edit_interval" => 0.3,
    "buffer_threshold" => 40
  },
  "context" => { "engine" => "compressor", "max_tokens" => nil },
  "compression" => {
    "enabled" => true,
    "threshold" => 0.50,
    "target_ratio" => 0.20,
    "protect_first_n" => 3,
    "protect_last_n" => 20,
    "max_summary_tokens" => 12_000,
    "preserve_tool_pairs" => true
  },
  "memory" => {
    "enabled" => true,
    "backend" => "sqlite",
    "auto_extract" => true,
    # Background session-summary aux-LLM job (SummarizeSessionJob), enqueued
    # once a session passes the message threshold. Gateable like
    # auto_extract / skills.auto_distill so the whole background aux-LLM
    # surface can be turned off together (e.g. an engine-vs-engine
    # benchmark that wants ONLY the task, no side-work). Default on.
    "auto_summarize" => true,
    # Throttle the background aux-LLM memory extraction to ~every N turns
    # instead of EVERY turn (#412), mirroring Hermes' nudge_interval (10):
    # extraction enqueues only when turns-since-last-extract >= this. 10x
    # fewer aux calls + far less of the conversation shipped to the
    # extractor. nil/<=1 = every turn (old behaviour). The extract is also
    # ALWAYS backgrounded off the interactive critical path (never drained
    # inline on the live CLI turn).
    "auto_extract_interval" => 10,
    "auto_save" => true,
    "user_profile_enabled" => true,
    "project_context_enabled" => true,
    "memory_char_limit" => 2200,
    "user_char_limit" => 1375,
    # Ingest/store cap for the live memory set, kept SEPARATE from the
    # injection budget above. `memory_char_limit` only bounds what gets
    # packed into the prompt at RETRIEVAL time; storing facts must not be
    # throttled by it or long multi-session conversations stall once the
    # injection budget fills. `nil` = unbounded ingest (the default).
    "ingest_char_limit" => nil,
    # Bounded retry budget for the aux extraction call on a transient
    # error (429 rate-limit / overloaded / 5xx).
    # Under concurrent load the
    # aux call used to drop the fact on the first RateLimitError; now it
    # backs off and retries up to this many times (honouring Retry-After)
    # before giving up, and the per-session cursor re-feeds the turn next
    # time even then — so memory isn't lost to a transient rate limit.
    "extract_max_retries" => 3,
    # SQLite memory backend tuning. `vector` enables best-effort
    # sqlite-vec/RubyLLM.embed KNN on top of the always-on FTS5 hybrid;
    # off by default so the stock install needs no extra deps. `graph`
    # is the graph-lite 1-hop entity/edge blend (on by default).
    "sqlite" => { "vector" => false, "graph" => true }
  },
  "jobs" => {
    "mode" => "inline",
    "poll_interval" => 2,
    "max_attempts" => 3,
    "retry_backoff_seconds" => 30,
    # How long a CLAIMED (queued -> running) row may stay `running` before
    # it's presumed abandoned and reclaimed (#76). A worker that claims a
    # row and then dies / is quit / hangs leaves it `running` with
    # locked_by set; nothing in the queue scan re-picks a `running` row, so
    # pre-fix it sat forever (0 attempts, queue grew across sessions). The
    # next drain reclaims any row whose lock is older than this lease,
    # bumping attempts so a genuinely stuck job still goes terminal at
    # max_attempts rather than re-running forever. Generous (15 min) so a
    # legitimately slow aux-LLM job is never yanked out from under itself.
    "lock_lease_seconds" => 900
  },
  # Nested-subagent (the `task` delegation tool) caps. A subagent CAN now
  # spawn its own subagents; these three caps bound the tree so depth ×
  # fan-out cannot blow past the process's thread/cost budget. All three are
  # enforced in ONE place — Tools::BackgroundTasks#reserve — which refuses a
  # spawn (the tool then surfaces a clear at-capacity / max-depth message).
  "tasks" => {
    # Max nesting depth. depth 0 = a human/top-level-spawned child; the cap
    # bounds chains of subagents-spawning-subagents. 2 ⇒ human→child→grandchild
    # (no deeper).
    "max_depth" => 2,
    # Max LIVE direct children one node (human/top-level or a single
    # subagent) may have at once.
    "max_children_per_node" => 3,
    # Hard global ceiling on total LIVE subagents across the whole tree.
    "max_concurrent_total" => 8,
    # Per-child budget for BILLED live probes (`probe(live:true)`): how many
    # times an owner may run a one-shot model peek over a single child's
    # transcript. Over budget → the model is told to use the FREE
    # live:false snapshot instead. Free snapshots are unlimited.
    "max_live_probes_per_child" => 5,
    # Bound (seconds) a BLOCKING ask_parent waits before the child
    # self-heals and proceeds with its best judgement (S5a). Matches the
    # approvals wait-timeout default — never "forever".
    "ask_parent_timeout" => 900
  },
  "tools" => {
    # Sandbox write/edit/delete tools to workspace_root (terminal.cwd
    # or Dir.pwd). Set to false to let the model touch any path the
    # process can reach — only do this if you trust the model + the
    # approval flow alone.
    "workspace_strict" => true,
    # Default ON: the agent ships to run inside an isolated per-customer
    # VM where running shell commands is the whole point. The blast radius
    # is the VM, and security.confirm_policy (default dangerous_only) still
    # routes any DangerousPattern command through an approval prompt while
    # safe commands run unprompted (set confirm_policy: confirm_all to gate
    # every command).
    "shell" => true,
    "ruby" => true,
    # OS-level write jail around the shell tool's single Process.spawn
    # (#290/#544). The per-command allowlist is a UX guard-rail, not a
    # boundary; this is the real floor (Security::Sandbox).
    # mode: off | read-only | workspace-write (default workspace-write)
    #   off             — no OS confinement (byte-identical to pre-sandbox)
    #   read-only       — workspace is read-only; writes only to temp/home
    #   workspace-write — writes confined to workspace + temp + ~/.rubino
    # network: allow (slice 1; deny/proxy are slice 3+)
    # extra_writable: extra absolute paths added to the write jail
    # require: false — fail-OPEN when no mechanism exists (default).
    #   Set true to FAIL-CLOSED: shell (foreground AND background)
    #   REFUSES to run when the sandbox is unavailable, for the
    #   paranoid / multi-tenant-host operator (slice 2 Part B).
    # Always-on when a mechanism exists, INCLUDING under --yolo (--yolo
    # skips approval prompts, not the OS write-jail). Fails OPEN with a
    # one-time loud banner where no mechanism is available (old kernel,
    # non-mac/linux); set mode: off to silence it deliberately.
    "sandbox" => {
      "mode" => "workspace-write",
      "network" => "allow",
      "extra_writable" => [],
      "require" => false
    },
    # Default ON, matching Hermes (web tools ship in the default toolset,
    # keyless via the DuckDuckGo backend) (#411). Gated at runtime on
    # backend reachability in Registry#web_backend_available? so an
    # unreachable network DEGRADES gracefully (the tool is hidden / its
    # call returns an error string) rather than crashing a turn.
    "web" => true,
    "memory" => true
  },
  "tool_output" => {
    "max_bytes" => 50_000,
    "max_lines" => 2000,
    "max_line_length" => 2000,
    # Hard RAM ceiling on what the shell tool RETAINS while draining a
    # subprocess pipe (#539). Independent of `max_bytes` (the model-facing
    # truncation budget): an unbounded producer (`cat /dev/zero`, `yes`)
    # emits faster than we can shape, so the capture seam keeps at most a
    # bounded head+tail of this many bytes and KILLS the producer once the
    # cap is hit — RAM stays bounded no matter how much the process emits.
    # Sized well above max_bytes so the downstream truncate still has its
    # full head/tail budget, but small enough to never OOM the agent.
    "capture_max_bytes" => 2_000_000
  },
  # Deterministic, REVERSIBLE compression of tool output, routed through
  # the single Compression::ContentRouter seam in the ToolExecutor. The
  # master `enabled` flag gates the whole seam; the per-type sub-config
  # (`code`, `logs`) tunes each strategy. When on, the router DETECTS the
  # content type and dispatches: a test/build/lint dump → LogCompressor
  # (every failure + summary kept, passing noise dropped); a WHOLE-file
  # Ruby read → skeleton (signatures + small bodies verbatim, large bodies
  # elided behind a pointer that IS a targeted `read offset/limit`). A
  # diff, a grep/search result, and short output PASS THROUGH byte-
  # identical, so exact-string anchors edit/grep rely on are never touched.
  # On any compression the full original is spilled to
  # tool-results/<call_id>.txt and the output ends with a pointer the model
  # can `read` back. OFF by default: with this flag false every tool output
  # is byte-for-byte unchanged. Per-call `compress:false` on read/shell
  # forces verbatim output even when enabled.
  "tool_output_compression" => {
    "enabled" => false,
    "code" => {
      "strategy" => "skeleton",
      # Don't bother skeletonising a file shorter than this many lines —
      # the pointer indirection isn't worth it on small files.
      "min_lines" => 150,
      # Method bodies up to this many lines are kept VERBATIM; only larger
      # bodies are elided behind a pointer.
      "keep_method_body_max_lines" => 8,
      # Which source languages get skeletonised. Remove one ⇒ no compression
      # for it (its whole-file reads pass through verbatim). Ruby uses the
      # built-in Prism parser; other languages added in later slices need
      # their own parser before they can be enabled here.
      "languages" => %w[ruby]
    },
    # LOG/command-output compression (test runs, linters, build/shell
    # dumps). The high-ROI channel: the agent reads command output WHOLE,
    # and the signal (failures + the final tally) is a tiny fraction of the
    # bytes.
    # Keeps every error/failure + summary VERBATIM, drops passing/
    # info noise, appends a pointer to retrieve the original. OFF by default
    # (own flag, independent of `code`) — we measure before flipping it on.
    "logs" => {
      "enabled" => false,
      # Outputs shorter than this pass through UNCHANGED.
      "min_lines" => 40,
      # Hard cap on kept lines.
      "max_total_lines" => 100,
      # Keep every error/failure up to this many (first & last always).
      "max_errors" => 10,
      "max_warnings" => 5,
      "max_stack_traces" => 3,
      # Lines of surrounding context kept around each failure.
      "context_lines" => 4
    },
    # DIFF compression (model-facing `:output` of a `git diff` / unified
    # diff). The human view is the tool `:body` (the full coloured diff in
    # scrollback) and is NEVER touched — only the model's copy is trimmed.
    # No own `enabled` flag (like `code`): active when the master flag is
    # on; the saving guard below is the real gate. Keeps every +/- line and
    # every file/hunk header; trims far context and elides generated/lock
    # files. A small/tight diff passes through byte-identical automatically.
    "diff" => {
      # Unchanged context kept on each side of a change; far context is
      # collapsed into a `… N unchanged lines` marker.
      "context_lines" => 3,
      # Diffs shorter than this pass through UNCHANGED (the common
      # "show me the diff" case the human wants to see verbatim).
      "min_lines" => 40,
      # Only apply when the compressed result is at least this much
      # smaller; otherwise byte-identical passthrough.
      "min_saving" => 0.25,
      # Changed files matching any of these collapse to a one-line summary
      # (`path: +X/-Y lines, N hunks — elided (generated)`). A trailing `/`
      # matches a directory; a `*` glob matches the basename.
      "generated_patterns" => %w[
        *.lock Gemfile.lock package-lock.json yarn.lock pnpm-lock.yaml
        composer.lock *.min.js *.min.css dist/ build/ *.snap vendor/
      ]
    },
    # JSON compression (a whole-output JSON dump from a tool — `curl | jq`,
    # `kubectl get -o json`, `gh api`, `docker inspect`, `aws --output
    # json`, or an MCP/custom-tool JSON result). Modelled on headroom's
    # SmartCrusher: an array of UNIFORM objects folds LOSSLESSLY to a
    # schema header + one compact row per item (the repeated key names are
    # emitted once); a large array whose fold is too thin falls back to
    # LOSSY row selection where error-bearing rows and statistical outliers
    # always survive and dropped rows collapse to an `{"_elided": N}`
    # sentinel; a single large object elides only big string values and
    # never drops a key. No own `enabled` flag (like `code`/`diff`): active
    # when the master flag is on; the saving guard below is the real gate,
    # so small JSON the model wants verbatim passes through byte-identical.
    # Detection runs BEFORE the log channel — a whole-output JSON dump from
    # `shell` routes here, never to the log compressor.
    "json" => {
      # Arrays with fewer than this many items (and text shorter than
      # min_lines) pass through UNCHANGED — small JSON stays verbatim.
      "min_items" => 8,
      # Objects / text shorter than this many lines pass through unchanged.
      "min_lines" => 40,
      # Only apply when the compressed result is at least this much smaller;
      # otherwise byte-identical passthrough.
      "min_saving" => 0.25,
      # A numeric field more than this many standard deviations from its
      # column mean marks a row as a statistical outlier (kept in the lossy
      # fallback).
      "outlier_sigma" => 3.0,
      # In a single large object, string values longer than this collapse
      # to a `<elided N chars>` placeholder (the key is always kept).
      "max_string_chars" => 400
    }
  },
  "file_read" => { "max_chars" => 100_000 },
  "terminal" => {
    "backend" => "local",
    "cwd" => nil,
    "file_sync_enabled" => false,
    "file_sync_max_mb" => 100
  },
  "approvals" => {
    "mode" => "manual",
    # Auto-allow provably READ-ONLY shell commands (ls, pwd, cat, grep,
    # git log, ...) without an approval prompt. The whole line must
    # parse as safe (Security::ReadonlyCommands): no redirection or
    # command/process substitution, every pipe/&&/; segment from the
    # read-only set, no mutating flags (find -exec/-delete, ...).
    # Anything ambiguous still prompts. The hardline floor and
    # permissions:deny always run first, so this never weakens them.
    "auto_allow_readonly" => true,
    # Extra command names (or leading-token prefixes, e.g. "docker ps")
    # merged into the built-in read-only set. The same parse validation
    # applies to every segment.
    "readonly_commands" => [],
    # How long (seconds) a run waits on a human approval/clarification
    # before giving up. On expiry the gate AUTO-DENIES (never approves)
    # and frees the worker thread — an abandoned approval (closed tab, no
    # answer) must not park a server worker indefinitely (W1). A sane
    # bound (15 min), not the old 24h that effectively never released.
    # Set to nil for a truly unbounded wait (interruptible only by an
    # explicit run stop; discouraged on shared servers). While a decision
    # is pending the SSE idle watchdog is suspended for that run
    # (EventsOperation), so the run is never reaped mid-wait.
    "wait_timeout_seconds" => 900
  },
  # SSRF guard for Run::AttachmentDownloader. Only URLs whose host is in
  # this list (case-insensitive) are fetched into the run workspace; the
  # downloader refuses everything else. ENV["ALLOWED_FILE_URL_HOSTS"]
  # (comma-separated) is merged in too, so a downstream consumer can keep
  # using its existing env knob.
  # Loopback hosts (localhost, 127.0.0.1, ::1) are
  # ALWAYS allowed on top of this list, since an HTTP client co-located on the
  # same host produces loopback attachment URLs.
  # Empty list + empty env = only loopback is fetchable.
  "attachments" => {
    "allowed_hosts" => [],
    # Secure-by-default policy for the universal file-attachment handler
    # (Attachments::Classify / Preamble). Every default is on the secure
    # branch; explicit user config wins (Configuration merges over these).
    # Fail closed: oversize / unsafe / disallowed-kind => warn + skip.
    "policy" => {
      # Hard cap on accepted file size, enforced via lstat BEFORE reading.
      "max_file_bytes" => 26_214_400, # 25 MB
      # Inline budget for text files; over budget => head + read-rest note.
      "inline_text_budget_bytes" => 100_000, # ~25k tokens
      # Kinds the handler will process. Deny one by removing it.
      "allow_kinds" => %w[image text document archive binary],
      # Documents are hint-only by default (cost / injection blast radius);
      # the flag is reserved for a future in-process extract path.
      "auto_extract_documents" => false,
      # Decompression-bomb / runaway-conversion caps for the in-process
      # document converters (Documents::Limits). The 25 MB on-disk
      # max_file_bytes is trivially defeated by zip compression (a 100 KB
      # .docx expands to ~34 MB of XML / 1M paragraphs), so the converter
      # caps BEFORE/DURING conversion: a paragraph/row/page/slide count
      # ceiling, an accumulated decompressed-bytes ceiling (also checked
      # against the OOXML central directory BEFORE the gem inflates), and
      # a wall-clock budget. On any cap it bails to the shell-extraction
      # hint instead of hanging / OOM-killing the turn.
      "convert_max_elements" => 50_000,
      "convert_max_decompressed_bytes" => 5_000_000, # ~5 MB extracted text
      "convert_wall_clock_seconds" => 15.0,
      # Routing an image to an EXTERNAL aux model is data egress; on by
      # default to preserve the existing aux-vision behaviour.
      "aux_vision_egress" => true,
      # Caps for any in-process archive listing (hint-only today, so
      # unused unless listing is enabled).
      "archive" => {
        "max_entries" => 2000,
        "max_uncompressed_bytes" => 268_435_456,
        "max_entry_ratio" => 100,
        "max_total_ratio" => 50,
        "max_nesting_depth" => 1
      }
    }
  },
  "security" => {
    # Prompt policy for shell commands not otherwise allowed/denied:
    #   dangerous_only (DEFAULT, reference-faithful) safe commands run
    #     unprompted; only DangerousPatterns matches prompt.
    #   confirm_all (opt-in hardening) every such command prompts.
    # Aligned to Hermes (#409): Hermes has no confirm-policy concept —
    # detect_dangerous_command is its SOLE prompt trigger; non-dangerous
    # commands run unprompted. The old confirm_all default prompted on
    # every npm test / make / ls — huge DX friction. The hardline floor
    # and permissions:deny always precede this regardless of policy, so
    # dangerous_only never weakens the non-bypassable floor. Set
    # confirm_policy: "confirm_all" to restore prompt-on-everything.
    "confirm_policy" => "dangerous_only",
    # EMPTY by default (#409), aligning to Hermes' empty allowlist: once
    # the prompt policy is dangerous_only, safe commands (incl. git status
    # / git diff) already run unprompted via the policy + read-only
    # auto-allow, so the seeded entries were non-load-bearing. A
    # code-loading runner (`bundle exec rspec`, `rake`, `npm test`) is
    # still NOT safely allowlistable (SEC-R2-3: `rspec -r FILE` is RCE);
    # users who want exact-command pre-approval opt in explicitly.
    "command_allowlist" => [],
    # Redact credential VALUES (API keys, tokens, private keys, DB
    # passwords, JWTs…) from tool output before it enters context, the
    # transcript, or the aux model. ON by default (secure default,
    # Hermes #17691). Applied to read / grep / shell / shell_output /
    # shell_tail / summarize_file content via Security::Redactor. Set
    # false ONLY when you need raw credential values in tool output
    # (e.g. working on the redactor itself).
    # NOT a security boundary -
    # the shell runs as the same OS user; this is defense-in-depth.
    "redact_secrets" => true,
    "website_blocklist" => {
      "enabled" => false,
      "domains" => [],
      "shared_files" => []
    }
  },
  # Repeated-identical-tool-call guard (DoomLoopDetector). Aligned to
  # Hermes' tool_guardrails (#414): hard_stop OFF by default (WARN, don't
  # block) and a higher threshold, so a legitimate 3rd retry of an
  # idempotent read is no longer hard-denied. With hard_stop:false the
  # policy surfaces a doom-loop WARNING to the model on the Nth identical
  # call but still lets it through; set hard_stop:true to restore the old
  # block-at-threshold behaviour.
  "doom_loop" => {
    "hard_stop" => false,
    "threshold" => 5
  },
  "privacy" => {
    "redact_pii" => false
  },
  # #552: how long an interactive `question`/clarify waits for the human
  # before it EXPIRES CLEANLY (the agent proceeds with its best judgement),
  # mirroring tasks.ask_parent_timeout and Hermes' agent.clarify_timeout.
  # Generous (10 min) - long enough to read a multi-option menu and answer,
  # short enough that an abandoned prompt eventually unblocks the run. This
  # is the BLOCKING-tool wait bound; the stale-chunk watchdog is separately
  # suspended for the tool's whole runtime (RubyLLMAdapter#stream_once), so
  # it never pre-empts this timeout.
  "clarify" => {
    "timeout" => 600
  },
  "worktree" => {
    "enabled" => false
  },
  # System-prompt layering. Defaults ship the built-in role prompts
  # from lib/rubino/agent/prompts/*.txt. Customers customise via
  # config.yml:
  #   prompts.preamble - single block prepended after the role
  #     identity; the natural place for "You are running inside
  #     <product>" customer context.
  #   prompts.environment.enabled - when true (default) the assembler
  #     injects an [Environment] block with date/OS/cwd/git/runtimes
  #     and the list of CLI utilities found on PATH. Cached per
  #     process - re-probed every boot, not every turn.
  #   prompts.environment.extra_utilities - additional binaries to
  #     probe beyond EnvironmentInspector::DEFAULT_UTILITIES.
  #   prompts.overrides.<role> - full replacement of the built-in
  #     role prompt (escape hatch; prefer preamble for incremental
  #     tweaks).
  #   prompts.prompt_cache - when true (default) the assembler emits
  #     Anthropic prompt-cache breakpoints (cache_control) on the stable
  #     system prefix and the last tool definition, so the fixed prompt
  #     prefix + tool block are cached across turns (#311). The volatile
  #     tail (fresh relevant-memories + post-compaction summary) is kept
  #     AFTER the system breakpoint so the cached bytes stay byte-stable.
  #     Honored by anthropic-family providers; other providers ignore it.
  "prompts" => {
    "preamble" => nil,
    "environment" => {
      "enabled" => true,
      "extra_utilities" => []
    },
    "overrides" => {},
    "prompt_cache" => true
  },
  "quick_commands" => {},
  "mcp" => {
    "servers" => {}
  },
  "skills" => {
    "enabled" => true,
    # Post-turn skill distillation (Variant B). When true, a successful,
    # tool-heavy turn enqueues DistillSkillJob, which spends ONE auxiliary
    # model call to distil a reusable SKILL.md. Mirrors memory.auto_extract:
    # a separate toggle from `enabled` (which only controls whether skills
    # are loaded/usable) so a deployment - or a test that scripts a fixed
    # number of LLM turns - can keep skills usable while turning off the
    # extra background aux call.
    "auto_distill" => true,
    # Throttle post-turn skill distillation to ~every N turns (#414),
    # mirroring memory.auto_extract_interval, so a tool-heavy session
    # doesn't spend an aux-model call every single turn. nil/<=1 = every
    # eligible turn. The job's own deterministic gate still applies on top.
    "auto_distill_interval" => 10,
    # Discover the skills shipped *inside the gem* (skills/<name>/SKILL.md),
    # so every install gets the built-in catalogue (e.g. ruby-expert) with
    # no copy step, on top of the user paths below.
    # Built-ins are scanned
    # first, so a same-named user skill still overrides them. Set false to
    # run with only your own skills.
    "include_builtin" => true,
    "paths" => [
      ".rubino/skills",
      "~/.rubino/skills"
    ]
  },
  "commands" => {
    "paths" => [
      ".rubino/commands",
      HOME_COMMANDS_PATH
    ],
    # When false (default), !`shell` interpolation in command templates is
    # disabled. Set to true only in trusted environments where you explicitly
    # want command templates to execute shell commands.
    "shell_injection_enabled" => false
  },
  "permissions" => {},
  "formatters" => {},
  "agents" => {},
  "server" => {
    "port" => 4820,
    "auth" => false
  },
  "api" => {
    # Hard cap on JSON request bodies. Anything past this (whether
    # advertised by Content-Length or revealed mid-read) is rejected
    # with 413 before the parser allocates the full payload - keeps a
    # multi-GB POST from OOM-killing the process.
    "max_body_bytes" => 5 * 1024 * 1024,
    # Hard cap on multipart upload payload (POST /v1/files). Checked
    # against Content-Length first, then enforced mid-stream so a
    # truncated/missing Content-Length cannot saturate the disk.
    "max_upload_bytes" => 50 * 1024 * 1024,
    # Token-bucket rate limiter. Unauth bucket (per remote IP) protects
    # /v1/health and /v1/metrics from public floods; auth bucket (per
    # bearer token) caps authenticated callers. Storage is in-memory,
    # so multi-process deployments need a shared backend before this
    # gives meaningful protection across workers.
    "rate_limit_enabled" => true,
    "rate_limit_unauth_per_minute" => 60,
    "rate_limit_auth_per_minute" => 600,
    # SAFE BY DEFAULT (#577). The API can execute shell tools, so binding
    # it to a non-loopback address (--host 0.0.0.0 / RUBINO_API_HOST set to
    # anything other than 127.0.0.1 / ::1 / localhost) publishes an
    # RCE-capable surface to the network. TLS is off by default, so the
    # bearer token and all traffic would travel in cleartext.
    # Booting on a
    # non-loopback host is therefore REFUSED unless this is explicitly set
    # to true; loopback binds are unaffected. When you do opt in, enable
    # TLS (RUBINO_TLS=1) + a strong RUBINO_API_KEY and prefer a reverse
    # proxy over exposing the listener directly.
    "allow_public_bind" => false
  }
}.freeze
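The comments in the defaults say that explicit user config wins because Configuration merges over these values. This page does not show that merge, so the following is only a minimal sketch of one way a recursive merge could behave. `deep_merge` is a hypothetical helper written for illustration, and the two hashes are small stand-ins for a slice of the defaults and a user's config file.

```ruby
# Hypothetical helper (not taken from Rubino): recursively merge user
# overrides over defaults. Nested hashes are merged key by key; any other
# value from the override replaces the default.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, default_value, override_value|
    if default_value.is_a?(Hash) && override_value.is_a?(Hash)
      deep_merge(default_value, override_value)
    else
      override_value
    end
  end
end

# Stand-in slice of the defaults shown above.
defaults = {
  "doom_loop" => { "hard_stop" => false, "threshold" => 5 },
  "clarify" => { "timeout" => 600 }
}

# Stand-in user config that overrides a single nested key.
user_config = { "doom_loop" => { "hard_stop" => true } }

merged = deep_merge(defaults, user_config)
puts merged.inspect
```

Under this sketch, overriding `doom_loop.hard_stop` keeps the default `threshold` of 5 and leaves the `clarify` section untouched, which is the behaviour the "user config wins" comment suggests.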
Class Method Summary
- .deep_dup(obj) ⇒ Object
- .dig(*keys) ⇒ Object
- .to_hash ⇒ Object
Deep copy so a Configuration#set on a never-overridden nested section (e.g. display.reasoning) mutates the per-config hash, NOT the shared MODULE_DEFAULTS constant.
- .to_yaml ⇒ Object
Class Method Details
.deep_dup(obj) ⇒ Object
# File 'lib/rubino/config/defaults.rb', line 879

def deep_dup(obj)
  case obj
  when Hash then obj.each_with_object({}) { |(k, v), h| h[k] = deep_dup(v) }
  when Array then obj.map { |v| deep_dup(v) }
  else obj
  end
end
.dig(*keys) ⇒ Object
# File 'lib/rubino/config/defaults.rb', line 891

def dig(*keys)
  MODULE_DEFAULTS.dig(*keys)
end
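Because `.dig` delegates to `Hash#dig`, it follows the standard library's lookup rules. The sketch below uses a small stand-in hash (an assumption, so it runs without the gem) to show three of them: a full path returns the leaf value, a missing intermediate key yields `nil`, and symbol keys do not match the string keys used throughout the defaults.

```ruby
# Stand-in for a slice of MODULE_DEFAULTS; the real constant is not loaded here.
defaults = {
  "server" => { "port" => 4820, "auth" => false },
  "api" => { "rate_limit_auth_per_minute" => 600 }
}.freeze

# A complete path returns the leaf value.
port = defaults.dig("server", "port")

# A missing intermediate key short-circuits to nil instead of raising.
missing = defaults.dig("server", "tls", "cert")

# Keys are strings, so a symbol path finds nothing.
by_symbol = defaults.dig(:server, :port)

# Digging past a non-container leaf raises TypeError.
past_leaf =
  begin
    defaults.dig("server", "port", "extra")
  rescue TypeError
    :type_error
  end

puts [port, missing, by_symbol, past_leaf].inspect
```

Note that `Hash#dig` returns the nested object itself, not a copy, so a caller that mutates a Hash or Array obtained this way would be changing the shared defaults.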
.to_hash ⇒ Object
Deep copy so a Configuration#set on a never-overridden nested section (e.g. display.reasoning) mutates the per-config hash, NOT the shared MODULE_DEFAULTS constant. A shallow .dup left nested section hashes aliased to the constant, so the first /reasoning or /think write poisoned the process-wide default.
# File 'lib/rubino/config/defaults.rb', line 875

def to_hash
  deep_dup(MODULE_DEFAULTS)
end
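The aliasing problem described above can be reproduced in isolation. The sketch below copies the `deep_dup` logic from this page and applies it to a stand-in hash; the `display.reasoning` value of `false` is an assumption chosen for the demonstration, not the gem's actual default. It relies on two standard Ruby behaviours: `freeze` is shallow, and `dup` copies only the outer hash.

```ruby
# Same logic as the deep_dup shown on this page, repeated so the snippet
# runs on its own.
def deep_dup(obj)
  case obj
  when Hash then obj.each_with_object({}) { |(k, v), h| h[k] = deep_dup(v) }
  when Array then obj.map { |v| deep_dup(v) }
  else obj
  end
end

# Stand-in for the shared constant. Only the outer hash is frozen.
shared_a = { "display" => { "reasoning" => false } }.freeze

# Shallow copy: the nested hash is still the same object as in shared_a.
shallow = shared_a.dup
shallow["display"]["reasoning"] = true
leaked = shared_a["display"]["reasoning"] # the write reached the "constant"

# Deep copy: nested hashes are rebuilt, so the write stays local.
shared_b = { "display" => { "reasoning" => false } }.freeze
deep = deep_dup(shared_b)
deep["display"]["reasoning"] = true
isolated = shared_b["display"]["reasoning"]

puts "shallow dup leaked: #{leaked}, deep copy leaked: #{isolated}"
```

One caveat that follows from the `else obj` branch: leaf values such as strings are shared rather than copied, so an in-place string mutation (for example `<<`) would still reach the original. Assigning a new value, as above, is safe.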
.to_yaml ⇒ Object
# File 'lib/rubino/config/defaults.rb', line 887

def to_yaml
  MODULE_DEFAULTS.to_yaml
end
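As a rough illustration of what `.to_yaml` produces, the sketch below serialises a stand-in slice of the defaults (an assumption, so the snippet runs without the gem) and parses it back with `YAML.safe_load`. Since the method dumps `MODULE_DEFAULTS` directly, sentinel strings such as `"<RUBINO_HOME>/commands"` would presumably appear in the output unresolved.

```ruby
require "yaml"

# Stand-in slice of the defaults, including one sentinel-style string.
defaults = {
  "server" => { "port" => 4820, "auth" => false },
  "commands" => {
    "paths" => [".rubino/commands", "<RUBINO_HOME>/commands"],
    "shell_injection_enabled" => false
  }
}.freeze

# Hash#to_yaml is available once the yaml library is loaded.
yaml = defaults.to_yaml

# Round-trip through the safe loader: plain hashes, arrays, strings,
# integers and booleans all survive unchanged.
restored = YAML.safe_load(yaml)

puts yaml
```

The round trip returns an unfrozen, independent structure, so parsed output can be edited freely without touching the original hash.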