# RLM.rb

Recursive Language Models for Ruby.

RLM.rb is a Ruby runtime for typed, sandbox-oriented, auditable AI jobs over large application context. It integrates with RubyLLM for provider access and dspy.rb for typed signatures. The current plain Ruby milestone includes the recursive execution spine: prompt loop, file and context mounting, recursive sub-LM calls, typed final output, budget controls, trace events, a RubyLLM LM adapter, a dspy signature adapter, and a minimal trace persistence hook.
Status: Plain Ruby adapter milestone. The released gem is v0.2.0. It includes
`RLM::Lm::RubyLLM`, `RLM::Signature::Dspy`, `RLM::Lm::Mock`, `RLM::Sandbox::UnsafeInProcess`, budget enforcement and budget policies, trace events, recursive `predict`, prompt building, and a best-effort `trace_store` callable hook. Rails integration, subprocess/container sandboxing, tools, skills, cache, telemetry, and evals remain future milestones. `UnsafeInProcess` is dev/test-only and executes generated code in the host Ruby process.
## Why
- Large context breaks simple prompting.
- Manual chunking and summarization are brittle.
- Hand-rolled agent loops have unclear state, unclear cost, and poor auditability.
RLM.rb replaces those with a bounded Ruby runtime where the model explores context programmatically, calls smaller typed LLM functions only when needed, and returns validated Ruby objects with a full execution trace.
## Install

RLM.rb requires Ruby 3.3 or newer. Ruby 3.2 and older are not supported because dspy.rb is mandatory for the plain Ruby adapter milestone.

Add the gem to your Gemfile:

```ruby
gem "rlm-rb"
```

Or install directly:

```shell
gem install rlm-rb
```
## Configuration

```ruby
RLM.configure do |config|
  config.root_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
  config.sub_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
  config.sandbox = RLM::Sandbox::Mock.new
  config.default_limits = RLM::Limits.new(
    max_iterations: 8,
    max_llm_calls: 25,
    max_tool_calls: 20,
    max_runtime_seconds: 120,
    max_cost_cents: 100,
    max_recursion_depth: 1
  )
end
```
`RLM::Lm::RubyLLM` creates a fresh `RubyLLM.chat` for each runtime LM call. That keeps RLM prompts standalone and prevents conversation history from leaking between root and sub-model calls.
## Plain Ruby API

```ruby
require "dspy"
require "rlm"

class InvoiceExtraction < DSPy::Signature
  description "Extract normalized invoice fields from a vendor invoice."

  input do
    const :invoice_text, String
    const :vendor_id, Integer
  end

  output do
    const :vendor_name, String
    const :invoice_number, String
    const :total_cents, Integer
  end
end

RLM.configure do |config|
  config.root_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
  config.sub_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
  config.sandbox = RLM::Sandbox::UnsafeInProcess.new # dev/test only
end

signature = RLM::Signature::Dspy.new(InvoiceExtraction)

result = RLM.predict(
  signature,
  input: {
    invoice_text: "Vendor: Acme\nInvoice: INV-001\nTotal: $100.00",
    vendor_id: 123
  },
  limits: RLM::Limits.new(max_iterations: 8, max_llm_calls: 25)
)

result.output
# => { vendor_name: "Acme", invoice_number: "INV-001", total_cents: 10000 }

result.trace.events.find { |event| event[:type] == :root_lm_called }[:payload][:usage]
# => { model_id: "...", input_tokens: ..., output_tokens: ..., cost_cents: ..., cost_known: true }
```
Usage metadata is recorded on `:root_lm_called` and `:sub_lm_called` trace events when an adapter exposes it. It is not duplicated onto `RLM::Result` in this milestone. RubyLLM cost helpers can return `nil` when model pricing is unknown; RLM then records `cost_known: false`, counts 0 cents for that call, and cannot enforce `max_cost_cents` against the unknown provider cost.
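Because unknown pricing contributes zero cents, total known cost can be recomputed from the trace events themselves. A minimal sketch in plain Ruby, using a hypothetical events array shaped like the usage payloads above (real events come from `result.trace.events`; this is not the gem's own accounting code):

```ruby
# Hypothetical events shaped like :root_lm_called / :sub_lm_called payloads.
events = [
  { type: :root_lm_called, payload: { usage: { cost_cents: 3, cost_known: true } } },
  { type: :sub_lm_called,  payload: { usage: { cost_cents: nil, cost_known: false } } },
  { type: :sub_lm_called,  payload: { usage: { cost_cents: 2, cost_known: true } } }
]

# Sum known costs; unknown pricing (cost_known: false) contributes 0 cents.
total_cents = events
  .select { |e| %i[root_lm_called sub_lm_called].include?(e[:type]) }
  .sum { |e| usage = e.dig(:payload, :usage); usage[:cost_known] ? usage[:cost_cents] : 0 }

total_cents # => 5
```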
## Run a Live Plain Ruby Example

The gem ships one opt-in live example at `examples/plain_ruby_invoice_extraction.rb`. By default it exits before provider credential checks, LM configuration, or `RLM.predict`, even if provider credentials are already present:

```shell
bundle exec ruby examples/plain_ruby_invoice_extraction.rb
```

To run the live path, configure provider credentials and opt in explicitly:

```shell
RLM_RUN_LIVE_EXAMPLE=1 OPENAI_API_KEY="$OPENAI_API_KEY" \
  bundle exec ruby examples/plain_ruby_invoice_extraction.rb
```
The example uses `RLM::Lm::RubyLLM` for root and sub-LM calls, wraps a real `DSPy::Signature` with `RLM::Signature::Dspy`, calls the public `RLM.predict(...)` API, and prints result status, typed output, trace id, cost, and usage payloads when RubyLLM exposes them. Set `RLM_EXAMPLE_MODEL` and `RLM_EXAMPLE_SUB_MODEL` to override the default models.
The live example uses `RLM::Sandbox::UnsafeInProcess`, which is dev/test-only and runs generated Ruby code in the host process. Rails integration, subprocess/container sandboxing, tools, skills, evals, telemetry, and production execution examples remain future milestones.
## Mock Runtime API

```ruby
# A plain Ruby class that satisfies RLM's duck-typed signature protocol.
class InvoiceExtraction
  def self.name = "InvoiceExtraction"
  def self.description = "Extract normalized invoice fields from a vendor invoice."
  def self.input_fields = { invoice_pdf: :file, vendor_id: :integer }
  def self.output_fields = { vendor_name: :string, invoice_number: :string, total_cents: :integer }
  def self.validate_input(input) = input.key?(:vendor_id) ? [] : ["vendor_id is required"]
  def self.validate_output(output) = output.key?(:vendor_name) ? [] : ["vendor_name is required"]
end

# Mock LM for testing (no provider needed)
lm = RLM::Lm::Mock.new(responses: ['<rlm-final>{"vendor_name":"Acme","invoice_number":"INV-001","total_cents":10000}</rlm-final>'])

result = RLM.predict(
  InvoiceExtraction,
  input: { vendor_id: 123 },
  lm: lm,
  sandbox: RLM::Sandbox::UnsafeInProcess.new, # dev/test only: executes in host process
  limits: RLM::Limits.new(max_iterations: 8, max_llm_calls: 25)
)

result.output     # { "vendor_name" => "Acme", ... }
result.trace      # full event stream
result.cost_cents # accumulated cost
result.status     # :completed, :budget_exceeded, :failed_validation, ...
```
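The mock response above uses the runtime's final-answer tag. Inside the gem, extraction is handled by `RLM::CodeExtractor`; purely as an illustration of the `<rlm-final>` tag format (not the gem's actual parsing code), a plain Ruby sketch:

```ruby
require "json"

# Illustration only: the gem's own parsing lives in RLM::CodeExtractor.
response = '<rlm-final>{"vendor_name":"Acme","invoice_number":"INV-001","total_cents":10000}</rlm-final>'

# Pull the JSON body out of the final-answer tag and parse it.
if (match = response.match(%r{<rlm-final>(.*?)</rlm-final>}m))
  output = JSON.parse(match[1])
end

output["vendor_name"] # => "Acme"
output["total_cents"] # => 10000
```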
## dspy Signature Adapter

`RLM::Signature::Dspy` wraps a `DSPy::Signature` class behind RLM's internal signature protocol:

- `description`
- `input_fields`
- `output_fields`
- `validate_input`
- `validate_output`
- `coerce_output`

The adapter derives fields and simple validation from dspy JSON schema metadata. Output coercion normalizes parsed JSON/hash output to schema keys before validation.
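As an illustration of that normalization step (a sketch of the idea, not the adapter's actual implementation), coercing parsed string-keyed JSON onto symbol schema keys might look like:

```ruby
require "json"

# Hypothetical schema keys, as a dspy signature's output fields would define them.
schema_keys = %i[vendor_name invoice_number total_cents]

parsed = JSON.parse('{"vendor_name":"Acme","invoice_number":"INV-001","total_cents":10000,"extra":"x"}')

# Keep only schema keys, symbolized, so validation sees a predictable shape;
# unknown keys like "extra" are dropped.
coerced = schema_keys.each_with_object({}) do |key, out|
  out[key] = parsed[key.to_s] if parsed.key?(key.to_s)
end

coerced # => { vendor_name: "Acme", invoice_number: "INV-001", total_cents: 10000 }
```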
## Rails

Rails integration is not yet implemented. Rails remains a v2 milestone tracked in `docs/postponed-issues.md`.
## What's Implemented

| Component | Status |
|---|---|
| `RLM.configure` / `RLM.config` | Ready |
| `RLM::Limits` with PRD defaults | Ready |
| `RLM::File` (path / text / io / ActiveStorage blob) | Ready |
| `RLM::Context` with sandbox-safe manifest | Ready |
| `RLM::Trace` with NDJSON / JSON export | Ready |
| `RLM::Result` with full status enum | Ready |
| `RLM::Sandbox::Base` interface + Mock backend | Ready |
| `RLM::Sandbox::UnsafeInProcess` | Ready for dev/test only; executes in host process and mutates global streams during serialized capture |
| `RLM::Tool` base class with category DSL | Ready |
| Error hierarchy | Ready |
| `RLM::Predict#call` | Delegates to `RLM::Runtime` |
| `RLM::Runtime` mock loop | Ready (with `RLM::Lm::Mock`) |
| `RLM::PromptBuilder` | Ready (v0.2 contract) |
| `RLM::CodeExtractor` | Ready |
| `RLM::Runtime::Bridge` | Ready for runtime-owned subcalls, tools, submission, file reads, and logging |
| Budget enforcement and policies (`max_llm_calls`, `max_sub_lm_calls`, `max_tool_calls`, `max_iterations`, `max_cost_cents`, `max_runtime_seconds`, `on_budget_exceeded`) | Ready |
| `trace_store` callable hook | Ready (best-effort; receives terminal `RLM::Result`) |
| Recursive predict + depth limit | Ready |
| `RLM::Lm::RubyLLM` provider adapter | Ready |
| `RLM::Signature::Dspy` signature adapter | Ready |
| Trace usage metadata for RubyLLM calls | Ready |
| `RLM::Sandbox::Subprocess` | Future milestone |
| Rails Railtie, generator, migrations, ActiveStorage adapter | Future milestone |
The table above reflects the v0.2.0 plain Ruby adapter milestone.
## Rails setup (intended v2 milestone)

The Rails integration is not yet implemented, but the intended setup is:

```ruby
# config/initializers/rlm.rb
RLM.configure do |config|
  config.root_lm = RLM::Lm::RubyLLM.new(model: Rails.application.credentials.dig(:rlm, :root_model))
  config.sub_lm = RLM::Lm::RubyLLM.new(model: Rails.application.credentials.dig(:rlm, :sub_model))
  config.sandbox = RLM::Sandbox::Subprocess.new # development
  # config.sandbox = RLM::Sandbox::Docker.new   # production (v0.4)
  config.cache = Rails.cache
  config.logger = Rails.logger
  config.default_limits = RLM::Limits.new(
    max_iterations: 8,
    max_llm_calls: 25,
    max_tool_calls: 20,
    max_runtime_seconds: 120,
    max_cost_cents: 100,
    max_recursion_depth: 1
  )
end
```
API keys belong in `Rails.application.credentials`, not env files. Per RubyLLM's Rails integration, provider keys are picked up automatically when set there.
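The `Rails.application.credentials.dig(:rlm, ...)` reads in the intended initializer would correspond to a credentials entry along these lines (a hypothetical layout, edited via `bin/rails credentials:edit`; the `rlm:` key names follow the initializer above and the provider key placement follows RubyLLM's Rails conventions):

```yaml
# Hypothetical credentials layout; key names are illustrative.
rlm:
  root_model: gpt-5-mini
  sub_model: gpt-5-mini
openai:
  api_key: <your key>
```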
## Error handling

All RLM errors inherit from `RLM::Error`. Rescue the parent to catch every variant, or rescue specific classes to handle distinct failure modes.

```ruby
begin
  result = RLM.predict(InvoiceExtraction, input: { invoice_pdf: file })
rescue RLM::BudgetExceededError => e
  # Hard limits hit: max_iterations, max_llm_calls, max_cost_cents, max_runtime_seconds.
  logger.warn("RLM budget exceeded: #{e.message}")
rescue RLM::ValidationError => e
  # Final output failed signature validation after repair attempts were exhausted.
  invoice.update!(needs_review: true, review_reasons: ["validation_failed"])
rescue RLM::SandboxError
  # Generated code violated sandbox policy or the sandbox backend crashed.
  raise
rescue RLM::ProviderError
  # RubyLLM provider call failed (transient retries already exhausted).
  raise
rescue RLM::ToolError
  # A registered tool raised an exception or was called with invalid input.
  raise
rescue RLM::ParseError
  # Root LM response could not be parsed into <rlm-code>/<rlm-final>.
  raise
rescue RLM::ConfigurationError
  # Missing signature, missing root LM, invalid sandbox, etc.
  raise
rescue RLM::Error
  # Catch-all for any other RLM-originated failure.
  raise
end
```
Soft failures land on `result.status` instead of raising. Inspect `result.success?`, `result.needs_review?`, `result.failed?`, and `result.validation_errors` to branch. Budget handling honors `limits.on_budget_exceeded`: `:fail` returns `:budget_exceeded`, `:needs_review` returns `:needs_review`, and `:return_partial` returns `:needs_review` only when a valid submitted output already exists; otherwise it fails as `:budget_exceeded`.
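The `on_budget_exceeded` policy mapping described above can be expressed as a small pure function (an illustration of the documented behavior, not the gem's internals; the function name is hypothetical):

```ruby
# Illustration of the documented on_budget_exceeded policy mapping.
def status_after_budget_exceeded(policy, valid_partial_output:)
  case policy
  when :fail           then :budget_exceeded
  when :needs_review   then :needs_review
  # :return_partial only downgrades to review when a valid submitted output exists.
  when :return_partial then valid_partial_output ? :needs_review : :budget_exceeded
  end
end

status_after_budget_exceeded(:fail, valid_partial_output: true)            # => :budget_exceeded
status_after_budget_exceeded(:return_partial, valid_partial_output: true)  # => :needs_review
status_after_budget_exceeded(:return_partial, valid_partial_output: false) # => :budget_exceeded
```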
| Status | Predicate | Meaning |
|---|---|---|
| `:completed` | `success?` | Output valid, ready to use. |
| `:needs_review` | `needs_review?` | Budget policy requested review, optionally with a valid submitted partial output. |
| `:failed_validation` | `failed?` | Output invalid after validation. |
| `:budget_exceeded` | `failed?` | Hit a hard limit with `:fail`, or `:return_partial` had no valid submitted output. |
| `:sandbox_error` | `failed?` | Sandbox violation or crash. |
| `:tool_error` | `failed?` | Tool raised or returned invalid output. |
| `:provider_error` | `failed?` | RubyLLM provider failure. |
| `:aborted` | `failed?` | Run cancelled by caller. |
## Production safety

- `RLM::Sandbox::UnsafeInProcess` executes generated code in the host Ruby process. It is dev/test-only and unsafe.
- `UnsafeInProcess` captures `$stdout`/`$stderr` by mutating process-global streams; capture is serialized with a mutex, but the sandbox remains unsuitable for production and should not be treated as concurrency-safe isolation.
- The subprocess sandbox is a future milestone for local development.
- Production deployments should use a container sandbox or remote isolated runner (future milestone).
- Generated code must not execute inside the host Ruby process in production. The codebase will hold this invariant.
- Mounted files are data, not instructions; generated code should treat file contents as untrusted input.
## Development

```shell
zsh -lc 'source ~/.zshrc && eval "$(mise activate zsh)" && bundle install'
zsh -lc 'source ~/.zshrc && eval "$(mise activate zsh)" && bundle exec rake test'
zsh -lc 'source ~/.zshrc && eval "$(mise activate zsh)" && bundle exec rubocop'
zsh -lc 'source ~/.zshrc && eval "$(mise activate zsh)" && bundle exec rake'
```
## Contributing
Issues and pull requests welcome at https://github.com/dpaluy/rlm.
## API reference

RLM.rb integrates with these upstream libraries. For provider or signature details, go to the source:

- RubyLLM and its chat guide, for provider, chat, token, and cost APIs.
- dspy.rb and its Signatures guide, for typed input/output contracts.
- The Recursive Language Models reference implementation and the DSPy RLM module, for the underlying idea.
## License

MIT, see `LICENSE.txt`.