RLM.rb

Recursive Language Models for Ruby.

RLM.rb is a Ruby runtime for typed, sandbox-oriented, auditable AI jobs over large application context. It integrates with RubyLLM for provider access and dspy.rb for typed signatures. The current plain Ruby milestone includes the recursive execution spine: prompt loop, file and context mounting, recursive sub-LM calls, typed final output, budget controls, trace events, a RubyLLM LM adapter, a dspy signature adapter, and a minimal trace persistence hook.

Status: Plain Ruby adapter milestone. The released gem is v0.2.0. It includes RLM::Lm::RubyLLM, RLM::Signature::Dspy, RLM::Lm::Mock, RLM::Sandbox::UnsafeInProcess, budget enforcement and budget policies, trace events, recursive predict, prompt building, and a best-effort trace_store callable hook. Rails integration, subprocess/container sandboxing, tools, skills, cache, telemetry, and evals remain future milestones. UnsafeInProcess is dev/test-only and executes generated code in the host Ruby process.

Why

  1. Large context breaks simple prompting.
  2. Manual chunking and summarization are brittle.
  3. Hand-rolled agent loops have unclear state, unclear cost, and poor auditability.

RLM.rb replaces those with a bounded Ruby runtime where the model explores context programmatically, calls smaller typed LLM functions only when needed, and returns validated Ruby objects with a full execution trace.

Install

RLM.rb requires Ruby 3.3 or newer. Ruby 3.2 and older are not supported because dspy.rb is mandatory for the plain Ruby adapter milestone.

Add the gem to your Gemfile:

gem "rlm-rb"

Or install directly:

gem install rlm-rb

Configuration

RLM.configure do |config|
  config.root_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
  config.sub_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")

  config.sandbox = RLM::Sandbox::Mock.new

  config.default_limits = RLM::Limits.new(
    max_iterations: 8,
    max_llm_calls: 25,
    max_tool_calls: 20,
    max_runtime_seconds: 120,
    max_cost_cents: 100,
    max_recursion_depth: 1
  )
end

RLM::Lm::RubyLLM creates a fresh RubyLLM.chat for each runtime LM call. That keeps RLM prompts standalone and prevents conversation history from leaking between root and sub-model calls.
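
The fresh-session-per-call pattern can be sketched in plain Ruby. `FreshChatLm` and `chat_factory` are hypothetical names for illustration, not the adapter's real internals; the point is that every prompt gets a brand-new chat object, so no prior turns can leak between calls:

```ruby
# Sketch of a fresh-session-per-call LM wrapper (illustrative names).
class FreshChatLm
  def initialize(chat_factory:)
    @chat_factory = chat_factory
  end

  # Every prompt is served by a brand-new chat, so no history carries over.
  def call(prompt)
    chat = @chat_factory.call
    chat.ask(prompt)
  end
end

# A toy chat that records its own history, to demonstrate isolation.
class ToyChat
  attr_reader :history

  def initialize
    @history = []
  end

  def ask(prompt)
    @history << prompt
    "reply to: #{prompt}"
  end
end

lm = FreshChatLm.new(chat_factory: -> { ToyChat.new })
lm.call("first prompt")
lm.call("second prompt") # served by a different ToyChat; no shared history
```

Each call could instead reuse one long-lived chat, but then root and sub-model prompts would share conversation state, which is exactly what the adapter avoids.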

Plain Ruby API

require "dspy"
require "rlm"

class InvoiceExtraction < DSPy::Signature
  description "Extract normalized invoice fields from a vendor invoice."

  input do
    const :invoice_text, String
    const :vendor_id, Integer
  end

  output do
    const :vendor_name, String
    const :invoice_number, String
    const :total_cents, Integer
  end
end

RLM.configure do |config|
  config.root_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
  config.sub_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
  config.sandbox = RLM::Sandbox::UnsafeInProcess.new # dev/test only
end

signature = RLM::Signature::Dspy.new(InvoiceExtraction)

result = RLM.predict(
  signature,
  input: {
    invoice_text: "Vendor: Acme\nInvoice: INV-001\nTotal: $100.00",
    vendor_id: 123
  },
  limits: RLM::Limits.new(max_iterations: 8, max_llm_calls: 25)
)

result.output
# => { vendor_name: "Acme", invoice_number: "INV-001", total_cents: 10000 }

result.trace.events.find { |event| event[:type] == :root_lm_called }[:payload][:usage]
# => { model_id: "...", input_tokens: ..., output_tokens: ..., cost_cents: ..., cost_known: true }

Usage metadata is recorded on :root_lm_called and :sub_lm_called trace events when an adapter exposes it; it is not duplicated onto RLM::Result in this milestone. RubyLLM cost helpers can return nil when model pricing is unknown; in that case RLM records cost_known: false, counts 0 cents toward the budget for that call, and cannot enforce max_cost_cents against the unknown provider cost.
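
The cost-accumulation rule can be shown with a small sketch. `accumulated_cost_cents` is a hypothetical helper, not the runtime's actual code; the event shape follows the :root_lm_called payload above:

```ruby
# Hypothetical helper mirroring the aggregation rule: calls with unknown
# provider pricing (cost_known: false) contribute 0 cents to the budget.
def accumulated_cost_cents(events)
  events.sum do |event|
    next 0 unless [:root_lm_called, :sub_lm_called].include?(event[:type])

    usage = event.dig(:payload, :usage) || {}
    usage[:cost_known] ? usage[:cost_cents] : 0
  end
end

events = [
  { type: :root_lm_called, payload: { usage: { cost_cents: 12, cost_known: true } } },
  { type: :sub_lm_called,  payload: { usage: { cost_cents: nil, cost_known: false } } },
  { type: :code_executed,  payload: {} }
]
accumulated_cost_cents(events) # => 12
```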

Run a Live Plain Ruby Example

The gem ships one opt-in live example at examples/plain_ruby_invoice_extraction.rb. By default it exits before provider credential checks, LM configuration, or RLM.predict, even if provider credentials are already present:

bundle exec ruby examples/plain_ruby_invoice_extraction.rb

To run the live path, configure provider credentials and opt in explicitly:

RLM_RUN_LIVE_EXAMPLE=1 OPENAI_API_KEY="$OPENAI_API_KEY" \
  bundle exec ruby examples/plain_ruby_invoice_extraction.rb

The example uses RLM::Lm::RubyLLM for root and sub-LM calls, wraps a real DSPy::Signature with RLM::Signature::Dspy, calls the public RLM.predict(...) API, and prints result status, typed output, trace id, cost, and usage payloads when RubyLLM exposes them. Set RLM_EXAMPLE_MODEL and RLM_EXAMPLE_SUB_MODEL to override the default root and sub-LM models.

The live example uses RLM::Sandbox::UnsafeInProcess, which is dev/test-only and runs generated Ruby code in the host process. Rails integration, subprocess/container sandboxing, tools, skills, evals, telemetry, and production execution examples remain future milestones.

Mock Runtime API

# Duck-typed signature: any class exposing this protocol works, no dspy needed
class InvoiceExtraction
  def self.name = "InvoiceExtraction"
  def self.description = "Extract normalized invoice fields from a vendor invoice."
  def self.input_fields = { invoice_pdf: :file, vendor_id: :integer }
  def self.output_fields = { vendor_name: :string, invoice_number: :string, total_cents: :integer }
  def self.validate_input(input) = input.key?(:vendor_id) ? [] : ["vendor_id is required"]
  def self.validate_output(output) = output.key?(:vendor_name) ? [] : ["vendor_name is required"]
end

# Mock LM for testing (no provider needed)
lm = RLM::Lm::Mock.new(responses: ['<rlm-final>{"vendor_name":"Acme","invoice_number":"INV-001","total_cents":10000}</rlm-final>'])

result = RLM.predict(
  InvoiceExtraction,
  input: { vendor_id: 123 },
  lm: lm,
  sandbox: RLM::Sandbox::UnsafeInProcess.new,  # dev/test only: executes in host process
  limits: RLM::Limits.new(max_iterations: 8, max_llm_calls: 25)
)

result.output           # { "vendor_name" => "Acme", ... }
result.trace            # full event stream
result.cost_cents       # accumulated cost
result.status           # :completed, :budget_exceeded, :failed_validation, ...

dspy Signature Adapter

RLM::Signature::Dspy wraps a DSPy::Signature class behind RLM's internal signature protocol:

  • description
  • input_fields
  • output_fields
  • validate_input
  • validate_output
  • coerce_output

The adapter derives fields and simple validation from dspy JSON schema metadata. Output coercion normalizes parsed JSON/hash output to schema keys before validation.
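
The normalization step can be illustrated with a minimal sketch. `coerce_to_schema` is a hypothetical name, not the adapter's real method; it shows the idea of mapping parsed JSON (string keys) onto the schema's symbol keys and dropping anything undeclared:

```ruby
# Illustrative schema-key normalization: parsed JSON arrives with string
# keys, and only keys declared in the output schema survive coercion.
def coerce_to_schema(raw, schema_keys)
  symbolized = raw.transform_keys(&:to_sym)
  schema_keys.each_with_object({}) do |key, out|
    out[key] = symbolized[key] if symbolized.key?(key)
  end
end

schema_keys = [:vendor_name, :invoice_number, :total_cents]
parsed = { "vendor_name" => "Acme", "total_cents" => 10_000, "extra" => true }
coerce_to_schema(parsed, schema_keys)
# => { vendor_name: "Acme", total_cents: 10000 }
```

Validation then runs against the coerced hash, so a missing :invoice_number surfaces as a validation error rather than a key-shape mismatch.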

Rails

Rails integration is not yet implemented. Rails remains a v2 milestone tracked in docs/postponed-issues.md.

What's Implemented

| Component | Status |
| --- | --- |
| RLM.configure / RLM.config | Ready |
| RLM::Limits with PRD defaults | Ready |
| RLM::File (path / text / io / ActiveStorage blob) | Ready |
| RLM::Context with sandbox-safe manifest | Ready |
| RLM::Trace with NDJSON / JSON export | Ready |
| RLM::Result with full status enum | Ready |
| RLM::Sandbox::Base interface + Mock backend | Ready |
| RLM::Sandbox::UnsafeInProcess | Ready for dev/test only; executes in host process and mutates global streams during serialized capture |
| RLM::Tool base class with category DSL | Ready |
| Error hierarchy | Ready |
| RLM::Predict#call | Delegates to RLM::Runtime |
| RLM::Runtime mock loop | Ready (with RLM::Lm::Mock) |
| RLM::PromptBuilder | Ready (v0.2 contract) |
| RLM::CodeExtractor | Ready |
| RLM::Runtime::Bridge | Ready for runtime-owned subcalls, tools, submission, file reads, and logging |
| Budget enforcement and policies (max_llm_calls, max_sub_lm_calls, max_tool_calls, max_iterations, max_cost_cents, max_runtime_seconds, on_budget_exceeded) | Ready |
| trace_store callable hook | Ready (best-effort; receives terminal RLM::Result) |
| Recursive predict + depth limit | Ready |
| RLM::Lm::RubyLLM provider adapter | Ready |
| RLM::Signature::Dspy signature adapter | Ready |
| Trace usage metadata for RubyLLM calls | Ready |
| RLM::Sandbox::Subprocess | Future milestone |
| Rails Railtie, generator, migrations, ActiveStorage adapter | Future milestone |

The table above reflects the v0.2.0 plain Ruby adapter implementation status.

Rails setup (intended v2 milestone)

The Rails integration is not yet implemented, but the intended setup is:

# config/initializers/rlm.rb
RLM.configure do |config|
  config.root_lm = RLM::Lm::RubyLLM.new(model: Rails.application.credentials.dig(:rlm, :root_model))
  config.sub_lm = RLM::Lm::RubyLLM.new(model: Rails.application.credentials.dig(:rlm, :sub_model))

  config.sandbox = RLM::Sandbox::Subprocess.new   # development
  # config.sandbox = RLM::Sandbox::Docker.new     # production (v0.4)

  config.cache  = Rails.cache
  config.logger = Rails.logger

  config.default_limits = RLM::Limits.new(
    max_iterations: 8,
    max_llm_calls: 25,
    max_tool_calls: 20,
    max_runtime_seconds: 120,
    max_cost_cents: 100,
    max_recursion_depth: 1
  )
end

API keys belong in Rails.application.credentials, not env files. Per RubyLLM's Rails integration, provider keys are picked up automatically when set there.

Error handling

All RLM errors inherit from RLM::Error. Rescue the parent to catch every variant, or rescue specific classes to handle distinct failure modes.

begin
  result = RLM.predict(InvoiceExtraction, input: { invoice_pdf: file })
rescue RLM::BudgetExceededError => e
  # Hard limits hit: max_iterations, max_llm_calls, max_cost_cents, max_runtime_seconds.
  logger.warn("RLM budget exceeded: #{e.message}")
rescue RLM::ValidationError => e
  # Final output failed signature validation after repair attempts were exhausted.
  invoice.update!(needs_review: true, review_reasons: ["validation_failed"])
rescue RLM::SandboxError => e
  # Generated code violated sandbox policy or the sandbox backend crashed.
  raise
rescue RLM::ProviderError => e
  # RubyLLM provider call failed (transient retries already exhausted).
  raise
rescue RLM::ToolError => e
  # A registered tool raised an exception or was called with invalid input.
  raise
rescue RLM::ParseError => e
  # Root LM response could not be parsed into <rlm-code>/<rlm-final>.
  raise
rescue RLM::ConfigurationError => e
  # Missing signature, missing root LM, invalid sandbox, etc.
  raise
rescue RLM::Error => e
  # Catch-all for any other RLM-originated failure.
  raise
end

Soft failures land on result.status instead of raising. Inspect result.success?, result.needs_review?, result.failed?, and result.validation_errors to branch. Budget handling honors limits.on_budget_exceeded: :fail returns :budget_exceeded, :needs_review returns :needs_review, and :return_partial returns :needs_review only when a valid submitted output already exists; otherwise it fails as :budget_exceeded.
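
The on_budget_exceeded rules above can be sketched as a small dispatch. `resolve_budget_status` is a hypothetical helper, not the runtime's actual internals; it just encodes the policy table:

```ruby
# Sketch of on_budget_exceeded resolution (illustrative helper):
#   :fail           -> :budget_exceeded
#   :needs_review   -> :needs_review
#   :return_partial -> :needs_review only when a valid submitted output exists
def resolve_budget_status(policy, valid_partial_output:)
  case policy
  when :fail then :budget_exceeded
  when :needs_review then :needs_review
  when :return_partial
    valid_partial_output ? :needs_review : :budget_exceeded
  else
    raise ArgumentError, "unknown on_budget_exceeded policy: #{policy.inspect}"
  end
end

resolve_budget_status(:return_partial, valid_partial_output: false)
# => :budget_exceeded
```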

| Status | Predicate | Meaning |
| --- | --- | --- |
| :completed | success? | Output valid, ready to use. |
| :needs_review | needs_review? | Budget policy requested review, optionally with a valid submitted partial output. |
| :failed_validation | failed? | Output invalid after validation. |
| :budget_exceeded | failed? | Hit a hard limit with :fail, or :return_partial had no valid submitted output. |
| :sandbox_error | failed? | Sandbox violation or crash. |
| :tool_error | failed? | Tool raised or returned invalid output. |
| :provider_error | failed? | RubyLLM provider failure. |
| :aborted | failed? | Run cancelled by caller. |

Production safety

  • RLM::Sandbox::UnsafeInProcess executes generated code in the host Ruby process. It is dev/test-only and unsafe.
  • UnsafeInProcess captures $stdout/$stderr by mutating process-global streams; capture is serialized with a mutex, but the sandbox remains unsuitable for production and should not be treated as concurrency-safe isolation.
  • The subprocess sandbox is a future milestone for local development.
  • Production deployments should use a container sandbox or remote isolated runner (future milestone).
  • Generated code must not execute inside the host Ruby process in production. The codebase will hold this invariant.
  • Mounted files are data, not instructions; generated code should treat file contents as untrusted input.
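
The global-stream capture pattern mentioned above can be shown in a minimal sketch (simplified and with illustrative names, not the sandbox's actual code). Mutating the process-global stream is exactly why this backend is dev/test-only: every thread in the process sees the swap while capture is in progress, even though a mutex serializes it:

```ruby
require "stringio"

# Mutex-serialized $stdout capture, as a dev/test-only pattern sketch.
CAPTURE_MUTEX = Mutex.new

def capture_stdout
  CAPTURE_MUTEX.synchronize do
    original = $stdout
    $stdout = StringIO.new
    begin
      yield
      $stdout.string
    ensure
      # Always restore the process-global stream, even if the block raises.
      $stdout = original
    end
  end
end

output = capture_stdout { puts "generated code output" }
# output == "generated code output\n"
```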

Development

With a Ruby 3.3+ toolchain active (for example via mise):

bundle install
bundle exec rake test
bundle exec rubocop
bundle exec rake

Contributing

Issues and pull requests welcome at https://github.com/dpaluy/rlm.

API reference

RLM.rb integrates with RubyLLM for provider access and dspy.rb for typed signatures. For provider or signature details, consult those projects' documentation and source.

License

MIT, see LICENSE.txt.