Phronomy
Phronomy is an AI agent framework for Ruby, inspired by existing open-source agent frameworks.
It provides composable building blocks — Workflows, Agents, Tools, Guardrails, RAG, and Tracing — all powered by RubyLLM for LLM abstraction.
Features
Stability labels:
Stable: production-ready, semver-protected API.
Beta: functional, but the API may change in a minor release.
Experimental: subject to breaking changes without notice.
| Feature | Stability |
|---|---|
| Workflow — Stateful, branching workflows with wait_state/send_event | Stable |
| Workflow Parallel Node — Concurrent branches via application-level threads | Beta |
| Agent — ReAct-style tool-calling agents with guardrails and conversation history | Stable |
| Before-Completion Hook — Three-tier LLM parameter injection | Stable |
| Context Management — Token budget calculation, estimation, and pruning | Stable |
| Knowledge/RAG — Retrieval sources with pluggable loaders, splitters, and vector stores | Beta |
| Multi-agent — Agent-as-Tool pattern and hub-and-spoke handoff routing | Beta |
| GeneratorVerifier — Generator-Verifier loop with injectable prompt builders/parsers | Beta |
| Agent::Orchestrator — Parallel subagent dispatch, fan-out, and subagent DSL | Beta |
| Agent::TeamCoordinator — Agent teams pattern: LLM coordinator + stateful worker pool with task queue (worker-local message history per run) | Beta |
| Agent::SharedState — Shared state pattern: peer agents collaborate via a shared KnowledgeStore; member DSL with per-agent instructions and coordination team protocol | Experimental |
| Guardrails — Input/output validation; built-in PII and prompt-injection detectors | Beta |
| Output Parser — JSON and Struct-mapped parsers for structured LLM responses | Stable |
| Eval Framework — Dataset-driven evaluation with multiple scorer types | Beta |
| Tracing — Pluggable span-based observability | Stable |
| MCP Tool — Model Context Protocol server integration | Beta |
Installation
Add to your Gemfile:
gem "phronomy"
Then run:
bundle install
Quick Start
Agent — ReAct tool-calling agent
class WebSearch < Phronomy::Tool::Base
description "Search the web"
param :query, type: :string, desc: "Search query"
def execute(query:)
# ... call a search API
end
end
class ResearchAgent < Phronomy::Agent::Base
model "gpt-4o"
instructions "You are a research assistant. Use tools to answer questions."
tools WebSearch
max_iterations 5
end
result = ResearchAgent.new.invoke("What happened in AI research this week?")
puts result[:output]
Workflow — Stateful workflow with wait_state/send_event
class ReviewContext
include Phronomy::WorkflowContext
field :draft, type: :replace
field :feedback, type: :replace
field :approved, type: :replace, default: false
end
app = Phronomy::Workflow.define(ReviewContext) do
initial :write
state :write, action: ->(s) { s.merge(draft: Writer.call(s)) }
state :review, action: ->(s) { s.merge(feedback: Reviewer.call(s.draft)) }
wait_state :awaiting_approval # halts here for human decision
state :finalize, action: ->(s) { s.merge(approved: true) }
after :write, to: :review
after :review, to: :awaiting_approval
after :finalize, to: :__finish__
event :approve, from: :awaiting_approval, to: :finalize
event :reject, from: :awaiting_approval, to: :write
end
# First run — halts at :awaiting_approval
state = app.invoke({ draft: "" }, config: { thread_id: "doc-1" })
puts "Halted: #{state.halted?}" # => true
puts "Draft: #{state.draft}"
# Resume after human approval — pass the halted state and the event name
final = app.send_event(state: state, event: :approve)
puts "Approved: #{final.approved}" # => true
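The halt-and-resume flow above can be pictured with a small plain-Ruby state machine. This is only a sketch of the idea and not how Phronomy implements it: TinyWorkflow, its tables, and its method names are hypothetical, and the actions are trivial stand-ins for Writer and Reviewer.

```ruby
# Hypothetical plain-Ruby sketch of halt-and-resume; not Phronomy's internals.
class TinyWorkflow
  Result = Struct.new(:state, :data, :halted) do
    def halted?
      halted
    end
  end

  # Automatic transitions taken after a state's action completes.
  AFTER = { write: :review, review: :awaiting_approval, finalize: :finish }.freeze
  # Event-driven transitions out of the wait state.
  EVENTS = {
    approve: { from: :awaiting_approval, to: :finalize },
    reject:  { from: :awaiting_approval, to: :write }
  }.freeze
  WAIT_STATES = [:awaiting_approval].freeze

  # Stand-in actions; each takes the data hash and returns an updated copy.
  ACTIONS = {
    write:    ->(d) { rev = d.fetch(:revision, 0) + 1; d.merge(draft: "draft v#{rev}", revision: rev) },
    review:   ->(d) { d.merge(feedback: "looks fine") },
    finalize: ->(d) { d.merge(approved: true) }
  }.freeze

  # Run from +state+ until a wait state or the finish marker is reached.
  def run(state, data)
    until state == :finish
      return Result.new(state, data, true) if WAIT_STATES.include?(state)
      data = ACTIONS.fetch(state).call(data)
      state = AFTER.fetch(state)
    end
    Result.new(state, data, false)
  end

  # Resume a halted result by applying an event, then keep running.
  def send_event(result, event)
    transition = EVENTS.fetch(event)
    unless transition[:from] == result.state
      raise ArgumentError, "#{event} is not valid in #{result.state}"
    end
    run(transition[:to], result.data)
  end
end

workflow = TinyWorkflow.new
halted = workflow.run(:write, { approved: false })
puts halted.halted?                        # true: stopped at :awaiting_approval
puts workflow.send_event(halted, :approve).data[:approved] # true
```

Sending :reject instead routes back to :write, so the sketch halts at the wait state again with a new draft.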
Multi-Agent — Agent-as-Tool pattern
Wrap sub-agents as Tool::Base subclasses so the orchestrator LLM can call them on demand.
class ResearchTool < Phronomy::Tool::Base
description "Research a topic and return key findings as bullet points."
param :topic, type: :string, desc: "The topic to research"
def execute(topic:)
ResearchAgent.new.invoke(topic)[:output]
end
end
class WriteTool < Phronomy::Tool::Base
description "Write a technical blog post given research notes and a writing brief."
param :instructions, type: :string, desc: "Writing brief including research notes"
def execute(instructions:)
WriterAgent.new.invoke(instructions)[:output]
end
end
class OrchestratorAgent < Phronomy::Agent::Base
model "gpt-4o"
instructions "Use the research tool first, then the write tool to produce a blog post."
tools ResearchTool, WriteTool
end
result = OrchestratorAgent.new.invoke("Write a blog post about Ruby 3.4 features")
puts result[:output]
Guardrails — Input/output validation
class NoSensitiveDataGuardrail < Phronomy::Guardrail::InputGuardrail
def check(input)
fail!("Credit card numbers are not allowed") if input.match?(/\d{4}-\d{4}-\d{4}-\d{4}/)
end
end
agent = ResearchAgent.new
agent.add_input_guardrail(NoSensitiveDataGuardrail.new)
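The pattern in the guardrail above only matches four hyphen-separated groups of four digits. A stricter check might accept spaces or no separators and then apply the Luhn checksum to cut false positives. The sketch below is plain Ruby; CardNumberCheck and its methods are hypothetical helpers, not part of Phronomy, and could be called from a guardrail's check method.

```ruby
# Hypothetical helper; only the regex-plus-Luhn idea is illustrated here.
module CardNumberCheck
  # 13 to 19 digits, optionally separated by single spaces or hyphens.
  CANDIDATE = /\b\d(?:[ -]?\d){12,18}\b/

  # Luhn checksum: double every second digit from the right, subtract 9 from
  # any result above 9, and require the total to be divisible by 10.
  def self.luhn_valid?(digits)
    sum = digits.reverse.chars.each_with_index.sum do |char, index|
      value = char.to_i
      value *= 2 if index.odd?
      value > 9 ? value - 9 : value
    end
    (sum % 10).zero?
  end

  # True when the text contains a digit run that passes the Luhn check.
  def self.contains_card_number?(text)
    text.scan(CANDIDATE).any? { |match| luhn_valid?(match.delete(" -")) }
  end
end

puts CardNumberCheck.contains_card_number?("card: 4111 1111 1111 1111") # true
puts CardNumberCheck.contains_card_number?("order 1234-5678-9012-3456") # false
```

The second input has the right shape but fails the checksum, which is the kind of false positive the plain regex would block.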
Built-in Guardrails — PII and prompt injection detection
# Detect SSNs, credit cards, emails, and phone numbers
agent.add_input_guardrail(Phronomy::Guardrail::Builtin::PIIPatternDetector.new)
# Block common prompt-injection attempts
agent.add_input_guardrail(Phronomy::Guardrail::Builtin::PromptInjectionDetector.new)
Knowledge/RAG — Context injection and vector retrieval
# Static knowledge (policy files, reference docs)
policy = Phronomy::KnowledgeSource::StaticKnowledge.new(
File.read("policy.md"),
type: :policy,
source: "policy.md" # exposed to LLM for citation
)
# RAG retrieval from a vector store
store = Phronomy::VectorStore::InMemory.new
embeddings = Phronomy::Embeddings::RubyLLMEmbeddings.new(model: "text-embedding-3-small")
rag = Phronomy::KnowledgeSource::RAGKnowledge.new(store: store, embeddings: embeddings, k: 5)
# Inject at invocation time
result = MyAgent.new.invoke("What is the refund policy?",
config: { knowledge_sources: [policy, rag] })
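To make the k: 5 parameter concrete, here is a rough plain-Ruby sketch of top-k retrieval by cosine similarity. It assumes the store ranks by cosine similarity, which the README does not state; TinyRetrieval is hypothetical, and the two-dimensional vectors are hand-written stand-ins for real embeddings.

```ruby
# Hypothetical sketch of top-k retrieval; not Phronomy's vector store.
module TinyRetrieval
  # Cosine similarity of two equal-length numeric arrays (0.0 for zero vectors).
  def self.cosine(a, b)
    dot = a.zip(b).sum { |x, y| x * y }
    norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |y| y * y })
    norm.zero? ? 0.0 : dot / norm
  end

  # documents: array of { text:, vector: } hashes.
  # Returns the texts of the k most similar documents, best match first.
  def self.top_k(query_vector, documents, k:)
    documents
      .max_by(k) { |doc| cosine(query_vector, doc[:vector]) }
      .map { |doc| doc[:text] }
  end
end

documents = [
  { text: "refunds",  vector: [1.0, 0.0] },
  { text: "shipping", vector: [0.0, 1.0] },
  { text: "returns",  vector: [0.9, 0.1] }
]
p TinyRetrieval.top_k([1.0, 0.0], documents, k: 2) # ["refunds", "returns"]
```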
Load and split documents with built-in loaders:
chunks = Phronomy::Loader::MarkdownLoader.new.load("docs/guide.md")
.then { |docs| Phronomy::Splitter::RecursiveSplitter.new(chunk_size: 512).split(docs) }
Multi-Agent Handoff — Hub-and-spoke routing
triage = TriageAgent.new
billing = BillingAgent.new
support = SupportAgent.new
runner = Phronomy::Agent::Runner.new(
agents: [triage, billing, support],
routes: { triage => [billing, support] }
)
result = runner.invoke("I need help with my invoice")
puts result[:output] # final answer
puts result[:agent].class # => BillingAgent
Before-Completion Hook — Dynamic LLM parameter injection
# Class-level: applies to all instances
class MyAgent < Phronomy::Agent::Base
model "gpt-4o"
before_completion ->(ctx) { { temperature: ctx.config[:precise] ? 0.0 : 0.7 } }
end
# Instance-level: overrides class hook for this agent only
agent = MyAgent.new
agent.before_completion = ->(ctx) { { max_tokens: 512 } }
# Global: applies to every agent across the app
Phronomy.configure do |c|
c.before_completion = ->(ctx) { { temperature: 0.3 } }
end
Hooks are called in order — global → class → instance — and deep-merged.
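One way to picture the ordering and the deep merge is the plain-Ruby sketch below. It assumes that a later hook wins when two hooks set the same key, which the README does not spell out. HookMerge is hypothetical, and the context is a plain hash here rather than the ctx object used above.

```ruby
# Hypothetical sketch of combining hook results; not Phronomy's internals.
module HookMerge
  # Recursively merge +b+ into +a+; on conflicting non-hash values +b+ wins.
  def self.deep_merge(a, b)
    a.merge(b) do |_key, left, right|
      left.is_a?(Hash) && right.is_a?(Hash) ? deep_merge(left, right) : right
    end
  end

  # Call each hook in order (nil hooks are skipped) and fold the results.
  def self.resolve(hooks, ctx)
    hooks.compact.reduce({}) do |params, hook|
      deep_merge(params, hook.call(ctx) || {})
    end
  end
end

global_hook   = ->(_ctx) { { temperature: 0.3, metadata: { app: "demo" } } }
class_hook    = ->(ctx)  { { temperature: ctx[:precise] ? 0.0 : 0.7 } }
instance_hook = ->(_ctx) { { max_tokens: 512, metadata: { agent: "a1" } } }

params = HookMerge.resolve([global_hook, class_hook, instance_hook], { precise: true })
puts params[:temperature]       # 0.0 (class hook overrides the global value)
puts params[:metadata].keys.size # 2 (nested hashes are merged, not replaced)
```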
GeneratorVerifier — Generator-Verifier loop with custom prompt builders
pipeline = Phronomy::GeneratorVerifier.new(
draft_agent: PolicyDraftAgent,
review_agent: PolicyReviewAgent,
# Full control over the LLM dialogue — supply your own prompts.
draft_prompt_builder: ->(input, feedback) {
base = "Answer precisely: #{input}"
feedback ? "#{base}\n\nPrevious feedback: #{feedback}" : base
},
review_prompt_builder: ->(input, draft, citations) {
"Is this draft accurate? Draft: #{draft}"
},
confidence_threshold: 0.7,
max_iterations: 3,
raise_if_untrusted: false # set true to raise LowConfidenceError
)
result = pipeline.invoke("What is the refund policy?")
puts result.output # final answer
puts result.trusted? # true when confidence >= 0.7
puts result.confidence # Float 0.0–1.0
result.citations.each { |c| puts "#{c[:source]}: #{c[:excerpt]}" }
Optionally inject a custom result parser to decode non-JSON LLM output:
pipeline = Phronomy::GeneratorVerifier.new(
# ... (required params as shown above)
draft_result_parser: ->(text) { my_custom_draft_parser(text) },
review_result_parser: ->(text) { my_custom_review_parser(text) }
)
Raise on low confidence:
begin
result = pipeline.invoke("question")
rescue Phronomy::LowConfidenceError => e
puts "Untrusted (confidence #{e.result.confidence}): #{e.result.output}"
end
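The control flow behind confidence_threshold and max_iterations can be sketched in plain Ruby. The version below is a guess at the general shape of a generator-verifier loop, not Phronomy's implementation: TinyGeneratorVerifier is hypothetical, and the two lambdas stand in for LLM-backed agents.

```ruby
# Hypothetical plain-Ruby sketch of a generator-verifier loop.
module TinyGeneratorVerifier
  Result = Struct.new(:output, :confidence, :iterations, :threshold, keyword_init: true) do
    def trusted?
      confidence >= threshold
    end
  end

  # generator: ->(input, feedback) { draft_string }
  # verifier:  ->(input, draft)    { { confidence: Float, feedback: String } }
  def self.run(input, generator:, verifier:, threshold: 0.7, max_iterations: 3)
    feedback = nil
    draft = nil
    review = { confidence: 0.0 }
    iterations = 0

    max_iterations.times do
      iterations += 1
      draft = generator.call(input, feedback)
      review = verifier.call(input, draft)
      break if review[:confidence] >= threshold # good enough, stop early
      feedback = review[:feedback]              # otherwise revise next round
    end

    Result.new(output: draft, confidence: review[:confidence],
               iterations: iterations, threshold: threshold)
  end
end

generator = ->(input, feedback) { feedback ? "#{input} (revised)" : input.to_s }
verifier = lambda do |_input, draft|
  if draft.end_with?("(revised)")
    { confidence: 0.9, feedback: nil }
  else
    { confidence: 0.4, feedback: "add detail" }
  end
end

result = TinyGeneratorVerifier.run("policy", generator: generator, verifier: verifier)
puts result.trusted?   # true
puts result.iterations # 2
```

When the threshold is never met, the sketch returns the last draft with trusted? false. A caller could raise at that point to mirror raise_if_untrusted.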
Agent::Orchestrator — Parallel subagent dispatch
Note: dispatch_parallel and fan_out use plain Ruby threads and are intended for small-scale fan-out (a handful of subagents). For large-scale parallel dispatch, manage concurrency (thread pools, rate limiting) at the application level.
class ResearchOrchestrator < Phronomy::Agent::Orchestrator
model "gpt-4o"
instructions "Coordinate research tasks by dispatching to specialised agents."
# Each subagent is automatically exposed as an LLM-callable tool.
subagent :searcher, SearchAgent
subagent :summarizer, SummaryAgent, on_error: :skip
end
result = ResearchOrchestrator.new.invoke("Research the latest AI news.")
Programmatic parallel dispatch (no LLM loop):
class MyOrchestrator < Phronomy::Agent::Orchestrator
model "gpt-4o"
instructions "Orchestrate."
def run(query)
# Heterogeneous agents in parallel
results = dispatch_parallel(
{agent: SearchAgent, input: "topic A"},
{agent: AnalysisAgent, input: query}
)
# Fan-out — same agent, multiple inputs
translations = fan_out(agent: TranslationAgent, inputs: %w[Hello World])
results.map { |r| r[:output] }.join("\n")
end
end
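For the larger fan-outs mentioned in the note above, concurrency can be capped with a fixed pool of worker threads. The sketch below uses only Ruby's built-in Queue and Thread. BoundedDispatch is a hypothetical helper, and the block stands in for whatever invokes a subagent.

```ruby
# Hypothetical helper showing bounded concurrency with a fixed worker pool.
module BoundedDispatch
  # Run +block+ for every input using at most +workers+ threads.
  # Results come back in input order; a raised error is captured per item.
  def self.map(inputs, workers: 4, &block)
    queue = Queue.new
    inputs.each_with_index { |input, index| queue << [input, index] }
    queue.close # pop returns nil once the closed queue is empty
    results = Array.new(inputs.size)

    pool = Array.new([workers, inputs.size].min) do
      Thread.new do
        while (job = queue.pop)
          input, index = job
          results[index] =
            begin
              { ok: true, value: block.call(input) }
            rescue StandardError => e
              { ok: false, error: e }
            end
        end
      end
    end
    pool.each(&:join)
    results
  end
end

outcomes = BoundedDispatch.map(%w[Hello World Ruby], workers: 2) { |word| word.upcase }
p outcomes.map { |o| o[:value] } # ["HELLO", "WORLD", "RUBY"]
```

Capturing errors per item keeps one failing input from discarding the other results, which is similar in spirit to the on_error: :skip option shown earlier.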
Workflow Parallel Node — Concurrent branches
Workflows have no built-in parallel node. To run branches concurrently, start application-level Ruby threads inside a state action:
class EnrichContext
include Phronomy::WorkflowContext
field :summary, type: :replace
field :tags, type: :append, default: -> { [] }
end
app = Phronomy::Workflow.define(EnrichContext) do
initial :enrich
state :enrich, action: ->(s) do
results = {}
threads = [
Thread.new { results[:summary] = Summarizer.call(s) },
Thread.new { results[:tags] = Tagger.call(s) }
]
threads.each { |t| t.join(10) } # 10-second timeout
s.merge(summary: results[:summary], tags: Array(results[:tags]))
end
after :enrich, to: :__finish__
end
state = app.invoke({}, config: { thread_id: "t1" })
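Thread#join(limit) returns nil when the limit elapses, so in the example above a slow branch silently leaves its result as nil, and joining the threads one after another can wait up to the limit for each of them. The sketch below applies a single shared deadline and reports which branches timed out. TimedBranches is a hypothetical helper, not part of Phronomy.

```ruby
# Hypothetical helper: run named branches in threads under one shared deadline.
module TimedBranches
  # branches: { name => callable }. Returns [results, timed_out_names].
  # Note: join and value re-raise an exception raised inside a branch.
  def self.run(branches, timeout:)
    threads = branches.transform_values { |callable| Thread.new { callable.call } }
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    results = {}
    timed_out = []

    threads.each do |name, thread|
      remaining = [deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC), 0].max
      if thread.join(remaining) # nil means the limit elapsed first
        results[name] = thread.value
      else
        thread.kill             # stop the straggler; its work is discarded
        timed_out << name
      end
    end
    [results, timed_out]
  end
end

results, late = TimedBranches.run(
  { summary: -> { "short summary" }, tags: -> { sleep 5; %w[never] } },
  timeout: 0.2
)
p results # only the :summary branch finished in time
p late    # [:tags]
```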
Output Parser — Structured LLM responses
# Extract JSON from LLM output (handles Markdown code fences automatically)
parser = Phronomy::OutputParser::JsonParser.new
data = parser.parse("```json\n{\"name\":\"Alice\",\"score\":0.9}\n```")
# => { name: "Alice", score: 0.9 }
# Map JSON directly to a Struct
PersonSchema = Struct.new(:name, :age, keyword_init: true)
parser = Phronomy::OutputParser::StructuredParser.new(PersonSchema)
person = parser.parse('{"name":"Alice","age":30}')
# => #<struct PersonSchema name="Alice", age=30>
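For readers curious what fence handling and Struct mapping involve, here is a plain-Ruby approximation built on the standard json library. It is a sketch under assumptions, not Phronomy's parser: TinyParser, its regex, and the choice to ignore unknown keys are all hypothetical.

```ruby
require "json"

# Hypothetical sketch of fence stripping and Struct mapping.
module TinyParser
  # Captures the body of the first Markdown code fence, with or without "json".
  FENCE = /```(?:json)?\s*(.*?)\s*```/m

  # Parse JSON, unwrapping a Markdown code fence when one is present.
  def self.parse_json(text)
    body = text[FENCE, 1] || text
    JSON.parse(body, symbolize_names: true)
  end

  # Build a keyword_init Struct from the parsed hash, ignoring unknown keys.
  def self.parse_struct(text, struct_class)
    data = parse_json(text)
    struct_class.new(**data.slice(*struct_class.members))
  end
end

person_schema = Struct.new(:name, :age, keyword_init: true)
person = TinyParser.parse_struct("```json\n{\"name\":\"Alice\",\"age\":30}\n```", person_schema)
puts person.name # Alice
puts person.age  # 30
```

The input is written as a double-quoted string so that \n is a real newline. In a single-quoted Ruby string it would stay a literal backslash followed by n.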
Eval Framework — Dataset-driven quality evaluation
dataset = Phronomy::Eval::Dataset.from_array([
{ input: "Capital of France?", expected: "Paris" },
{ input: "Capital of Japan?", expected: "Tokyo" }
])
agent = MyGeographyAgent.new
runner = Phronomy::Eval::Runner.new(
scorer: Phronomy::Eval::Scorer::LlmJudge.new(model: "gpt-4o-mini")
)
results = runner.run(dataset, ->(q) { agent.invoke(q) })
metrics = Phronomy::Eval::Metrics.new(results)
puts "Mean score: #{metrics.mean_score}" # Float 0.0–1.0
puts "Pass rate: #{metrics.pass_rate}" # fraction with score >= threshold
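The two summary metrics are simple to state in code. The sketch below pairs them with a deterministic substring scorer as a stand-in for the LLM judge. TinyEval is hypothetical, and the 0.5 default threshold is an assumption, since the README does not give Phronomy's default.

```ruby
# Hypothetical sketch of a substring scorer and the two summary metrics.
module TinyEval
  # Score 1.0 when the expected answer appears in the output, else 0.0.
  def self.contains_score(output, expected)
    output.to_s.downcase.include?(expected.to_s.downcase) ? 1.0 : 0.0
  end

  # Arithmetic mean of the scores (0.0 for an empty list).
  def self.mean_score(scores)
    scores.empty? ? 0.0 : scores.sum / scores.size.to_f
  end

  # Fraction of scores at or above +threshold+ (0.0 for an empty list).
  def self.pass_rate(scores, threshold: 0.5)
    scores.empty? ? 0.0 : scores.count { |s| s >= threshold } / scores.size.to_f
  end
end

scores = [
  TinyEval.contains_score("The capital is Paris.", "Paris"),
  TinyEval.contains_score("Kyoto", "Tokyo")
]
puts TinyEval.mean_score(scores) # 0.5
puts TinyEval.pass_rate(scores)  # 0.5
```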
Tracing — Custom observability
Phronomy.configure do |c|
c.tracer = MyCustomTracer.new # any Phronomy::Tracing::Base subclass
end
MCP Tool — External tool servers
search_tool = Phronomy::Tool::McpTool.from_server(
"stdio://./mcp-server",
tool_name: "web_search"
)
Call close when the tool is no longer needed to shut down the underlying
child process (stdio transport) or release the HTTP connection:
search_tool.close
Conversation History — passing prior messages
Phronomy does not manage conversation history internally. The application owns the
message array and passes it in via the messages: keyword argument:
# First turn
result1 = MyAgent.new.invoke("Hello! I'm Alice.", thread_id: "session-1")
messages = result1[:messages] # Array<RubyLLM::Message>
# Second turn: pass prior messages so the agent has context
result2 = MyAgent.new.invoke(
"What is my name?",
messages: messages,
thread_id: "session-1"
)
puts result2[:output] # => "Your name is Alice."
result[:messages] contains the complete message history after each invocation.
Persist it however suits your application (in-memory hash, Redis, ActiveRecord, etc.).
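As one illustration of the in-memory option, the sketch below keeps histories in a hash keyed by thread id. HistoryStore and its method names are hypothetical, and plain strings stand in for message objects.

```ruby
# Hypothetical in-memory store keyed by thread id; swap for Redis or a database.
class HistoryStore
  def initialize
    @threads = {}
    @lock = Mutex.new
  end

  # Return a copy so callers cannot mutate stored history by accident.
  def messages_for(thread_id)
    @lock.synchronize { (@threads[thread_id] || []).dup }
  end

  # Replace the stored history with the full array returned by the agent.
  def save(thread_id, messages)
    @lock.synchronize { @threads[thread_id] = messages.dup }
  end
end

history = HistoryStore.new
history.save("session-1", ["Hello! I'm Alice.", "Nice to meet you, Alice."])
puts history.messages_for("session-1").size # 2
puts history.messages_for("session-2").size # 0
```

On each turn the application would load the array with messages_for, pass it via messages:, and save the returned history afterwards.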
Configuration
Phronomy.configure do |c|
c.default_model = "gpt-4o-mini"
c.recursion_limit = 25
c.tracer = Phronomy::Tracing::NullTracer.new
c.before_completion = nil # optional; global hook lambda
end
Context Management
Phronomy includes a context window management layer so agents automatically stay within the token limits of the underlying model.
TokenBudget
Derives the effective token budget from RubyLLM's model registry:
budget = Phronomy::Context::TokenBudget.new(
model: "claude-3-5-sonnet-20241022", # looks up context_window + max_output_tokens
overhead: 500 # extra reservation for tool definitions
)
budget.context_window # => 200_000
budget.max_output_tokens # => 8_192
budget.effective_input_limit # => 191_308
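The sample values are consistent with subtracting the output reservation and the overhead from the context window. The helper below is a hypothetical plain-Ruby restatement of that arithmetic, inferred from the numbers above rather than taken from TokenBudget's source.

```ruby
# Hypothetical helper mirroring the arithmetic implied by the sample values.
module BudgetMath
  # Tokens left for input after reserving room for the reply and overhead.
  def self.effective_input_limit(context_window:, max_output_tokens:, overhead: 0)
    limit = context_window - max_output_tokens - overhead
    raise ArgumentError, "reservations exceed the context window" if limit.negative?
    limit
  end
end

puts BudgetMath.effective_input_limit(
  context_window: 200_000, max_output_tokens: 8_192, overhead: 500
) # 191308
```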
Or supply explicit values (useful for local / unregistered models):
budget = Phronomy::Context::TokenBudget.new(
context_window: 32_768,
max_output_tokens: 4_096
)
Agent DSL extensions
class MyAgent < Phronomy::Agent::Base
model "gpt-4o"
max_output_tokens 4096 # override max_output_tokens from registry
context_overhead 600 # extra reservation for system prompt + tools
end
Agent::Base#invoke builds a TokenBudget automatically. When the model is not in the
registry, the budget is silently skipped.
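Pruning against a budget could look roughly like the sketch below, which keeps system messages and then as many of the newest messages as fit. TinyPruner is hypothetical. The four-characters-per-token estimate is a rough heuristic and an assumption, not a real tokenizer and not necessarily how Phronomy estimates or prunes.

```ruby
# Hypothetical pruning sketch; the chars/4 estimate is a rough heuristic.
module TinyPruner
  def self.estimate_tokens(text)
    (text.length / 4.0).ceil
  end

  # messages: array of { role:, content: } hashes, oldest first.
  # Keeps system messages, then the newest other messages that fit +limit+.
  def self.prune(messages, limit:)
    system, rest = messages.partition { |m| m[:role] == :system }
    used = system.sum { |m| estimate_tokens(m[:content]) }
    kept = []
    rest.reverse_each do |message|
      cost = estimate_tokens(message[:content])
      break if used + cost > limit # older messages are dropped from here on
      used += cost
      kept.unshift(message)
    end
    system + kept
  end
end

conversation = [
  { role: :system,    content: "a" * 40 },  # about 10 tokens
  { role: :user,      content: "b" * 400 }, # about 100 tokens
  { role: :assistant, content: "c" * 40 },
  { role: :user,      content: "d" * 40 }
]
p TinyPruner.prune(conversation, limit: 50).map { |m| m[:role] }
# [:system, :assistant, :user]
```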
Examples
Runnable examples covering all major features are available in the phronomy-examples repository.
Each example lives in its own numbered directory and can be run with:
bundle exec ruby NN_example_name/run.rb
| # | Directory | What it demonstrates |
|---|---|---|
| 01 | 01_basic_chain/ | PromptTemplate → LLMChain pipeline |
| 02 | 02_react_agent/ | ReAct tool-calling agent |
| 03 | 03_state_graph/ | Stateful workflow with wait_state/send_event |
| 04 | 04_interrupt_resume/ | Human-in-the-loop wait_state and resume |
| 05 | 05_multi_agent/ | Multi-agent coordination via Agent-as-Tool |
| 06 | 06_guardrails/ | Input/output guardrails |
| 07 | 07_tracing/ | Custom observability with Langfuse tracer |
| 08 | 08_mcp_tool/ | MCP tool integration |
| 10 | 10_context_management/ | Token budget and context pruning |
| 11 | 11_agent_streaming/ | Streaming agent responses |
| 12 | 12_prompt_template/ | Advanced prompt templates |
| 13 | 13_mcp_http_tool/ | HTTP-based MCP tool server |
| 14 | 14_code_review/ | Automated code review agent |
| 16 | 16_before_completion_hook/ | Global/class/instance before_completion hooks |
| 17 | 17_multi_agent_handoff/ | Hub-and-spoke agent routing via Runner |
The following examples are app-level demos (Rails apps or advanced pipelines) that require additional infrastructure (a running Rails server, database, etc.):
| # | Directory | What it demonstrates |
|---|---|---|
| 09 | 09_rails_chat/ | Rails chat app with ActionCable streaming |
| 15 | 15_rails_secure_chat/ | Rails chat with PII guardrails |
| 18 | 18_rails_agent_job/ | Rails app with AgentJob + ActionCable streaming |
| 19 | 19_trust_pipeline/ | Generator-Verifier pattern with citation tracking, self-review loop and confidence gate |
Development
After checking out the repo, install dependencies:
bin/setup
Run the unit test suite:
bundle exec rspec spec/phronomy
Run the integration tests (requires a running LLM endpoint):
bundle exec rspec spec/integration --tag integration
Launch an interactive console:
bin/console
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/Raizo-TCS/phronomy.
License
The gem is available as open source under the terms of the MIT License.