Phronomy
Phronomy is a Ruby framework for building AI agents, inspired by established open-source agent frameworks.
It provides composable building blocks — Graphs, Agents, and Memory — all powered by RubyLLM for LLM abstraction.
Features
- Graph — Build stateful, branching agent workflows with interrupt/resume support
- Graph Parallel Node — Execute independent graph branches concurrently with configurable merge and error policies
- Agent — ReAct-style tool-calling agents with memory and guardrails
- Before-Completion Hook — Three-tier (global / class / instance) LLM parameter injection before each chat request
- Memory — Window, summary, ActiveRecord-backed, semantic, and composite conversation memory
- Memory Compression — Automatic summarisation and tool-output pruning to stay within token limits
- Context Management — Token budget calculation, estimation, and pruning for any model
- Knowledge/RAG — Static, entity, and vector-backed retrieval sources with pluggable loaders, splitters, and vector stores
- Multi-agent — Agent-as-Tool pattern (sub-agents wrapped as `Tool::Base`) and hub-and-spoke handoff routing via `Agent::Runner`
- TrustPipeline — Citation tracking, self-review loop, and confidence gate for trustworthy outputs
- Guardrails — Validate inputs and outputs before/after LLM calls; built-in PII and prompt-injection detectors
- Output Parser — JSON and Struct-mapped parsers for structured LLM responses
- Eval Framework — Dataset-driven evaluation with ExactMatch, Includes, and LLM-as-a-Judge scorers
- Tracing — Pluggable span-based observability (NullTracer, LangfuseTracer, OpenTelemetryTracer)
- StateStore — Persist graph state to memory, ActiveRecord, or Redis (with optional AES-256-GCM encryption)
- MCP Tool — Integrate Model Context Protocol (MCP) servers as native tools
- Rails integration — `AgentJob`, `acts_as_phronomy_message` mixin, and Rails generators
Installation
Add to your Gemfile:
```ruby
gem "phronomy"
```
Then run:
```shell
bundle install
```
For Rails apps, run the install generator after bundling:
```shell
rails generate phronomy:install
```
This creates an initializer and the required database migrations.
Quick Start
Agent — ReAct tool-calling agent
```ruby
class WebSearch < Phronomy::Tool::Base
  description "Search the web"
  param :query, type: :string, desc: "Search query"

  def execute(query:)
    # ... call a search API
  end
end

class ResearchAgent < Phronomy::Agent::Base
  model "gpt-4o"
  instructions "You are a research assistant. Use tools to answer questions."
  tools WebSearch
  max_iterations 5
end

result = ResearchAgent.new.invoke("What happened in AI research this week?")
puts result[:output]
```
Graph — Stateful workflow with interrupt/resume
```ruby
class ReviewState
  include Phronomy::Graph::State

  field :draft, type: :replace
  field :feedback, type: :replace
  field :approved, type: :replace, default: false
end

graph = Phronomy::Graph::StateGraph.new(ReviewState)
graph.add_node(:write)    { |s| { draft: Writer.call(s) } }
graph.add_node(:review)   { |s| { feedback: Reviewer.call(s.draft) } }
graph.add_node(:finalize) { |s| { approved: true } }
graph.add_edge(:write, :review)
graph.add_edge(:review, :finalize)
graph.set_entry_point(:write)

# Register an interrupt callback before the :finalize node
graph.interrupt_before(:finalize) do |state|
  puts "Draft ready for human review: #{state.draft}"
end

compiled = graph.compile
compiled.invoke({ draft: "" }, config: { thread_id: "doc-1" })
```
Multi-Agent — Agent-as-Tool pattern
Wrap sub-agents as `Tool::Base` subclasses so the orchestrator LLM can call them on demand.
```ruby
class ResearchTool < Phronomy::Tool::Base
  description "Research a topic and return key findings as bullet points."
  param :topic, type: :string, desc: "The topic to research"

  def execute(topic:)
    ResearchAgent.new.invoke(topic)[:output]
  end
end

class WriteTool < Phronomy::Tool::Base
  description "Write a technical blog post given research notes and a writing brief."
  param :instructions, type: :string, desc: "Writing brief including research notes"

  def execute(instructions:)
    WriterAgent.new.invoke(instructions)[:output]
  end
end

class OrchestratorAgent < Phronomy::Agent::Base
  model "gpt-4o"
  instructions "Use the research tool first, then the write tool to produce a blog post."
  tools ResearchTool, WriteTool
end

result = OrchestratorAgent.new.invoke("Write a blog post about Ruby 3.4 features")
puts result[:output]
```
Guardrails — Input/output validation
```ruby
class NoSensitiveDataGuardrail < Phronomy::Guardrail::InputGuardrail
  def check(input)
    fail!("Credit card numbers are not allowed") if input.match?(/\d{4}-\d{4}-\d{4}-\d{4}/)
  end
end

agent = ResearchAgent.new
agent.add_input_guardrail(NoSensitiveDataGuardrail.new)
```
Built-in Guardrails — PII and prompt injection detection
```ruby
# Detect credit cards, SSNs, emails, and phone numbers automatically
agent.add_input_guardrail(Phronomy::Guardrail::Builtin::PIIPatternDetector.new)

# Block common prompt-injection attempts
agent.add_input_guardrail(Phronomy::Guardrail::Builtin::PromptInjectionDetector.new)
```
Knowledge/RAG — Context injection and vector retrieval
```ruby
# Static knowledge (policy files, reference docs)
policy = Phronomy::KnowledgeSource::StaticKnowledge.new(
  File.read("policy.md"),
  type: :policy,
  source: "policy.md" # exposed to the LLM for citation
)

# RAG retrieval from a vector store
store = Phronomy::VectorStore::InMemory.new
embeddings = Phronomy::Embeddings::RubyLLMEmbeddings.new(model: "text-embedding-3-small")
rag = Phronomy::KnowledgeSource::RAGKnowledge.new(store: store, embeddings: embeddings, k: 5)

# Inject at invocation time
result = MyAgent.new.invoke("What is the refund policy?",
  config: { knowledge_sources: [policy, rag] })
```
Load and split documents with built-in loaders:
```ruby
chunks = Phronomy::Loader::MarkdownLoader.new.load("docs/guide.md")
  .then { |docs| Phronomy::Splitter::RecursiveSplitter.new(chunk_size: 512).split(docs) }
```
Multi-Agent Handoff — Hub-and-spoke routing
```ruby
triage  = TriageAgent.new
billing = BillingAgent.new
support = SupportAgent.new

runner = Phronomy::Agent::Runner.new(
  agents: [triage, billing, support],
  routes: { triage => [billing, support] }
)

result = runner.invoke("I need help with my invoice")
puts result[:output]      # final answer
puts result[:agent].class # => BillingAgent
```
Before-Completion Hook — Dynamic LLM parameter injection
```ruby
# Class-level: applies to all instances
class MyAgent < Phronomy::Agent::Base
  model "gpt-4o"
  before_completion ->(ctx) { { temperature: ctx.config[:precise] ? 0.0 : 0.7 } }
end

# Instance-level: overrides the class hook for this agent only
agent = MyAgent.new
agent.before_completion = ->(ctx) { { max_tokens: 512 } }

# Global: applies to every agent across the app
Phronomy.configure do |c|
  c.before_completion = ->(ctx) { { temperature: 0.3 } }
end
```
Hooks are called in order — global → class → instance — and deep-merged.
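For intuition, the precedence can be sketched in plain Ruby. This is a hypothetical illustration of the ordering only (the `hooks`, `ctx`, and `params` names are ours, and a shallow `merge` stands in for Phronomy's deep merge):

```ruby
# Hooks fire global -> class -> instance; later results win on conflicting keys.
hooks = [
  ->(ctx) { { temperature: 0.3 } },                       # global
  ->(ctx) { { temperature: ctx[:precise] ? 0.0 : 0.7 } }, # class
  ->(ctx) { { max_tokens: 512 } }                         # instance
]

ctx = { precise: true }
params = hooks.reduce({}) { |acc, hook| acc.merge(hook.call(ctx)) }
# => { temperature: 0.0, max_tokens: 512 }
```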
TrustPipeline — Trustworthy outputs with citations and review
```ruby
pipeline = Phronomy::TrustPipeline.new(
  draft_agent: PolicyDraftAgent,
  review_agent: PolicyReviewAgent,
  confidence_threshold: 0.7,
  max_iterations: 3
)

result = pipeline.invoke("What is the refund policy?")
puts result.output     # final answer
puts result.trusted?   # true when confidence >= 0.7
puts result.confidence # Float 0.0–1.0

result.citations.each do |c|
  puts "#{c[:source]}: #{c[:excerpt]}"
end
```
Graph Parallel Node — Concurrent branches
```ruby
class MyState
  include Phronomy::Graph::State

  field :summary, type: :replace
  field :tags, type: :append, default: -> { [] }
end

graph = Phronomy::Graph::StateGraph.new(MyState)
graph.add_parallel_node(
  :enrich,
  ->(s) { { summary: Summarizer.call(s) } },
  ->(s) { { tags: Tagger.call(s) } },
  timeout: 10,
  on_error: :best_effort
)
graph.set_entry_point(:enrich)
graph.add_edge(:enrich, Phronomy::Graph::StateGraph::FINISH)

app = graph.compile
app.invoke({}, config: { thread_id: "t1" })
```
Output Parser — Structured LLM responses
````ruby
# Extract JSON from LLM output (handles Markdown code fences automatically)
parser = Phronomy::OutputParser::JsonParser.new
data = parser.parse(%(```json\n{"name":"Alice","score":0.9}\n```))
# => { name: "Alice", score: 0.9 }

# Map JSON directly to a Struct
PersonSchema = Struct.new(:name, :age, keyword_init: true)
parser = Phronomy::OutputParser::StructuredParser.new(PersonSchema)
person = parser.parse('{"name":"Alice","age":30}')
# => #<struct PersonSchema name="Alice", age=30>
````
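For intuition, fence stripping can be approximated with a small regex. This is a hypothetical sketch of the behavior, not `JsonParser`'s actual implementation, and the `raw`/`stripped` names are ours:

````ruby
require "json"

# Strip an optional ```json ... ``` Markdown fence, then parse (sketch only).
raw = "```json\n{\"name\":\"Alice\",\"score\":0.9}\n```"
stripped = raw.sub(/\A```(?:json)?\s*\n/, "").sub(/\n```\s*\z/, "")
data = JSON.parse(stripped, symbolize_names: true)
# => { name: "Alice", score: 0.9 }
````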
Eval Framework — Dataset-driven quality evaluation
```ruby
dataset = Phronomy::Eval::Dataset.from_array([
  { input: "Capital of France?", expected: "Paris" },
  { input: "Capital of Japan?", expected: "Tokyo" }
])

agent = MyGeographyAgent.new
runner = Phronomy::Eval::Runner.new(
  scorer: Phronomy::Eval::Scorer::LlmJudge.new(model: "gpt-4o-mini")
)

results = runner.run(dataset, ->(q) { agent.invoke(q) })
metrics = Phronomy::Eval::Metrics.new(results)
puts "Mean score: #{metrics.mean_score}" # Float 0.0–1.0
puts "Pass rate: #{metrics.pass_rate}"   # fraction with score >= threshold
```
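The two metrics are simple aggregates over per-example scores. A plain-Ruby sketch of the arithmetic, using made-up scores (this is not Phronomy's implementation):

```ruby
scores = [1.0, 0.8, 0.0, 0.9] # hypothetical per-example scores
threshold = 0.7

mean_score = scores.sum / scores.size.to_f                         # => 0.675
pass_rate  = scores.count { |s| s >= threshold }.fdiv(scores.size) # => 0.75
```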
Tracing — Custom observability
```ruby
Phronomy.configure do |c|
  c.tracer = MyCustomTracer.new # any Phronomy::Tracing::Base subclass
end
```
MCP Tool — External tool servers
```ruby
search_tool = Phronomy::Tool::McpTool.from_server(
  "stdio://./mcp-server",
  tool_name: "web_search"
)
```
Rails — ActiveRecord persistence
```ruby
# In your migration (generated by `rails generate phronomy:install`):
#   create_table :phronomy_messages ...
#   create_table :phronomy_states ...

class PhronomyMessage < ApplicationRecord
  acts_as_phronomy_message # enables Phronomy memory helpers such as .phronomy_memory
end

# config/initializers/phronomy.rb
Phronomy.configure do |c|
  c.default_state_store = Phronomy::StateStore::ActiveRecord.new(
    model_class: PhronomyState # AR model backed by the phronomy_states table
  )
end

# Use in a controller:
agent = ResearchAgent.new
result = agent.invoke(
  params[:message],
  config: {
    thread_id: "user_#{current_user.id}",
    memory: PhronomyMessage.phronomy_memory
  }
)
```
Configuration
```ruby
Phronomy.configure do |c|
  c.default_model       = "gpt-4o-mini"
  c.recursion_limit     = 25
  c.tracer              = Phronomy::Tracing::NullTracer.new
  c.default_state_store = Phronomy::StateStore::InMemory.new # optional
  c.memory_compression  = []  # optional; Array of compressors
  c.before_completion   = nil # optional; global hook lambda
end
```
Context Management
Phronomy includes a context window management layer so agents automatically stay within the token limits of the underlying model.
TokenBudget
Derives the effective token budget from RubyLLM's model registry:
```ruby
budget = Phronomy::Context::TokenBudget.new(
  model: "claude-3-5-sonnet-20241022", # looks up context_window + max_output_tokens
  overhead: 500                        # extra reservation for tool definitions
)

budget.context_window        # => 200_000
budget.max_output_tokens     # => 8_192
budget.effective_input_limit # => 191_308
```
Or supply explicit values (useful for local / unregistered models):
```ruby
budget = Phronomy::Context::TokenBudget.new(
  context_window: 32_768,
  max_output_tokens: 4_096
)
```
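Judging from the figures in the registry example above, the effective input limit appears to be the context window minus the output reservation and the overhead. A sketch of that arithmetic (an inference from the README's numbers, not the library's code):

```ruby
context_window    = 200_000
max_output_tokens = 8_192
overhead          = 500

effective_input_limit = context_window - max_output_tokens - overhead
# => 191_308
```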
Budget-aware Memory
Pass a budget to `load_messages` and only the newest messages that fit are returned:

```ruby
memory = Phronomy::Memory::WindowMemory.new
messages = memory.load_messages(thread_id: "t1", token_budget: budget)
```
`ActiveRecordMemory` also accepts `pruner:` to truncate oversized tool results:

```ruby
memory = Phronomy::Memory::ActiveRecordMemory.new(
  model_class: PhronomyMessage,
  pruner: Phronomy::Memory::Compression::ToolOutputPruner.new(max_chars: 4000)
)
```
Agent DSL extensions
```ruby
class MyAgent < Phronomy::Agent::Base
  model "gpt-4o"
  max_output_tokens 4096 # override max_output_tokens from the registry
  context_overhead 600   # extra reservation for system prompt + tools
end
```
`Agent::Base#invoke` builds a `TokenBudget` automatically and passes it to `memory.load_messages`. When the model is not in the registry, the budget is silently skipped.
SemanticMemory
Embedding-based retrieval of relevant past messages:
```ruby
semantic = Phronomy::Memory::SemanticMemory.new(
  embedding_model: "text-embedding-3-small",
  k: 10
)

messages = semantic.load_messages(thread_id: "t1", query: "user's current question")
```
Composite retrieval
Merge multiple retrieval strategies within a shared ConversationManager:
```ruby
composite_retrieval = Phronomy::Memory::Retrieval::Composite.new(
  sources: [
    { retrieval: Phronomy::Memory::Retrieval::Recent.new(k: 5), weight: 0.4 },
    { retrieval: Phronomy::Memory::Retrieval::Semantic.new(k: 10), weight: 0.6 }
  ]
)

manager = Phronomy::Memory::ConversationManager.new(
  storage: Phronomy::Memory::Storage::InMemory.new,
  retrieval: composite_retrieval
)
```
Memory Compression
Automatically shrink conversation history before it reaches the LLM.
# Truncate oversized tool outputs (no LLM call, cheap)
pruner = Phronomy::Memory::Compression::ToolOutputPruner.new(max_chars: 4000)
# Summarise old messages when history exceeds max_tokens (calls summarizer_model)
summary = Phronomy::Memory::Compression::Summary.new(
max_tokens: 4000,
keep: 10, # always preserve the N most recent messages
summarizer_model: "gpt-4o-mini"
)
Phronomy.configure do |c|
c.memory_compression = [pruner, summary] # applied in order: pruner first, then summary
end
Examples
Runnable examples covering all major features are available in the phronomy-examples repository.
Each example lives in its own numbered directory and can be run with:
```shell
bundle exec ruby NN_example_name/run.rb
```
| # | Directory | What it demonstrates |
|---|---|---|
| 01 | `01_basic_chain/` | PromptTemplate → LLMChain pipeline |
| 02 | `02_react_agent/` | ReAct tool-calling agent |
| 03 | `03_state_graph/` | Stateful graph with interrupt/resume |
| 04 | `04_interrupt_resume/` | Human-in-the-loop interrupt and resume |
| 05 | `05_multi_agent/` | Multi-agent coordination via Agent-as-Tool |
| 06 | `06_guardrails/` | Input/output guardrails |
| 07 | `07_tracing/` | Custom observability with Langfuse tracer |
| 08 | `08_mcp_tool/` | MCP tool integration |
| 09 | `09_rails_chat/` | Rails chat app with ActionCable streaming |
| 10 | `10_context_management/` | Token budget and context pruning |
| 11 | `11_agent_streaming/` | Streaming agent responses |
| 12 | `12_prompt_template/` | Advanced prompt templates |
| 13 | `13_mcp_http_tool/` | HTTP-based MCP tool server |
| 14 | `14_code_review/` | Automated code review agent |
| 15 | `15_rails_secure_chat/` | Rails chat with PII guardrails and secure memory |
| 16 | `16_before_completion_hook/` | Global/class/instance before_completion hooks |
| 17 | `17_multi_agent_handoff/` | Hub-and-spoke agent routing via Runner |
| 18 | `18_rails_agent_job/` | Rails app with AgentJob + ActionCable streaming |
| 19 | `19_trust_pipeline/` | Trustworthy output via Citation Tracking + Self-Review + Confidence Gate |
Development
After checking out the repo, install dependencies:
```shell
bin/setup
```
Run the unit test suite:
```shell
bundle exec rspec spec/phronomy
```
Run the integration tests (requires a running LLM endpoint):
```shell
bundle exec rspec spec/integration --tag integration
```
Launch an interactive console:
```shell
bin/console
```
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/Raizo-TCS/phronomy.
License
The gem is available as open source under the terms of the MIT License.