Phronomy
Phronomy is a Ruby framework for building AI agents, inspired by established open-source agent frameworks.
It provides composable building blocks — Graphs, Agents, and Memory — all powered by RubyLLM for LLM abstraction.
Features
- Graph — Build stateful, branching agent workflows with interrupt/resume support
- Graph Parallel Node — Execute independent graph branches concurrently with configurable merge and error policies
- Agent — ReAct-style tool-calling agents with memory and guardrails
- Before-Completion Hook — Three-tier (global / class / instance) LLM parameter injection before each chat request
- Memory — Window, summary, ActiveRecord-backed, semantic, and composite conversation memory
- Memory Compression — Automatic summarisation and tool-output pruning to stay within token limits
- Context Management — Token budget calculation, estimation, and pruning for any model
- Knowledge/RAG — Static, entity, and vector-backed retrieval sources with pluggable loaders, splitters, and vector stores
- Multi-agent — Agent-as-Tool pattern (sub-agents wrapped as `Tool::Base`) and hub-and-spoke handoff routing via `Agent::Runner`
- TrustPipeline — Citation tracking, self-review loop, and confidence gate for trustworthy outputs
- Guardrails — Validate inputs and outputs before/after LLM calls; built-in PII and prompt-injection detectors
- Output Parser — JSON and Struct-mapped parsers for structured LLM responses
- Eval Framework — Dataset-driven evaluation with ExactMatch, Includes, and LLM-as-a-Judge scorers
- Tracing — Pluggable span-based observability (NullTracer, LangfuseTracer, OpenTelemetryTracer)
- StateStore — Persist graph state to memory, ActiveRecord, or Redis (with optional AES-256-GCM encryption)
- MCP Tool — Integrate Model Context Protocol (MCP) servers as native tools
- Rails integration — `AgentJob`, `acts_as_phronomy_message` mixin, and Rails generators
Installation
Add to your Gemfile:
```ruby
gem "phronomy"
```
Then run:
```shell
bundle install
```
For Rails apps, run the install generator after bundling:
```shell
rails generate phronomy:install
```
This creates an initializer and the required database migrations.
Quick Start
Agent — ReAct tool-calling agent
```ruby
class WebSearch < Phronomy::Tool::Base
  description "Search the web"
  param :query, type: :string, desc: "Search query"

  def execute(query:)
    # ... call a search API
  end
end

class ResearchAgent < Phronomy::Agent::Base
  model "gpt-4o"
  instructions "You are a research assistant. Use tools to answer questions."
  tools WebSearch
  max_iterations 5
end

result = ResearchAgent.new.invoke("What happened in AI research this week?")
puts result[:output]
```
Graph — Stateful workflow with interrupt/resume
```ruby
class ReviewState
  include Phronomy::Graph::State

  field :draft, type: :replace
  field :feedback, type: :replace
  field :approved, type: :replace, default: false
end

graph = Phronomy::Graph::StateGraph.new(ReviewState)
graph.add_node(:write)    { |s| { draft: Writer.call(s) } }
graph.add_node(:review)   { |s| { feedback: Reviewer.call(s.draft) } }
graph.add_node(:finalize) { |s| { approved: true } }
graph.add_edge(:write, :review)
graph.add_edge(:review, :finalize)
graph.set_entry_point(:write)

# Register an interrupt callback before the :finalize node
graph.interrupt_before(:finalize) do |state|
  puts "Draft ready for human review: #{state.draft}"
end

compiled = graph.compile
compiled.invoke({ draft: "" }, config: { thread_id: "doc-1" })
```
Multi-Agent — Agent-as-Tool pattern
Wrap sub-agents as `Tool::Base` subclasses so the orchestrator LLM can call them on demand.
```ruby
class ResearchTool < Phronomy::Tool::Base
  description "Research a topic and return key findings as bullet points."
  param :topic, type: :string, desc: "The topic to research"

  def execute(topic:)
    ResearchAgent.new.invoke(topic)[:output]
  end
end

class WriteTool < Phronomy::Tool::Base
  description "Write a technical blog post given research notes and a writing brief."
  param :instructions, type: :string, desc: "Writing brief including research notes"

  def execute(instructions:)
    WriterAgent.new.invoke(instructions)[:output]
  end
end

class OrchestratorAgent < Phronomy::Agent::Base
  model "gpt-4o"
  instructions "Use the research tool first, then the write tool to produce a blog post."
  tools ResearchTool, WriteTool
end

result = OrchestratorAgent.new.invoke("Write a blog post about Ruby 3.4 features")
puts result[:output]
```
Guardrails — Input/output validation
```ruby
class NoSensitiveDataGuardrail < Phronomy::Guardrail::InputGuardrail
  def check(input)
    fail!("Credit card numbers are not allowed") if input.match?(/\d{4}-\d{4}-\d{4}-\d{4}/)
  end
end

agent = ResearchAgent.new
agent.add_input_guardrail(NoSensitiveDataGuardrail.new)
```
Built-in Guardrails — PII and prompt injection detection
```ruby
# Detect credit cards, SSNs, emails, and phone numbers automatically
agent.add_input_guardrail(Phronomy::Guardrail::Builtin::PIIPatternDetector.new)

# Block common prompt-injection attempts
agent.add_input_guardrail(Phronomy::Guardrail::Builtin::PromptInjectionDetector.new)
```
Knowledge/RAG — Context injection and vector retrieval
```ruby
# Static knowledge (policy files, reference docs)
policy = Phronomy::KnowledgeSource::StaticKnowledge.new(
  File.read("policy.md"),
  type: :policy,
  source: "policy.md" # exposed to the LLM for citation
)

# RAG retrieval from a vector store
store = Phronomy::VectorStore::InMemory.new
embeddings = Phronomy::Embeddings::RubyLLMEmbeddings.new(model: "text-embedding-3-small")
rag = Phronomy::KnowledgeSource::RAGKnowledge.new(store: store, embeddings: embeddings, k: 5)

# Inject at invocation time
result = MyAgent.new.invoke("What is the refund policy?",
  config: { knowledge_sources: [policy, rag] })
```
Load and split documents with built-in loaders:
```ruby
chunks = Phronomy::Loader::MarkdownLoader.new.load("docs/guide.md")
  .then { |docs| Phronomy::Splitter::RecursiveSplitter.new(chunk_size: 512).split(docs) }
```
Multi-Agent Handoff — Hub-and-spoke routing
```ruby
triage  = TriageAgent.new
billing = BillingAgent.new
support = SupportAgent.new

runner = Phronomy::Agent::Runner.new(
  agents: [triage, billing, support],
  routes: { triage => [billing, support] }
)

result = runner.invoke("I need help with my invoice")
puts result[:output]      # final answer
puts result[:agent].class # => BillingAgent
```
Before-Completion Hook — Dynamic LLM parameter injection
```ruby
# Class-level: applies to all instances
class MyAgent < Phronomy::Agent::Base
  model "gpt-4o"
  before_completion ->(ctx) { { temperature: ctx.config[:precise] ? 0.0 : 0.7 } }
end

# Instance-level: overrides the class hook for this agent only
agent = MyAgent.new
agent.before_completion = ->(ctx) { { max_tokens: 512 } }

# Global: applies to every agent across the app
Phronomy.configure do |c|
  c.before_completion = ->(ctx) { { temperature: 0.3 } }
end
```
Hooks are called in order — global → class → instance — and deep-merged.
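For intuition, the precedence can be sketched in plain Ruby. This is a hypothetical illustration of the ordering only (the `hooks`, `ctx`, and `params` names are ours, and a shallow `merge` stands in for Phronomy's deep merge):

```ruby
# Hooks fire global -> class -> instance; later results win on conflicting keys.
hooks = [
  ->(ctx) { { temperature: 0.3 } },                       # global
  ->(ctx) { { temperature: ctx[:precise] ? 0.0 : 0.7 } }, # class
  ->(ctx) { { max_tokens: 512 } }                         # instance
]

ctx = { precise: true }
params = hooks.reduce({}) { |acc, hook| acc.merge(hook.call(ctx)) }
# => { temperature: 0.0, max_tokens: 512 }
```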
TrustPipeline — Trustworthy outputs with citations and review
```ruby
pipeline = Phronomy::TrustPipeline.new(
  draft_agent: PolicyDraftAgent,
  review_agent: PolicyReviewAgent,
  confidence_threshold: 0.7,
  max_iterations: 3
)

result = pipeline.invoke("What is the refund policy?")
puts result.output     # final answer
puts result.trusted?   # true when confidence >= 0.7
puts result.confidence # Float 0.0–1.0

result.citations.each do |c|
  puts "#{c[:source]}: #{c[:excerpt]}"
end
```
Graph Parallel Node — Concurrent branches
```ruby
class MyState
  include Phronomy::Graph::State

  field :summary, type: :replace
  field :tags, type: :append, default: -> { [] }
end

graph = Phronomy::Graph::StateGraph.new(MyState)
graph.add_parallel_node(
  :enrich,
  ->(s) { { summary: Summarizer.call(s) } },
  ->(s) { { tags: Tagger.call(s) } },
  timeout: 10,
  on_error: :best_effort
)
graph.set_entry_point(:enrich)
graph.add_edge(:enrich, Phronomy::Graph::StateGraph::FINISH)

app = graph.compile
app.invoke({}, config: { thread_id: "t1" })
```
Output Parser — Structured LLM responses
````ruby
# Extract JSON from LLM output (handles Markdown code fences automatically)
parser = Phronomy::OutputParser::JsonParser.new
data = parser.parse(%(```json\n{"name":"Alice","score":0.9}\n```))
# => { name: "Alice", score: 0.9 }

# Map JSON directly to a Struct
PersonSchema = Struct.new(:name, :age, keyword_init: true)
parser = Phronomy::OutputParser::StructuredParser.new(PersonSchema)
person = parser.parse('{"name":"Alice","age":30}')
# => #<struct PersonSchema name="Alice", age=30>
````
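For intuition, fence stripping can be approximated with a small regex. This is a hypothetical sketch of the behavior, not `JsonParser`'s actual implementation, and the `raw`/`stripped` names are ours:

````ruby
require "json"

# Strip an optional ```json ... ``` Markdown fence, then parse (sketch only).
raw = "```json\n{\"name\":\"Alice\",\"score\":0.9}\n```"
stripped = raw.sub(/\A```(?:json)?\s*\n/, "").sub(/\n```\s*\z/, "")
data = JSON.parse(stripped, symbolize_names: true)
# => { name: "Alice", score: 0.9 }
````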
Eval Framework — Dataset-driven quality evaluation
```ruby
dataset = Phronomy::Eval::Dataset.from_array([
  { input: "Capital of France?", expected: "Paris" },
  { input: "Capital of Japan?", expected: "Tokyo" }
])

agent = MyGeographyAgent.new
runner = Phronomy::Eval::Runner.new(
  scorer: Phronomy::Eval::Scorer::LlmJudge.new(model: "gpt-4o-mini")
)

results = runner.run(dataset, ->(q) { agent.invoke(q) })
metrics = Phronomy::Eval::Metrics.new(results)
puts "Mean score: #{metrics.mean_score}" # Float 0.0–1.0
puts "Pass rate: #{metrics.pass_rate}"   # fraction with score >= threshold
```
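The two metrics are simple aggregates over per-example scores. A plain-Ruby sketch of the arithmetic, using made-up scores (this is not Phronomy's implementation):

```ruby
scores = [1.0, 0.8, 0.0, 0.9] # hypothetical per-example scores
threshold = 0.7

mean_score = scores.sum / scores.size.to_f                         # => 0.675
pass_rate  = scores.count { |s| s >= threshold }.fdiv(scores.size) # => 0.75
```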
Tracing — Custom observability
```ruby
Phronomy.configure do |c|
  c.tracer = MyCustomTracer.new # any Phronomy::Tracing::Base subclass
end
```
MCP Tool — External tool servers
```ruby
search_tool = Phronomy::Tool::McpTool.from_server(
  "stdio://./mcp-server",
  tool_name: "web_search"
)
```
Rails — ActiveRecord persistence
```ruby
# In your migration (generated by `rails generate phronomy:install`):
#   create_table :phronomy_messages ...
#   create_table :phronomy_states ...

class PhronomyMessage < ApplicationRecord
  acts_as_phronomy_message # enables Phronomy memory helpers such as .phronomy_memory
end

# config/initializers/phronomy.rb
Phronomy.configure do |c|
  c.default_state_store = Phronomy::StateStore::ActiveRecord.new(
    model_class: PhronomyState # AR model backed by the phronomy_states table
  )
end

# Use in a controller:
agent = ResearchAgent.new
result = agent.invoke(
  params[:message],
  config: {
    thread_id: "user_#{current_user.id}",
    memory: PhronomyMessage.phronomy_memory
  }
)
```
Configuration
```ruby
Phronomy.configure do |c|
  c.default_model       = "gpt-4o-mini"
  c.recursion_limit     = 25
  c.tracer              = Phronomy::Tracing::NullTracer.new
  c.default_state_store = Phronomy::StateStore::InMemory.new # optional
  c.memory_compression  = []  # optional; Array of compressors
  c.before_completion   = nil # optional; global hook lambda
end
```
Context Management
Phronomy includes a context window management layer so agents automatically stay within the token limits of the underlying model.
TokenBudget
Derives the effective token budget from RubyLLM's model registry:
```ruby
budget = Phronomy::Context::TokenBudget.new(
  model: "claude-3-5-sonnet-20241022", # looks up context_window + max_output_tokens
  overhead: 500                        # extra reservation for tool definitions
)

budget.context_window        # => 200_000
budget.max_output_tokens     # => 8_192
budget.effective_input_limit # => 191_308
```
Or supply explicit values (useful for local / unregistered models):
```ruby
budget = Phronomy::Context::TokenBudget.new(
  context_window: 32_768,
  max_output_tokens: 4_096
)
```
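Judging from the figures in the registry example above, the effective input limit appears to be the context window minus the output reservation and the overhead. A sketch of that arithmetic (an inference from the README's numbers, not the library's code):

```ruby
context_window    = 200_000
max_output_tokens = 8_192
overhead          = 500

effective_input_limit = context_window - max_output_tokens - overhead
# => 191_308
```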
Budget-aware Memory
Pass a budget to `load_messages` and only the newest messages that fit are returned:

```ruby
memory = Phronomy::Memory::WindowMemory.new
messages = memory.load_messages(thread_id: "t1", token_budget: budget)
```
`ActiveRecordMemory` also accepts `pruner:` to truncate oversized tool results:

```ruby
memory = Phronomy::Memory::ActiveRecordMemory.new(
  model_class: PhronomyMessage,
  pruner: Phronomy::Memory::Compression::ToolOutputPruner.new(max_chars: 4000)
)
```
Agent DSL extensions
```ruby
class MyAgent < Phronomy::Agent::Base
  model "gpt-4o"
  max_output_tokens 4096 # override max_output_tokens from the registry
  context_overhead 600   # extra reservation for system prompt + tools
end
```
`Agent::Base#invoke` builds a `TokenBudget` automatically and passes it to `memory.load_messages`. When the model is not in the registry, the budget is silently skipped.
SemanticMemory
Embedding-based retrieval of relevant past messages:
```ruby
semantic = Phronomy::Memory::SemanticMemory.new(
  embedding_model: "text-embedding-3-small",
  k: 10
)

messages = semantic.load_messages(thread_id: "t1", query: "user's current question")
```
Composite retrieval
Merge multiple retrieval strategies within a shared ConversationManager:
```ruby
composite_retrieval = Phronomy::Memory::Retrieval::Composite.new(
  sources: [
    { retrieval: Phronomy::Memory::Retrieval::Recent.new(k: 5), weight: 0.4 },
    { retrieval: Phronomy::Memory::Retrieval::Semantic.new(k: 10), weight: 0.6 }
  ]
)

manager = Phronomy::Memory::ConversationManager.new(
  storage: Phronomy::Memory::Storage::InMemory.new,
  retrieval: composite_retrieval
)
```
Memory Compression
Automatically shrink conversation history before it reaches the LLM.
# Truncate oversized tool outputs (no LLM call, cheap)
pruner = Phronomy::Memory::Compression::ToolOutputPruner.new(max_chars: 4000)
# Summarise old messages when history exceeds max_tokens (calls summarizer_model)
summary = Phronomy::Memory::Compression::Summary.new(
max_tokens: 4000,
keep: 10, # always preserve the N most recent messages
summarizer_model: "gpt-4o-mini"
)
Phronomy.configure do |c|
c.memory_compression = [pruner, summary] # applied in order: pruner first, then summary
end
Examples
Runnable examples covering all major features are available in the phronomy-examples repository.
Each example lives in its own numbered directory and can be run with:
```shell
bundle exec ruby NN_example_name/run.rb
```
| # | Directory | What it demonstrates |
|---|---|---|
| 01 | `01_basic_chain/` | PromptTemplate → LLMChain pipeline |
| 02 | `02_react_agent/` | ReAct tool-calling agent |
| 03 | `03_state_graph/` | Stateful graph with interrupt/resume |
| 04 | `04_interrupt_resume/` | Human-in-the-loop interrupt and resume |
| 05 | `05_multi_agent/` | Multi-agent coordination via Agent-as-Tool |
| 06 | `06_guardrails/` | Input/output guardrails |
| 07 | `07_tracing/` | Custom observability with Langfuse tracer |
| 08 | `08_mcp_tool/` | MCP tool integration |
| 09 | `09_rails_chat/` | Rails chat app with ActionCable streaming |
| 10 | `10_context_management/` | Token budget and context pruning |
| 11 | `11_agent_streaming/` | Streaming agent responses |
| 12 | `12_prompt_template/` | Advanced prompt templates |
| 13 | `13_mcp_http_tool/` | HTTP-based MCP tool server |
| 14 | `14_code_review/` | Automated code review agent |
| 15 | `15_rails_secure_chat/` | Rails chat with PII guardrails and secure memory |
| 16 | `16_before_completion_hook/` | Global/class/instance before_completion hooks |
| 17 | `17_multi_agent_handoff/` | Hub-and-spoke agent routing via Runner |
| 18 | `18_rails_agent_job/` | Rails app with AgentJob + ActionCable streaming |
| 19 | `19_trust_pipeline/` | Trustworthy output via Citation Tracking + Self-Review + Confidence Gate |
Development
After checking out the repo, install dependencies:
```shell
bin/setup
```
Run the unit test suite:
```shell
bundle exec rspec spec/phronomy
```
Run the integration tests (requires a running LLM endpoint):
```shell
bundle exec rspec spec/integration --tag integration
```
Launch an interactive console:
```shell
bin/console
```
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/Raizo-TCS/phronomy.
License
The gem is available as open source under the terms of the MIT License.