PromptCanary


Canary deployments for LLM prompts in Ruby. Declare prompts as versioned Ruby classes, route traffic by percentage or predicate, record telemetry, and automatically roll back misbehaving versions when error rate or latency exceeds a configured threshold.

Design philosophy

PromptCanary's value is in routing, telemetry, and rollback — not in where your prompt text lives. You can declare versions directly in Ruby classes, load them from database records at boot, or both. Either way the gem handles traffic splitting, call recording, and automatic demotion identically.

Installation

bundle add prompt_canary

Or add to your Gemfile:

gem "prompt_canary"

Rails Setup

Run the install generator:

rails generate prompt_canary:install

This creates a single migration that sets up all four tables (prompt_canary_calls, prompt_canary_rollout_overrides, prompt_canary_primary_overrides, prompt_canary_events) and mounts the engine in config/routes.rb. Then run:

rails db:migrate

Add the adapter gem to your Gemfile — PromptCanary does not pull it in automatically:

gem "anthropic"  # required when using adapter: :anthropic

Configure in config/initializers/prompt_canary.rb:

PromptCanary.configure do |c|
  c.adapter = :anthropic      # required
  c.api_key = ENV["ANTHROPIC_API_KEY"]
  c.storage = :active_record  # recommended for Rails
end

Configuration

PromptCanary.configure do |c|
  c.adapter = :anthropic  # required
  c.storage = :sqlite     # :sqlite, :active_record, or :memory (tests)
end

Both adapter and storage are required. A ConfigurationError is raised at configuration time if either is missing or unknown, rather than at the first call.
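To illustrate what failing at configuration time means, here is a minimal sketch of eager validation. It is a toy model, not PromptCanary's source: ToyConfig, ToyConfigurationError, and toy_configure are hypothetical names, and the lists of known adapters and storages are assumed from the values this README mentions.

```ruby
# Toy model of fail-fast configuration; not PromptCanary's source.
class ToyConfigurationError < StandardError; end

class ToyConfig
  # Assumed option lists, taken from the values this README mentions.
  KNOWN_ADAPTERS = %i[anthropic].freeze
  KNOWN_STORAGES = %i[sqlite active_record memory].freeze

  attr_accessor :adapter, :storage

  # Raises if a setting is missing or unknown; returns self otherwise.
  def validate!
    check(:adapter, adapter, KNOWN_ADAPTERS)
    check(:storage, storage, KNOWN_STORAGES)
    self
  end

  private

  def check(name, value, known)
    raise ToyConfigurationError, "#{name} is required" if value.nil?
    return if known.include?(value)

    raise ToyConfigurationError, "unknown #{name}: #{value.inspect}"
  end
end

# Validates as soon as the block returns, before any prompt is called.
def toy_configure
  config = ToyConfig.new
  yield config
  config.validate!
end
```

With this shape, a typo in an initializer surfaces at boot instead of on the first production request.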

Use :active_record in Rails apps, :sqlite for standalone scripts, and :memory in tests.

Defining a Prompt

Include PromptCanary::Promptable in any class and declare versions using the DSL:

class InvoiceExtractor
  include PromptCanary::Promptable

  version "v1" do
    model  "claude-opus-4-7"
    system "Extract structured data from this invoice."
  end
end

The first declared version is automatically treated as primary — no flag needed. Place prompt classes in app/prompts/ — the Railtie loads them automatically on boot.

Loading Prompts from the Database

Declaring versions in the class body and loading them from database records are equally supported patterns. The DSL works the same either way: it registers Version objects regardless of where the data comes from.

To load from the DB, call the DSL in an initializer after the connection is established:

# config/initializers/prompt_canary.rb
PromptCanary.configure do |c|
  c.adapter = :anthropic
  c.storage = :active_record
end

PromptRecord.all.each do |record|
  klass = record.prompt_class.constantize
  klass.version(record.version_name) do
    model  record.model
    system record.system_prompt
  end
end

Choose whichever approach fits your team's operational needs. The gem has no opinion on where prompt text lives.

Calling a Prompt

result = InvoiceExtractor.call(user_message: "Invoice #123")

result.text          # => "Here is the extracted data..."
result.version_used  # => "v1"
result.model         # => "claude-opus-4-7"
result.latency_ms    # => 312
result.tokens        # => { input: 50, output: 120 }
result.error         # => nil (or the exception if the adapter failed)

call always returns a Result — errors are captured in result.error, not raised.
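Because errors are returned rather than raised, callers need to check result.error themselves. The sketch below shows one way to do that. StandInResult is a stand-in Struct so the example runs without the gem, and text_or_fallback is a hypothetical helper, not part of PromptCanary.

```ruby
# Stand-in for the Result described above, so the sketch runs without the gem.
StandInResult = Struct.new(:text, :version_used, :error, keyword_init: true)

# Hypothetical helper: returns the text on success, or a fallback on failure.
def text_or_fallback(result, fallback: "extraction unavailable")
  return result.text if result.error.nil?

  # Log to stderr so the failure is still visible to operators.
  warn "prompt #{result.version_used} failed: #{result.error.message}"
  fallback
end
```

The same check applies to a real Result returned by InvoiceExtractor.call, since it exposes text, version_used, and error.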

Routing Traffic

Percentage rollout

class InvoiceExtractor
  include PromptCanary::Promptable

  version "v1" do
    model  "claude-opus-4-7"
    system "Extract structured data from this invoice."
  end

  version "v2" do
    model  "claude-opus-4-7"
    system "Extract structured data. Return JSON."
    rollout percent: 10
  end
end

Pass a call_id in context for deterministic routing: the same call_id always produces the same version. Using a stable identifier such as the user ID keeps each user on one version across calls:

InvoiceExtractor.call(user_message: "Invoice #123", context: { call_id: current_user.id })
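Deterministic percentage routing is commonly built by hashing the identifier into a fixed bucket. The following is a minimal sketch of that technique, not PromptCanary's actual router: bucket_for and pick_version are hypothetical helpers, and the choice of SHA-256 with 100 buckets is an assumption.

```ruby
require "digest"

# Hypothetical helper: maps any call_id to a stable bucket in 0..99.
def bucket_for(call_id)
  Digest::SHA256.hexdigest(call_id.to_s)[0, 8].to_i(16) % 100
end

# Sends the call to the candidate when its bucket falls under the rollout
# percentage, and to the primary otherwise.
def pick_version(call_id, candidate:, primary:, percent:)
  bucket_for(call_id) < percent ? candidate : primary
end

# The same ID always lands in the same bucket, so repeated calls agree.
first  = pick_version(42, candidate: "v2", primary: "v1", percent: 10)
second = pick_version(42, candidate: "v2", primary: "v1", percent: 10)
puts first == second
```

A hash is used instead of rand so that no per-user state has to be stored. Under this scheme, raising the percentage only moves additional IDs onto the candidate; IDs already routed to it stay there.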

Predicate rollout

Route to a version based on any condition:

version "v2" do
  model  "claude-opus-4-7"
  system "Extract structured data. Return JSON."
  rollout_to { |ctx| ctx[:user]&.fetch(:beta, false) }
end

InvoiceExtractor.call(
  user_message: "Invoice #123",
  context: { user: { id: 42, beta: true } }
)
# => routes to v2 for beta users

If the predicate raises, the router falls back to the primary version — always safe.
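The fallback behavior described above can be sketched as a rescue around the predicate call. This is an illustration only: route_with_predicate is a hypothetical helper, and how the gem handles this internally is not shown in this README.

```ruby
# Hypothetical helper: evaluates a rollout predicate defensively.
# A truthy result selects the candidate; a falsy result or any
# StandardError selects the primary.
def route_with_predicate(predicate, context, candidate:, primary:)
  predicate.call(context) ? candidate : primary
rescue StandardError
  primary
end

beta_only = ->(ctx) { ctx[:user]&.fetch(:beta, false) }
strict    = ->(ctx) { ctx.fetch(:user).fetch(:beta) } # raises KeyError without :user

puts route_with_predicate(beta_only, { user: { id: 42, beta: true } }, candidate: "v2", primary: "v1")
puts route_with_predicate(strict, {}, candidate: "v2", primary: "v1")
```

Rescuing StandardError rather than Exception leaves signals and other system-level exceptions untouched.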

Canary Traffic

Traffic percentage and version status are independent controls. rollout percent: sets the initial split at declaration time. set_canary adjusts it at runtime without a deploy:

PromptCanary.set_canary(InvoiceExtractor, "v2", 20)  # send 20% to v2
PromptCanary.set_canary(InvoiceExtractor, "v2", 50)  # ramp up to 50%
PromptCanary.set_canary(InvoiceExtractor, "v2", 80)  # almost there

Passing 0 to set_canary is not allowed — use demote to stop traffic. This keeps the audit trail unambiguous.

Typical canary ramp

v1: primary,   100% traffic
v2: candidate,  20% traffic  ← set_canary(InvoiceExtractor, "v2", 20)
                              ← watch telemetry
v2: candidate,  50% traffic  ← set_canary(InvoiceExtractor, "v2", 50)
                              ← looks good
v2: primary,    50% traffic  ← promote(InvoiceExtractor, "v2")
v2: primary,   100% traffic  ← demote(InvoiceExtractor, "v1")

Promoting v2 changes its status only — it does not flip traffic to 100%. The traffic ramp is a separate deliberate step.

Promote and Demote

Status (primary / candidate / demoted) and traffic percentage are separate, independent concepts.

  • promote — marks a version as primary. The previous primary becomes a candidate. Traffic percentages are not changed.
  • demote — sets status to demoted and zeros traffic immediately. This is the emergency brake.
  • restore — returns a demoted version to candidate status and restores its pre-demotion traffic percentage.

PromptCanary.promote(InvoiceExtractor, "v2")
PromptCanary.promote(InvoiceExtractor, "v2", reason: "canary passed")

PromptCanary.demote(InvoiceExtractor, "v2")
PromptCanary.demote(InvoiceExtractor, "v2", reason: "error rate spike")

PromptCanary.restore(InvoiceExtractor, "v2")

Attempting to demote the primary version when no other viable candidate exists raises CannotDemotePrimaryError — the system is never left without a route target.
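The separation between status and traffic can be modeled in a few lines. The sketch below is a toy state machine written for illustration, not the gem's implementation. ToyRollout and its methods are hypothetical, and treating any non-demoted version as a viable candidate is an assumption. It also does not model which version takes over when a primary is demoted while alternatives exist.

```ruby
# Toy model of the promote / demote / restore semantics described above.
class ToyRollout
  class CannotDemotePrimary < StandardError; end

  Entry = Struct.new(:status, :percent, :saved_percent)

  def initialize
    @versions = {}
  end

  def add(name, status:, percent:)
    @versions[name] = Entry.new(status, percent, nil)
  end

  def [](name)
    @versions.fetch(name)
  end

  # Status change only: traffic percentages are left untouched.
  def promote(name)
    @versions.each_value { |v| v.status = :candidate if v.status == :primary }
    self[name].status = :primary
  end

  # Emergency brake: remember the old percentage, then zero traffic.
  def demote(name)
    entry = self[name]
    others = @versions.reject { |n, v| n == name || v.status == :demoted }
    raise CannotDemotePrimary if entry.status == :primary && others.empty?

    entry.saved_percent = entry.percent
    entry.status = :demoted
    entry.percent = 0
  end

  # Back to candidate with the pre-demotion percentage.
  def restore(name)
    entry = self[name]
    entry.status = :candidate
    entry.percent = entry.saved_percent || 0
  end
end
```

Keeping saved_percent alongside the live percentage is what lets restore undo a demotion without the operator having to remember the previous split.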

All status operations write an audit event to prompt_canary_events.

Auto-Rollback

Define rollback rules on a version:

version "v2" do
  model  "claude-opus-4-7"
  system "Extract structured data. Return JSON."
  rollout percent: 10
  rollback_if :error_rate,  greater_than: 0.05, over: 100
  rollback_if :latency_p95, greater_than: 2000, over: 100
end
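To make the two metrics concrete, here is a sketch of how an error rate and a P95 latency could be computed over a window of recorded calls. It is not the gem's monitor code. The helpers are hypothetical, the call records are plain hashes invented for the example, reading over: 100 as "the most recent 100 calls" is an assumption, and the percentile uses the nearest-rank method, which may differ from what PromptCanary uses.

```ruby
# Fraction of calls in the window that recorded an error.
def error_rate(calls)
  return 0.0 if calls.empty?

  calls.count { |c| c[:error] }.fdiv(calls.size)
end

# Nearest-rank 95th percentile of latency, in milliseconds.
def latency_p95(calls)
  sorted = calls.map { |c| c[:latency_ms] }.sort
  return nil if sorted.empty?

  rank = (95 * sorted.size + 99) / 100 # integer ceiling of 0.95 * n
  sorted[rank - 1]
end

# A rule fires only when the metric strictly exceeds its threshold.
def rule_fires?(value, greater_than:)
  !value.nil? && value > greater_than
end

# Synthetic window: 100 calls, latencies 1..100 ms, 6 of them failed.
window = (1..100).map { |i| { latency_ms: i, error: i <= 6 } }

puts rule_fires?(error_rate(window),  greater_than: 0.05)
puts rule_fires?(latency_p95(window), greater_than: 2000)
```

In this synthetic window the error rate is 6 in 100, which exceeds the 0.05 threshold, while the P95 latency stays well under 2000 ms, so only the error-rate rule would fire.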

In Rails

PromptCanary::MonitorJob is included and ready to queue:

# config/initializers/prompt_canary.rb or a scheduler
PromptCanary::MonitorJob.set(wait: 5.minutes).perform_later

The call above enqueues one run after a five-minute delay. Schedule recurring runs with any background job backend (Sidekiq, GoodJob, Solid Queue, etc.).

Standalone

recorder = PromptCanary::Recorder.new(storage: PromptCanary::Storage::SQLite.new)
PromptCanary::Monitor.new(recorder: recorder).evaluate(InvoiceExtractor)

When a rule fires, PromptCanary.demote is called automatically — the version's rollout is zeroed, a prompt_canary.demoted notification is emitted, and a monitor-triggered audit event is written.

CLI

# Promote a version to primary
prompt_canary promote InvoiceExtractor v2
prompt_canary promote InvoiceExtractor v2 --reason "canary passed"

# Demote a version (emergency stop)
prompt_canary demote InvoiceExtractor v2 --reason "error rate spike"

# Show current status and traffic for all versions
prompt_canary status InvoiceExtractor

# Show deployment history
prompt_canary history InvoiceExtractor
prompt_canary history InvoiceExtractor --since 7d

Dashboard

The engine mounts a web dashboard at the path configured in your routes (default /prompt_canary):

  • Index — all registered prompt classes with per-version call counts, error rates, P95 latency, and last-called timestamps
  • Show — version breakdown with a Promote button for candidate versions, deployment history from the audit trail, and the 50 most recent calls with per-call latency, token counts, and error detail

No authentication is wired in by default. Protect the mount point with your app's existing auth if needed. For example, with Devise's authenticate routing helper:

authenticate :user, ->(u) { u.admin? } do
  mount PromptCanary::Engine, at: "/prompt_canary"
end

Notifications

Subscribe to deployment events:

PromptCanary.subscribe("prompt_canary.promoted") do |payload|
  puts "#{payload[:prompt]} #{payload[:version]} promoted"
end

PromptCanary.subscribe("prompt_canary.demoted") do |payload|
  puts "#{payload[:prompt]} #{payload[:version]} demoted — #{payload[:reason]}"
end

PromptCanary.subscribe("prompt_canary.restored") do |payload|
  puts "#{payload[:prompt]} #{payload[:version]} restored"
end
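For readers unfamiliar with the pattern, the subscribe calls above follow a publish/subscribe shape. The sketch below is a minimal in-process bus illustrating that shape. ToyBus is hypothetical, and the gem's own delivery mechanism is not documented here. The payload keys are the ones used in the examples above.

```ruby
# Minimal in-process publish/subscribe bus, for illustration only.
class ToyBus
  def initialize
    @subscribers = Hash.new { |hash, event| hash[event] = [] }
  end

  # Registers a block to run whenever the named event is published.
  def subscribe(event, &handler)
    @subscribers[event] << handler
  end

  # Calls every handler registered for the event with the payload.
  def publish(event, payload)
    @subscribers[event].each { |handler| handler.call(payload) }
  end
end

bus = ToyBus.new
alerts = []

bus.subscribe("prompt_canary.demoted") do |payload|
  alerts << "#{payload[:prompt]} #{payload[:version]} demoted: #{payload[:reason]}"
end

bus.publish("prompt_canary.demoted",
            { prompt: "InvoiceExtractor", version: "v2", reason: "error rate spike" })
puts alerts.first
```

In a real application the handler body would typically forward the message to a logger or an alerting channel rather than collect it in an array.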

Examples

Two runnable scripts are included in examples/. Both use a stubbed adapter and require no API key:

# Full call flow — routing, result structure, version distribution
bundle exec ruby examples/demo.rb

# Auto-rollback demo — seeds synthetic errors, runs monitor, watches demotion fire
bundle exec ruby examples/auto_rollback.rb

Pass --real to demo.rb to hit the Anthropic API directly (requires ANTHROPIC_API_KEY):

ANTHROPIC_API_KEY=sk-... bundle exec ruby examples/demo.rb --real

Development

bin/setup                                    # install dependencies
bundle exec rake                             # run tests + lint
bundle exec rspec spec/foo_spec.rb:42        # run a single example
bin/console                                  # interactive prompt with gem loaded

Contributing

See CONTRIBUTING.md.

License

MIT. See LICENSE.txt.