# Whoosh

The fastest way to ship a production MCP server in Ruby.

A FastAPI-style framework with MCP, schema validation, auth, streaming, and OpenAPI — built in.
## Why Whoosh?
- MCP-native — opt routes into MCP with `mcp: true` and they become typed tools over stdio/SSE. No glue code, no separate server.
- FastAPI-style DSL in Ruby — declarative schemas, typed request/response, auto-generated OpenAPI + Swagger UI, dependency injection.
- Batteries included — auth (JWT, API key, OAuth), rate limiting, caching, background jobs, file uploads, vector search, streaming, pagination.
- Agent-friendly — `whoosh describe` emits a JSON snapshot of your app; a generated `CLAUDE.md` so coding agents understand it; `whoosh check` validates config before runtime.
- Competitive Ruby performance — YJIT + Falcon fibers + Oj, ~2.5µs framework overhead. See Performance for honest, per-core comparisons.
## When NOT to use Whoosh
Whoosh is on 1.x but still evolving — solo-maintained, without a production track record yet, and breaking changes ship occasionally (always called out in CHANGELOG.md). Reach for something else when:
- You need a managed backend. Supabase, PocketBase, or Firebase give you DB + auth + realtime without hosting a framework. Whoosh is the app layer — use it with a managed DB if that fits.
- You want maximum ecosystem depth. Rails has more gems; FastAPI has the Python ML/AI library ecosystem (PyTorch, transformers, LangChain). If your core workload lives in those libraries, stay where they are.
- You need a frozen API surface. Being on 1.x doesn't mean the API is locked — breaking changes still ship when the design calls for it. If you need strict stability contracts today, wait a few releases.
- Your team has no Ruby experience and the project isn't specifically about AI/MCP. Hiring and ecosystem gravity usually beat framework features.
Whoosh's sweet spot: Ruby shops (or Ruby-curious teams) building AI / LLM / MCP-backed APIs who want typed schemas, OpenAPI, and MCP without wiring three libraries together.
## Install

```shell
gem install whoosh
whoosh new my_api
cd my_api
whoosh s
```
Open http://localhost:9292/docs for Swagger UI.
## Quick Start

```ruby
# app.rb
require "whoosh"

app = Whoosh::App.new

app.get "/health" do
  { status: "ok", version: Whoosh::VERSION }
end

app.post "/chat", request: ChatRequest, mcp: true do |req|
  stream_llm do |out|
    llm.chat(req.body[:message]).each_chunk { |c| out << c }
    out.finish
  end
end
```
```shell
whoosh s          # Start server
whoosh s --reload # Auto-reload on file changes
whoosh s -p 3000  # Custom port
```
## Features

### Routing

```ruby
# Inline
app.get("/users/:id") { |req| { id: req.params[:id] } }

# Class-based
class ChatEndpoint < Whoosh::Endpoint
  post "/chat", request: ChatRequest, mcp: true

  def call(req)
    { reply: "Hello!" }
  end
end
app.load_endpoints("endpoints/")

# Groups with shared middleware
app.group "/api/v1", mcp: true do
  get("/status") { { ok: true } }
  post("/analyze", auth: :api_key) { |req| analyze(req) }
end
```
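The `:id` capture above can be illustrated in a few lines of plain Ruby. This is a sketch of how a path pattern maps captured segments to `req.params`, not Whoosh's actual router; the `match_route` helper is hypothetical:

```ruby
# Compile a pattern like "/users/:id" into a regex, capturing each
# :name segment, then match a concrete path against it.
def match_route(pattern, path)
  keys = []
  regex_src = pattern.gsub(/:(\w+)/) do
    keys << Regexp.last_match(1).to_sym
    "([^/]+)" # a param matches one path segment
  end
  md = Regexp.new("\\A#{regex_src}\\z").match(path)
  return nil unless md
  keys.zip(md.captures).to_h
end

match_route("/users/:id", "/users/42") # => { id: "42" }
match_route("/users/:id", "/users")    # => nil
```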
### Schema Validation

```ruby
class CreateUserRequest < Whoosh::Schema
  field :name, String, required: true, desc: "User name"
  field :email, String, required: true, desc: "Email address"
  field :age, Integer, min: 0, max: 150
  field :role, String, default: "user"
end

# Returns 422 with field-level errors on invalid input
app.post "/users", request: CreateUserRequest do |req|
  { name: req.body[:name], created: true }
end
```
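For intuition, here is what field-level validation boils down to in plain Ruby: collect one error per invalid field, mirroring the 422 error body. This is a hand-rolled sketch, not Whoosh's `Schema` internals, and `validate_user` is a hypothetical helper:

```ruby
# Validate a request body against the CreateUserRequest rules above,
# returning a hash of field => error message (empty when valid).
def validate_user(body)
  errors = {}
  errors[:name]  = "is required" if body[:name].to_s.empty?
  errors[:email] = "is required" if body[:email].to_s.empty?
  if body[:age] && !(0..150).cover?(body[:age])
    errors[:age] = "must be between 0 and 150"
  end
  errors
end

validate_user(name: "Alice", email: "a@b.com", age: 200)
# => { age: "must be between 0 and 150" }
```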
### Authentication & Security

```ruby
app.auth do
  api_key header: "X-Api-Key", keys: {
    "sk-prod-123" => { role: :premium },
    "sk-free-456" => { role: :free }
  }
  jwt secret: ENV["JWT_SECRET"], algorithm: :hs256
end

app.rate_limit do
  default limit: 60, period: 60
  rule "/chat", limit: 10, period: 60
  tier :free, limit: 100, period: 3600
  tier :premium, limit: 5000, period: 3600
  on_store_failure :fail_open
end

app.access_control do
  role :free, models: ["claude-haiku"]
  role :premium, models: ["claude-haiku", "claude-sonnet", "claude-opus"]
end
```
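The `limit:`/`period:` semantics can be sketched as a sliding-window counter in plain Ruby. This is illustrative only: Whoosh's actual store and algorithm are not shown in this README, and the `SlidingWindow` class is hypothetical:

```ruby
# Allow up to `limit` hits per key within a trailing `period` seconds.
class SlidingWindow
  def initialize(limit:, period:)
    @limit  = limit
    @period = period
    @hits   = Hash.new { |h, k| h[k] = [] }
  end

  # Returns true and records the hit if the caller is under the limit.
  def allow?(key, now = Time.now.to_f)
    window = @hits[key]
    window.reject! { |t| t <= now - @period } # drop hits outside the window
    return false if window.size >= @limit
    window << now
    true
  end
end

rl = SlidingWindow.new(limit: 2, period: 60)
rl.allow?("sk-free-456") # => true
rl.allow?("sk-free-456") # => true
rl.allow?("sk-free-456") # => false (429 territory)
```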
### AI / LLM Integration

```ruby
# Chat with any LLM (auto-detects the ruby_llm gem)
app.post "/chat" do |req, llm:|
  { reply: llm.chat(req.body["message"]) }
end

# Structured output — the LLM returns validated JSON
app.post "/extract" do |req, llm:|
  data = llm.extract(req.body["text"], schema: InvoiceSchema)
  { invoice: data }
end

# RAG in 3 lines
app.post "/ask" do |req, vectors:, llm:|
  context = vectors.search("knowledge", vector: req.body["q"], limit: 5)
  { answer: llm.chat(req.body["q"], system: "Context: #{context}") }
end
```
### Vector Search

```ruby
app.post "/index" do |req, vectors:|
  vectors.insert("docs", id: req.body["id"],
                 vector: req.body["embedding"],
                 metadata: { title: req.body["title"] })
  { indexed: true }
end

app.post "/search" do |req, vectors:|
  results = vectors.search("docs", vector: req.body["embedding"], limit: 10)
  { results: results }
end
```

In-memory by default (cosine similarity). Install the `zvec` gem for a production-grade HNSW index.
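Cosine similarity, the default scoring for the in-memory store, is small enough to show inline. A plain-Ruby sketch, not the library's optimized code:

```ruby
# Cosine similarity: dot product of the vectors divided by the
# product of their magnitudes. 1.0 = same direction, 0.0 = orthogonal.
def cosine(a, b)
  dot  = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  dot / (norm.(a) * norm.(b))
end

cosine([1.0, 0.0], [1.0, 0.0]) # => 1.0
cosine([1.0, 0.0], [0.0, 1.0]) # => 0.0
```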
### LLM Streaming (OpenAI-compatible)

```ruby
app.post "/chat/stream", auth: :api_key do |req|
  stream_llm do |out|
    # True chunked streaming via SizedQueue — tokens flow in real time
    out << "Hello "
    out << "World!"
    out.finish # sends data: [DONE]
  end
end

# SSE events
app.get "/events" do
  stream :sse do |out|
    out.event("status", { connected: true })
    out << { data: "hello" }
  end
end
```
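For reference, the SSE block above translates to frames like these on the wire. The frame format comes from the server-sent events spec; the `sse_frame` helper is ours for illustration, not part of Whoosh:

```ruby
require "json"

# Build one SSE frame: optional "event:" line, then a "data:" line,
# terminated by a blank line.
def sse_frame(data, event: nil)
  frame = +""
  frame << "event: #{event}\n" if event
  frame << "data: #{data.is_a?(String) ? data : JSON.generate(data)}\n\n"
end

sse_frame({ connected: true }, event: "status")
# => "event: status\ndata: {\"connected\":true}\n\n"
```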
### MCP (Model Context Protocol)

Routes are exposed as MCP tools only when you opt in with `mcp: true`. This prevents internal or admin endpoints from being callable as tools by accident.

```ruby
# Opt in per route:
app.post "/summarize", request: SummarizeRequest, mcp: true do |req|
  { summary: llm.summarize(req.body[:text]) }
end

# Or opt in a whole group:
app.group "/tools", mcp: true do
  post "/translate" do |req|
    { result: translate(req.body["text"]) }
  end
end

# Default: not exposed as an MCP tool.
app.get "/internal" do
  { debug: "not exposed" }
end
```

```shell
whoosh mcp        # stdio transport (Claude Desktop, Cursor)
whoosh mcp --list # list registered MCP tools
```
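A route registered via `mcp: true` ends up described in an MCP `tools/list` response. Roughly this JSON-RPC shape; the field names follow the MCP spec, but the route-to-tool mapping shown here is an assumption, not Whoosh's generator:

```ruby
require "json"

# One tool entry: name, human description, and a JSON Schema
# for its input (derived here from a hypothetical SummarizeRequest).
tool = {
  name: "summarize",
  description: "POST /summarize",
  inputSchema: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"]
  }
}

# The JSON-RPC envelope an MCP client receives for tools/list.
response = { jsonrpc: "2.0", id: 1, result: { tools: [tool] } }
JSON.generate(response)
```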
### Background Jobs

```ruby
class AnalyzeJob < Whoosh::Job
  inject :db, :llm # DI injection

  def perform(document_id:)
    doc = db[:documents].where(id: document_id).first
    result = llm.complete("Analyze: #{doc[:text]}")
    db[:documents].where(id: document_id).update(analysis: result)
    { analyzed: true }
  end
end

# Fire and forget
app.post "/analyze" do |req|
  job_id = AnalyzeJob.perform_async(document_id: req.body["id"])
  { job_id: job_id }
end

# Check status
app.get "/jobs/:id" do |req|
  job = Whoosh::Jobs.find(req.params[:id])
  { status: job[:status], result: job[:result] }
end
```

```shell
whoosh worker      # dedicated worker process
whoosh worker -c 4 # 4 threads
```
### File Upload

```ruby
app.post "/upload" do |req|
  file = req.files["document"]
  file.filename     # => "report.pdf"
  file.content_type # => "application/pdf"
  file.size         # => 245760
  file.read_text    # => UTF-8 string (for RAG)
  file.to_base64    # => base64 (for vision APIs)
  file.validate!(types: ["application/pdf"], max_size: 10_000_000)
  path = file.save("documents")
  { path: path }
end
```
### Cache

```ruby
app.get "/users/:id" do |req, cache:|
  cache.fetch("user:#{req.params[:id]}", ttl: 60) do
    db[:users].where(id: req.params[:id]).first
  end
end
```
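`cache.fetch` is read-through caching with a TTL: return the cached value if it hasn't expired, otherwise run the block and store its result. A minimal plain-Ruby sketch of those semantics (`TtlCache` is hypothetical, not Whoosh's store):

```ruby
# Read-through cache: fetch returns the cached value until it expires,
# then recomputes via the block and caches the new value.
class TtlCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @store = {}
  end

  def fetch(key, ttl:, now: Time.now.to_f)
    entry = @store[key]
    return entry.value if entry && entry.expires_at > now
    value = yield
    @store[key] = Entry.new(value, now + ttl)
    value
  end
end

cache = TtlCache.new
cache.fetch("user:1", ttl: 60) { :db_hit }  # => :db_hit (computed)
cache.fetch("user:1", ttl: 60) { :db_miss } # => :db_hit (served from cache)
```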
### Pagination

```ruby
# Offset-based
app.get "/users" do |req|
  paginate(db[:users].order(:id),
           page: req.query_params["page"], per_page: 20)
end

# Cursor-based (recommended for large datasets)
app.get "/messages" do |req|
  paginate_cursor(db[:messages].order(:id),
                  cursor: req.query_params["cursor"], limit: 20)
end
```
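Cursor pagination typically works by encoding the last seen id into an opaque token, so the next page can resume with a `WHERE id > ?` query instead of an ever-growing OFFSET. A sketch under that assumption; Whoosh's actual cursor format is not documented here:

```ruby
require "base64"
require "json"

# Encode the last id of the current page into an opaque, URL-safe token.
def encode_cursor(last_id) = Base64.urlsafe_encode64(JSON.generate(id: last_id))

# Decode the token back into the id to resume from.
def decode_cursor(cursor) = JSON.parse(Base64.urlsafe_decode64(cursor))["id"]

cursor = encode_cursor(120)
decode_cursor(cursor) # => 120
```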
### Plugins (18 AI Gems Auto-Discovered)

```ruby
# Just add gems to the Gemfile — they're auto-discovered from Gemfile.lock
gem "ruby_llm"
gem "lingua-ruby"
gem "ner-ruby"
gem "guardrails-ruby"
```

```ruby
# Available as bare method calls in endpoints:
app.post "/analyze" do |req|
  lang = lingua.detect(req.body["text"])
  entities = ner.recognize(req.body["text"])
  { language: lang, entities: entities }
end
```
### HTTP Client

```ruby
app.post "/proxy" do |req, http:|
  result = http.post("https://api.example.com/analyze",
                     json: req.body,
                     headers: { "Authorization" => "Bearer #{ENV["API_KEY"]}" },
                     timeout: 30)
  result.json # parsed response
end
```
### Prometheus Metrics

Auto-tracked at `/metrics`:

```text
whoosh_requests_total{method="GET",path="/health",status="200"} 1234
whoosh_request_duration_seconds_sum{path="/health"} 45.23
whoosh_request_duration_seconds_count{path="/health"} 1234
```
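Each line follows the Prometheus text exposition format: metric name, a `{label="value"}` set, then the sample value. Rendering one is a one-liner; the `prom_line` helper below is ours, for illustration only:

```ruby
# Render one sample in the Prometheus text exposition format.
def prom_line(name, labels, value)
  pairs = labels.map { |k, v| %(#{k}="#{v}") }.join(",")
  "#{name}{#{pairs}} #{value}"
end

prom_line("whoosh_requests_total",
          { method: "GET", path: "/health", status: 200 }, 1234)
# => 'whoosh_requests_total{method="GET",path="/health",status="200"} 1234'
```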
### OpenAPI & Docs

```ruby
app.openapi do
  title "My AI API"
  version "1.0.0"
end

app.docs enabled: true, redoc: true
```

- `/docs` — Swagger UI
- `/redoc` — ReDoc
- `/openapi.json` — machine-readable spec
### Client Generator

Generate complete, typed, ready-to-run client apps from your Whoosh API — one command.

```shell
whoosh generate client react_spa         # React + Vite + TypeScript
whoosh generate client expo              # Expo + React Native
whoosh generate client ios               # SwiftUI + MVVM
whoosh generate client flutter           # Dart + Riverpod + GoRouter
whoosh generate client htmx              # Plain HTML + htmx, no build step
whoosh generate client telegram_bot      # Ruby Telegram bot
whoosh generate client telegram_mini_app # React + Telegram WebApp SDK
whoosh generate client react_spa --oauth # Add Google/GitHub/Apple login
```
The generator introspects your Whoosh app via OpenAPI — it reads your routes, schemas, and auth config, then produces a typed client with:
- API client with auth headers and automatic token refresh
- Model types matching your schemas
- Auth screens (login, register, logout)
- CRUD screens for every resource
- Navigation and routing
- Starter tests
If no Whoosh app exists yet, it scaffolds a standard backend (JWT auth + tasks CRUD) alongside the client.
| Client | Stack | Token Storage |
|---|---|---|
| `react_spa` | React 19, Vite, TypeScript, React Router | localStorage |
| `expo` | Expo SDK 52, Expo Router, TypeScript | SecureStore |
| `ios` | SwiftUI, async/await, MVVM | Keychain |
| `flutter` | Dart, Dio, Riverpod, GoRouter | flutter_secure_storage |
| `htmx` | HTML, htmx 2.x, vanilla JS | localStorage |
| `telegram_bot` | Ruby, telegram-bot-ruby | In-memory session |
| `telegram_mini_app` | React, Telegram WebApp SDK | Telegram initData |
### Health Checks

```ruby
app.health_check do
  probe(:database) { db.test_connection }
  probe(:cache) { cache.get("ping") || true }
end

# GET /healthz → { "status": "ok", "checks": { "database": "ok" } }
```
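The `/healthz` aggregation amounts to running each probe, mapping truthy results to `"ok"` and falsy or raising ones to `"fail"`, and failing the overall status if any probe fails. A plain-Ruby sketch of that logic (`run_probes` is a hypothetical helper, not Whoosh's implementation):

```ruby
# Run each named probe lambda; a probe that returns falsy or raises
# is reported as "fail", and any failure fails the overall status.
def run_probes(probes)
  checks = probes.transform_values do |probe|
    probe.call ? "ok" : "fail"
  rescue StandardError
    "fail"
  end
  { status: checks.values.all?("ok") ? "ok" : "fail", checks: checks }
end

run_probes(database: -> { true }, cache: -> { raise "down" })
# => { status: "fail", checks: { database: "ok", cache: "fail" } }
```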
## CLI

```shell
whoosh new my_api                   # scaffold project (with Dockerfile)
whoosh s                            # start server (like rails s)
whoosh s --reload                   # hot reload on file changes
whoosh routes                       # list all routes
whoosh describe                     # dump app as JSON (AI-friendly)
whoosh check                        # validate config, catch mistakes
whoosh console                      # IRB with app loaded
whoosh ci                           # lint + security + audit + tests + coverage
whoosh worker                       # background job worker
whoosh mcp                          # MCP stdio server
whoosh mcp --list                   # list all MCP tools

whoosh generate endpoint chat       # endpoint + schema + test
whoosh generate schema User         # schema file
whoosh generate model User name:string email:string
whoosh generate migration add_email_to_users
whoosh generate plugin my_tool      # plugin boilerplate
whoosh generate proto ChatRequest   # .proto file
whoosh generate client react_spa    # full client app (7 types)
whoosh generate client expo --oauth # with OAuth2 social login

whoosh db migrate                   # run migrations
whoosh db rollback                  # rollback
whoosh db status                    # migration status
```
## AI Agent DX

Every `whoosh new` project includes a `CLAUDE.md` with all framework patterns, commands, and conventions — so AI agents (Claude Code, Cursor, Copilot) can build with Whoosh immediately.

```shell
# Dump your entire app structure as JSON (routes, schemas, config, MCP tools)
whoosh describe

# AI tools can consume this to understand your API
whoosh describe --routes  # routes with request/response schemas
whoosh describe --schemas # all schema definitions

# Catch mistakes before runtime
whoosh check # validates config, auth, dependencies
```
## Performance

Apple Silicon arm64, 12 cores. Ruby 3.4 + YJIT. Full benchmark suite & reproduction steps.

**How to read these numbers.** Benchmarks are selective by nature. A `GET /health` returning `{"status":"ok"}` tests the router + serializer, not your real app. A Postgres read tests one query pattern. We show single-process (per-core) numbers first because that's the fair cross-language comparison. Multi-worker numbers are included for deployment sizing, but scaling strategies differ per runtime (Node uses `cluster`, Python uses multiple workers, Ruby uses workers × threads or fibers) and mixing them isn't apples-to-apples.
### HTTP micro-benchmark — `GET /health`

Single process (per-core, fair comparison):

| Framework | Language | Server | Req/sec |
|---|---|---|---|
| Fastify | Node.js 22 | built-in | 69,200 |
| Whoosh | Ruby 3.4 + YJIT | Falcon | 24,400 |
| Whoosh | Ruby 3.4 + YJIT | Puma (5 threads) | 15,500 |
| FastAPI | Python 3.13 | uvicorn | 8,900 |
| Sinatra | Ruby 3.4 | Puma (5 threads) | 7,100 |

On this microbenchmark, Fastify is ~2.8× Whoosh+Falcon per-core; that's the honest picture for trivial JSON. Against other Ruby frameworks and against FastAPI on CPython, Whoosh is competitive.
Multi-worker (sizing reference, not apples-to-apples):
| Framework | Server | Req/sec |
|---|---|---|
| Whoosh | Falcon (4 workers) | 87,400 |
| Fastify | built-in (single thread, no cluster) | 69,200 |
| Whoosh | Puma (4w × 4t) | 52,500 |
| Roda | Puma (4w × 4t) | 14,700 |
Fastify was not run under cluster; don't read this table as "Whoosh beats Fastify." Read it as "Whoosh on 4 cores handles ~87K req/s on trivial JSON."
### Real-world benchmark — `GET /users/:id` from PostgreSQL (1000-row table)

Single process (per-core):
| Framework | Req/sec |
|---|---|
| Fastify + pg | 36,900 |
| Whoosh + Falcon (fiber PG pool) | 13,400 |
| Whoosh + Puma (Sequel) | 8,600 |
| Roda + Puma | 6,700 |
| Sinatra + Puma | 4,400 |
| FastAPI + uvicorn | 2,400 |
On realistic DB-bound work the per-core picture barely changes at the top: Fastify stays ~2.75× ahead of Whoosh's fiber-aware PG pool (vs ~2.8× on trivial JSON), while Whoosh keeps a wide lead over FastAPI on CPython and over the other Ruby frameworks.
Multi-worker (sizing reference):
| Framework | Req/sec |
|---|---|
| Whoosh + Falcon (4 workers, fiber PG pool) | 45,900 |
| Fastify (single thread) | 36,900 |
### Micro-benchmarks
| Component | Throughput |
|---|---|
| Router lookup (static, cached) | 6.1M ops/s |
| JSON encode (Oj) | 5.4M ops/s |
| Framework overhead | ~2.5µs per request |
Optimizations: YJIT auto-enabled, Oj JSON auto-detected, O(1) static route cache, compiled middleware chain, pre-frozen headers.
## Configuration

```yaml
# config/app.yml
app:
  name: My API
  port: 9292

database:
  url: <%= ENV.fetch("DATABASE_URL", "sqlite://db/dev.sqlite3") %>
  max_connections: 10

cache:
  store: memory # memory | redis
  default_ttl: 300

jobs:
  backend: memory # memory | database | redis
  workers: 2

logging:
  level: info
  format: json

docs:
  enabled: true
```

`.env` files are loaded automatically (dotenv-compatible).
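The `<%= ENV.fetch(...) %>` line suggests the config file is rendered with ERB before YAML parsing, which is the standard Ruby pattern. A sketch of that two-step load (an assumption about Whoosh's loader, not its actual code):

```ruby
require "erb"
require "yaml"

# Stand-in for the contents of config/app.yml.
raw = <<~YML
  database:
    url: <%= ENV.fetch("DATABASE_URL", "sqlite://db/dev.sqlite3") %>
    max_connections: 10
YML

# Step 1: run ERB, so ENV lookups and defaults are resolved.
# Step 2: parse the resulting plain YAML.
config = YAML.safe_load(ERB.new(raw).result)
config["database"]["max_connections"] # => 10
```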
## Testing

```ruby
require "whoosh/test"

RSpec.describe "My API" do
  include Whoosh::Test

  def app = MyApp.to_rack

  it "creates a user" do
    post_json "/users", { name: "Alice", email: "a@b.com" }
    assert_response 200
    assert_json(name: "Alice")
  end

  it "requires auth" do
    get "/protected"
    assert_response 401
  end

  it "works with auth" do
    get_with_auth "/protected", key: "sk-test"
    assert_response 200
  end
end
```
## License

MIT — see LICENSE.
## Contributing

See CONTRIBUTING.md.