# Whoosh

The fastest way to ship a production MCP server in Ruby.

A FastAPI-style framework with MCP, schema validation, auth, streaming, and OpenAPI — built in.
## Why Whoosh?
- MCP-native — opt routes into MCP with `mcp: true` and they become typed tools over stdio/SSE. No glue code, no separate server.
- FastAPI-style DSL in Ruby — declarative schemas, typed request/response, auto-generated OpenAPI + Swagger UI, dependency injection.
- Batteries included — auth (JWT, API key, OAuth), rate limiting, caching, background jobs, file uploads, vector search, streaming, pagination.
- Agent-friendly — `whoosh describe` emits a JSON snapshot of your app; a generated `CLAUDE.md` so coding agents understand it; `whoosh check` validates config before runtime.
- Competitive Ruby performance — YJIT + Falcon fibers + Oj, ~2.5µs framework overhead. See Performance for honest, per-core comparisons.
## When NOT to use Whoosh
Whoosh is on 1.x but still evolving — solo-maintained, without a production track record yet, and breaking changes ship occasionally (always called out in CHANGELOG.md). Reach for something else when:
- You need a managed backend. Supabase, PocketBase, or Firebase give you DB + auth + realtime without hosting a framework. Whoosh is the app layer — use it with a managed DB if that fits.
- You want maximum ecosystem depth. Rails has more gems; FastAPI has the Python ML/AI library ecosystem (PyTorch, transformers, LangChain). If your core workload lives in those libraries, stay where they are.
- You need a frozen API surface. Being on 1.x doesn't mean the API is locked — breaking changes still ship when the design calls for it. If you need strict stability contracts today, wait a few releases.
- Your team has no Ruby experience and the project isn't specifically about AI/MCP. Hiring and ecosystem gravity usually beat framework features.
Whoosh's sweet spot: Ruby shops (or Ruby-curious teams) building AI / LLM / MCP-backed APIs who want typed schemas, OpenAPI, and MCP without wiring three libraries together.
## Install

```shell
gem install whoosh
whoosh new my_api
cd my_api
whoosh s
```
Open http://localhost:9292/docs for Swagger UI.
## Quick Start

```ruby
# app.rb
require "whoosh"

app = Whoosh::App.new

app.get "/health" do
  { status: "ok", version: Whoosh::VERSION }
end

app.post "/chat", request: ChatRequest, mcp: true do |req|
  stream_llm do |out|
    llm.chat(req.body[:message]).each_chunk { |c| out << c }
    out.finish
  end
end
```
```shell
whoosh s          # Start server
whoosh s --reload # Auto-reload on file changes
whoosh s -p 3000  # Custom port
```
## Features

### Routing

```ruby
# Inline
app.get("/users/:id") { |req| { id: req.params[:id] } }

# Class-based
class ChatEndpoint < Whoosh::Endpoint
  post "/chat", request: ChatRequest, mcp: true

  def call(req)
    { reply: "Hello!" }
  end
end
app.load_endpoints("endpoints/")

# Groups with shared middleware
app.group "/api/v1", mcp: true do
  get("/status") { { ok: true } }
  post("/analyze", auth: :api_key) { |req| analyze(req) }
end
```
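The `:id` capture above can be illustrated in a few lines of plain Ruby. This is a sketch of how a path pattern maps captured segments to `req.params`, not Whoosh's actual router; the `match_route` helper is hypothetical:

```ruby
# Compile a pattern like "/users/:id" into a regex, capturing each
# :name segment, then match a concrete path against it.
def match_route(pattern, path)
  keys = []
  regex_src = pattern.gsub(/:(\w+)/) do
    keys << Regexp.last_match(1).to_sym
    "([^/]+)" # a param matches one path segment
  end
  md = Regexp.new("\\A#{regex_src}\\z").match(path)
  return nil unless md
  keys.zip(md.captures).to_h
end

match_route("/users/:id", "/users/42") # => { id: "42" }
match_route("/users/:id", "/users")    # => nil
```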
### Schema Validation

```ruby
class CreateUserRequest < Whoosh::Schema
  field :name, String, required: true, desc: "User name"
  field :email, String, required: true, desc: "Email address"
  field :age, Integer, min: 0, max: 150
  field :role, String, default: "user"
end

# Returns 422 with field-level errors on invalid input
app.post "/users", request: CreateUserRequest do |req|
  { name: req.body[:name], created: true }
end
```
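For intuition, here is what field-level validation boils down to in plain Ruby: collect one error per invalid field, mirroring the 422 error body. This is a hand-rolled sketch, not Whoosh's `Schema` internals, and `validate_user` is a hypothetical helper:

```ruby
# Validate a request body against the CreateUserRequest rules above,
# returning a hash of field => error message (empty when valid).
def validate_user(body)
  errors = {}
  errors[:name]  = "is required" if body[:name].to_s.empty?
  errors[:email] = "is required" if body[:email].to_s.empty?
  if body[:age] && !(0..150).cover?(body[:age])
    errors[:age] = "must be between 0 and 150"
  end
  errors
end

validate_user(name: "Alice", email: "a@b.com", age: 200)
# => { age: "must be between 0 and 150" }
```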
### Authentication & Security

```ruby
app.auth do
  api_key header: "X-Api-Key", keys: {
    "sk-prod-123" => { role: :premium },
    "sk-free-456" => { role: :free }
  }
  jwt secret: ENV["JWT_SECRET"], algorithm: :hs256
end

app.rate_limit do
  default limit: 60, period: 60
  rule "/chat", limit: 10, period: 60
  tier :free, limit: 100, period: 3600
  tier :premium, limit: 5000, period: 3600
  on_store_failure :fail_open
end

app.access_control do
  role :free, models: ["claude-haiku"]
  role :premium, models: ["claude-haiku", "claude-sonnet", "claude-opus"]
end
```
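The `limit:`/`period:` semantics can be sketched as a sliding-window counter in plain Ruby. This is illustrative only: Whoosh's actual store and algorithm are not shown in this README, and the `SlidingWindow` class is hypothetical:

```ruby
# Allow up to `limit` hits per key within a trailing `period` seconds.
class SlidingWindow
  def initialize(limit:, period:)
    @limit  = limit
    @period = period
    @hits   = Hash.new { |h, k| h[k] = [] }
  end

  # Returns true and records the hit if the caller is under the limit.
  def allow?(key, now = Time.now.to_f)
    window = @hits[key]
    window.reject! { |t| t <= now - @period } # drop hits outside the window
    return false if window.size >= @limit
    window << now
    true
  end
end

rl = SlidingWindow.new(limit: 2, period: 60)
rl.allow?("sk-free-456") # => true
rl.allow?("sk-free-456") # => true
rl.allow?("sk-free-456") # => false (429 territory)
```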
### AI / LLM Integration

```ruby
# Chat with any LLM (auto-detects the ruby_llm gem)
app.post "/chat" do |req, llm:|
  { reply: llm.chat(req.body["message"]) }
end

# Structured output — the LLM returns validated JSON
app.post "/extract" do |req, llm:|
  data = llm.extract(req.body["text"], schema: InvoiceSchema)
  { invoice: data }
end

# RAG in 3 lines
app.post "/ask" do |req, vectors:, llm:|
  context = vectors.search("knowledge", vector: req.body["q"], limit: 5)
  { answer: llm.chat(req.body["q"], system: "Context: #{context}") }
end
```
### Vector Search

```ruby
app.post "/index" do |req, vectors:|
  vectors.insert("docs", id: req.body["id"],
                 vector: req.body["embedding"],
                 metadata: { title: req.body["title"] })
  { indexed: true }
end

app.post "/search" do |req, vectors:|
  results = vectors.search("docs", vector: req.body["embedding"], limit: 10)
  { results: results }
end
```

In-memory by default (cosine similarity). Install the `zvec` gem for a production-grade HNSW index.
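Cosine similarity, the default scoring for the in-memory store, is small enough to show inline. A plain-Ruby sketch, not the library's optimized code:

```ruby
# Cosine similarity: dot product of the vectors divided by the
# product of their magnitudes. 1.0 = same direction, 0.0 = orthogonal.
def cosine(a, b)
  dot  = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  dot / (norm.(a) * norm.(b))
end

cosine([1.0, 0.0], [1.0, 0.0]) # => 1.0
cosine([1.0, 0.0], [0.0, 1.0]) # => 0.0
```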
### LLM Streaming (OpenAI-compatible)

```ruby
app.post "/chat/stream", auth: :api_key do |req|
  stream_llm do |out|
    # True chunked streaming via SizedQueue — tokens flow in real time
    out << "Hello "
    out << "World!"
    out.finish # sends data: [DONE]
  end
end

# SSE events
app.get "/events" do
  stream :sse do |out|
    out.event("status", { connected: true })
    out << { data: "hello" }
  end
end
```
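For reference, the SSE block above translates to frames like these on the wire. The frame format comes from the server-sent events spec; the `sse_frame` helper is ours for illustration, not part of Whoosh:

```ruby
require "json"

# Build one SSE frame: optional "event:" line, then a "data:" line,
# terminated by a blank line.
def sse_frame(data, event: nil)
  frame = +""
  frame << "event: #{event}\n" if event
  frame << "data: #{data.is_a?(String) ? data : JSON.generate(data)}\n\n"
end

sse_frame({ connected: true }, event: "status")
# => "event: status\ndata: {\"connected\":true}\n\n"
```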
### MCP (Model Context Protocol)

Routes are exposed as MCP tools only when you opt in with `mcp: true`. This prevents internal or admin endpoints from being callable as tools by accident.

```ruby
# Opt in per route:
app.post "/summarize", request: SummarizeRequest, mcp: true do |req|
  { summary: llm.summarize(req.body[:text]) }
end

# Or opt in a whole group:
app.group "/tools", mcp: true do
  post "/translate" do |req|
    { result: translate(req.body["text"]) }
  end
end

# Default: not exposed as an MCP tool.
app.get "/internal" do
  { debug: "not exposed" }
end
```

```shell
whoosh mcp        # stdio transport (Claude Desktop, Cursor)
whoosh mcp --list # list registered MCP tools
```
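A route registered via `mcp: true` ends up described in an MCP `tools/list` response. Roughly this JSON-RPC shape; the field names follow the MCP spec, but the route-to-tool mapping shown here is an assumption, not Whoosh's generator:

```ruby
require "json"

# One tool entry: name, human description, and a JSON Schema
# for its input (derived here from a hypothetical SummarizeRequest).
tool = {
  name: "summarize",
  description: "POST /summarize",
  inputSchema: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"]
  }
}

# The JSON-RPC envelope an MCP client receives for tools/list.
response = { jsonrpc: "2.0", id: 1, result: { tools: [tool] } }
JSON.generate(response)
```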
### Background Jobs

```ruby
class AnalyzeJob < Whoosh::Job
  inject :db, :llm # DI injection

  def perform(document_id:)
    doc = db[:documents].where(id: document_id).first
    result = llm.complete("Analyze: #{doc[:text]}")
    db[:documents].where(id: document_id).update(analysis: result)
    { analyzed: true }
  end
end

# Fire and forget
app.post "/analyze" do |req|
  job_id = AnalyzeJob.perform_async(document_id: req.body["id"])
  { job_id: job_id }
end

# Check status
app.get "/jobs/:id" do |req|
  job = Whoosh::Jobs.find(req.params[:id])
  { status: job[:status], result: job[:result] }
end
```

```shell
whoosh worker      # dedicated worker process
whoosh worker -c 4 # 4 threads
```
### File Upload

```ruby
app.post "/upload" do |req|
  file = req.files["document"]
  file.filename     # => "report.pdf"
  file.content_type # => "application/pdf"
  file.size         # => 245760
  file.read_text    # => UTF-8 string (for RAG)
  file.to_base64    # => base64 (for vision APIs)
  file.validate!(types: ["application/pdf"], max_size: 10_000_000)
  path = file.save("documents")
  { path: path }
end
```
### Cache

```ruby
app.get "/users/:id" do |req, cache:|
  cache.fetch("user:#{req.params[:id]}", ttl: 60) do
    db[:users].where(id: req.params[:id]).first
  end
end
```
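`cache.fetch` is read-through caching with a TTL: return the cached value if it hasn't expired, otherwise run the block and store its result. A minimal plain-Ruby sketch of those semantics (`TtlCache` is hypothetical, not Whoosh's store):

```ruby
# Read-through cache: fetch returns the cached value until it expires,
# then recomputes via the block and caches the new value.
class TtlCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @store = {}
  end

  def fetch(key, ttl:, now: Time.now.to_f)
    entry = @store[key]
    return entry.value if entry && entry.expires_at > now
    value = yield
    @store[key] = Entry.new(value, now + ttl)
    value
  end
end

cache = TtlCache.new
cache.fetch("user:1", ttl: 60) { :db_hit }  # => :db_hit (computed)
cache.fetch("user:1", ttl: 60) { :db_miss } # => :db_hit (served from cache)
```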
### Pagination

```ruby
# Offset-based
app.get "/users" do |req|
  paginate(db[:users].order(:id),
           page: req.query_params["page"], per_page: 20)
end

# Cursor-based (recommended for large datasets)
app.get "/messages" do |req|
  paginate_cursor(db[:messages].order(:id),
                  cursor: req.query_params["cursor"], limit: 20)
end
```
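Cursor pagination typically works by encoding the last seen id into an opaque token, so the next page can resume with a `WHERE id > ?` query instead of an ever-growing OFFSET. A sketch under that assumption; Whoosh's actual cursor format is not documented here:

```ruby
require "base64"
require "json"

# Encode the last id of the current page into an opaque, URL-safe token.
def encode_cursor(last_id) = Base64.urlsafe_encode64(JSON.generate(id: last_id))

# Decode the token back into the id to resume from.
def decode_cursor(cursor) = JSON.parse(Base64.urlsafe_decode64(cursor))["id"]

cursor = encode_cursor(120)
decode_cursor(cursor) # => 120
```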
### Plugins (18 AI Gems Auto-Discovered)

```ruby
# Just add gems to the Gemfile — they're auto-discovered from Gemfile.lock
gem "ruby_llm"
gem "lingua-ruby"
gem "ner-ruby"
gem "guardrails-ruby"
```

```ruby
# Available as bare method calls in endpoints:
app.post "/analyze" do |req|
  lang = lingua.detect(req.body["text"])
  entities = ner.recognize(req.body["text"])
  { language: lang, entities: entities }
end
```
### HTTP Client

```ruby
app.post "/proxy" do |req, http:|
  result = http.post("https://api.example.com/analyze",
                     json: req.body,
                     headers: { "Authorization" => "Bearer #{ENV["API_KEY"]}" },
                     timeout: 30)
  result.json # parsed response
end
```
### Prometheus Metrics

Auto-tracked at `/metrics`:

```text
whoosh_requests_total{method="GET",path="/health",status="200"} 1234
whoosh_request_duration_seconds_sum{path="/health"} 45.23
whoosh_request_duration_seconds_count{path="/health"} 1234
```
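Each line follows the Prometheus text exposition format: metric name, a `{label="value"}` set, then the sample value. Rendering one is a one-liner; the `prom_line` helper below is ours, for illustration only:

```ruby
# Render one sample in the Prometheus text exposition format.
def prom_line(name, labels, value)
  pairs = labels.map { |k, v| %(#{k}="#{v}") }.join(",")
  "#{name}{#{pairs}} #{value}"
end

prom_line("whoosh_requests_total",
          { method: "GET", path: "/health", status: 200 }, 1234)
# => 'whoosh_requests_total{method="GET",path="/health",status="200"} 1234'
```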
### OpenAPI & Docs

```ruby
app.openapi do
  title "My AI API"
  version "1.0.0"
end

app.docs enabled: true, redoc: true
```

- `/docs` — Swagger UI
- `/redoc` — ReDoc
- `/openapi.json` — machine-readable spec
### Client Generator

Generate complete, typed, ready-to-run client apps from your Whoosh API — one command.

```shell
whoosh generate client react_spa         # React + Vite + TypeScript
whoosh generate client expo              # Expo + React Native
whoosh generate client ios               # SwiftUI + MVVM
whoosh generate client flutter           # Dart + Riverpod + GoRouter
whoosh generate client htmx              # Plain HTML + htmx, no build step
whoosh generate client telegram_bot      # Ruby Telegram bot
whoosh generate client telegram_mini_app # React + Telegram WebApp SDK
whoosh generate client react_spa --oauth # Add Google/GitHub/Apple login
```
The generator introspects your Whoosh app via OpenAPI — it reads your routes, schemas, and auth config, then produces a typed client with:
- API client with auth headers and automatic token refresh
- Model types matching your schemas
- Auth screens (login, register, logout)
- CRUD screens for every resource
- Navigation and routing
- Starter tests
If no Whoosh app exists yet, it scaffolds a standard backend (JWT auth + tasks CRUD) alongside the client.
| Client | Stack | Token Storage |
|---|---|---|
| `react_spa` | React 19, Vite, TypeScript, React Router | localStorage |
| `expo` | Expo SDK 52, Expo Router, TypeScript | SecureStore |
| `ios` | SwiftUI, async/await, MVVM | Keychain |
| `flutter` | Dart, Dio, Riverpod, GoRouter | flutter_secure_storage |
| `htmx` | HTML, htmx 2.x, vanilla JS | localStorage |
| `telegram_bot` | Ruby, telegram-bot-ruby | In-memory session |
| `telegram_mini_app` | React, Telegram WebApp SDK | Telegram initData |
### Health Checks

```ruby
app.health_check do
  probe(:database) { db.test_connection }
  probe(:cache) { cache.get("ping") || true }
end

# GET /healthz → { "status": "ok", "checks": { "database": "ok" } }
```
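The `/healthz` aggregation amounts to running each probe, mapping truthy results to `"ok"` and falsy or raising ones to `"fail"`, and failing the overall status if any probe fails. A plain-Ruby sketch of that logic (`run_probes` is a hypothetical helper, not Whoosh's implementation):

```ruby
# Run each named probe lambda; a probe that returns falsy or raises
# is reported as "fail", and any failure fails the overall status.
def run_probes(probes)
  checks = probes.transform_values do |probe|
    probe.call ? "ok" : "fail"
  rescue StandardError
    "fail"
  end
  { status: checks.values.all?("ok") ? "ok" : "fail", checks: checks }
end

run_probes(database: -> { true }, cache: -> { raise "down" })
# => { status: "fail", checks: { database: "ok", cache: "fail" } }
```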
## CLI

```shell
whoosh new my_api                   # scaffold project (with Dockerfile)
whoosh s                            # start server (like rails s)
whoosh s --reload                   # hot reload on file changes
whoosh routes                       # list all routes
whoosh describe                     # dump app as JSON (AI-friendly)
whoosh check                        # validate config, catch mistakes
whoosh console                      # IRB with app loaded
whoosh ci                           # lint + security + audit + tests + coverage
whoosh worker                       # background job worker
whoosh mcp                          # MCP stdio server
whoosh mcp --list                   # list all MCP tools

whoosh generate endpoint chat       # endpoint + schema + test
whoosh generate schema User         # schema file
whoosh generate model User name:string email:string
whoosh generate migration add_email_to_users
whoosh generate plugin my_tool      # plugin boilerplate
whoosh generate proto ChatRequest   # .proto file
whoosh generate client react_spa    # full client app (7 types)
whoosh generate client expo --oauth # with OAuth2 social login

whoosh db migrate                   # run migrations
whoosh db rollback                  # rollback
whoosh db status                    # migration status
```
## AI Agent DX

Every `whoosh new` project includes a `CLAUDE.md` with all framework patterns, commands, and conventions — so AI agents (Claude Code, Cursor, Copilot) can build with Whoosh immediately.

```shell
# Dump your entire app structure as JSON (routes, schemas, config, MCP tools)
whoosh describe

# AI tools can consume this to understand your API
whoosh describe --routes  # routes with request/response schemas
whoosh describe --schemas # all schema definitions

# Catch mistakes before runtime
whoosh check # validates config, auth, dependencies
```
## Performance

Apple Silicon arm64, 12 cores. Ruby 3.4 + YJIT. Full benchmark suite & reproduction steps.

**How to read these numbers.** Benchmarks are selective by nature. A `GET /health` returning `{"status":"ok"}` tests the router + serializer, not your real app. A Postgres read tests one query pattern. We show single-process (per-core) numbers first because that's the fair cross-language comparison. Multi-worker numbers are included for deployment sizing, but scaling strategies differ per runtime (Node uses `cluster`, Python uses multiple workers, Ruby uses workers × threads or fibers) and mixing them isn't apples-to-apples.
### HTTP micro-benchmark — `GET /health`

Single process (per-core, fair comparison):

| Framework | Language | Server | Req/sec |
|---|---|---|---|
| Fastify | Node.js 22 | built-in | 69,200 |
| Whoosh | Ruby 3.4 + YJIT | Falcon | 24,400 |
| Whoosh | Ruby 3.4 + YJIT | Puma (5 threads) | 15,500 |
| FastAPI | Python 3.13 | uvicorn | 8,900 |
| Sinatra | Ruby 3.4 | Puma (5 threads) | 7,100 |

On this microbenchmark, Fastify is ~2.8× Whoosh+Falcon per-core; that's the honest picture for trivial JSON. Against other Ruby frameworks and against FastAPI on CPython, Whoosh is competitive.
Multi-worker (sizing reference, not apples-to-apples):
| Framework | Server | Req/sec |
|---|---|---|
| Whoosh | Falcon (4 workers) | 87,400 |
| Fastify | built-in (single thread, no cluster) | 69,200 |
| Whoosh | Puma (4w × 4t) | 52,500 |
| Roda | Puma (4w × 4t) | 14,700 |
Fastify was not run under cluster; don't read this table as "Whoosh beats Fastify." Read it as "Whoosh on 4 cores handles ~87K req/s on trivial JSON."
### Real-world benchmark — `GET /users/:id` from PostgreSQL (1000-row table)

Single process (per-core):
| Framework | Req/sec |
|---|---|
| Fastify + pg | 36,900 |
| Whoosh + Falcon (fiber PG pool) | 13,400 |
| Whoosh + Puma (Sequel) | 8,600 |
| Roda + Puma | 6,700 |
| Sinatra + Puma | 4,400 |
| FastAPI + uvicorn | 2,400 |
On realistic DB-bound work the per-core picture barely changes at the top: Fastify stays ~2.75× ahead of Whoosh's fiber-aware PG pool (vs ~2.8× on trivial JSON), while Whoosh keeps a wide lead over FastAPI on CPython and over the other Ruby frameworks.
Multi-worker (sizing reference):
| Framework | Req/sec |
|---|---|
| Whoosh + Falcon (4 workers, fiber PG pool) | 45,900 |
| Fastify (single thread) | 36,900 |
### Micro-benchmarks
| Component | Throughput |
|---|---|
| Router lookup (static, cached) | 6.1M ops/s |
| JSON encode (Oj) | 5.4M ops/s |
| Framework overhead | ~2.5µs per request |
Optimizations: YJIT auto-enabled, Oj JSON auto-detected, O(1) static route cache, compiled middleware chain, pre-frozen headers.
## Configuration

```yaml
# config/app.yml
app:
  name: My API
  port: 9292

database:
  url: <%= ENV.fetch("DATABASE_URL", "sqlite://db/dev.sqlite3") %>
  max_connections: 10

cache:
  store: memory # memory | redis
  default_ttl: 300

jobs:
  backend: memory # memory | database | redis
  workers: 2

logging:
  level: info
  format: json

docs:
  enabled: true
```

`.env` files are loaded automatically (dotenv-compatible).
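The `<%= ENV.fetch(...) %>` line suggests the config file is rendered with ERB before YAML parsing, which is the standard Ruby pattern. A sketch of that two-step load (an assumption about Whoosh's loader, not its actual code):

```ruby
require "erb"
require "yaml"

# Stand-in for the contents of config/app.yml.
raw = <<~YML
  database:
    url: <%= ENV.fetch("DATABASE_URL", "sqlite://db/dev.sqlite3") %>
    max_connections: 10
YML

# Step 1: run ERB, so ENV lookups and defaults are resolved.
# Step 2: parse the resulting plain YAML.
config = YAML.safe_load(ERB.new(raw).result)
config["database"]["max_connections"] # => 10
```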
## Testing

```ruby
require "whoosh/test"

RSpec.describe "My API" do
  include Whoosh::Test

  def app = MyApp.to_rack

  it "creates a user" do
    post_json "/users", { name: "Alice", email: "a@b.com" }
    assert_response 200
    assert_json(name: "Alice")
  end

  it "requires auth" do
    get "/protected"
    assert_response 401
  end

  it "works with auth" do
    get_with_auth "/protected", key: "sk-test"
    assert_response 200
  end
end
```
## License

MIT — see LICENSE.
## Contributing

See CONTRIBUTING.md.