Whoosh — AI-First Ruby API Framework

The fastest way to ship a production MCP server in Ruby.
A FastAPI-style framework with MCP, schema validation, auth, streaming, and OpenAPI — built in.



Why Whoosh?

  • MCP-native — opt routes into MCP with mcp: true and they become typed tools over stdio/SSE. No glue code, no separate server.
  • FastAPI-style DSL in Ruby — declarative schemas, typed request/response, auto-generated OpenAPI + Swagger UI, dependency injection.
  • Batteries included — auth (JWT, API key, OAuth), rate limiting, caching, background jobs, file uploads, vector search, streaming, pagination.
  • Agent-friendly — whoosh describe emits a JSON snapshot of your app; a generated CLAUDE.md teaches coding agents its conventions; whoosh check validates config before runtime.
  • Competitive Ruby performance — YJIT + Falcon fibers + Oj, ~2.5µs framework overhead. See Performance for honest, per-core comparisons.

When NOT to use Whoosh

Whoosh is on 1.x but still evolving: it is solo-maintained, has no production track record yet, and breaking changes ship occasionally (always called out in CHANGELOG.md). Reach for something else when:

  • You need a managed backend. Supabase, PocketBase, or Firebase give you DB + auth + realtime without hosting a framework. Whoosh is the app layer — use it with a managed DB if that fits.
  • You want maximum ecosystem depth. Rails has more gems; FastAPI has the Python ML/AI library ecosystem (PyTorch, transformers, LangChain). If your core workload lives in those libraries, stay where they are.
  • You need a frozen API surface. Being on 1.x doesn't mean the API is locked — breaking changes still ship when the design calls for it. If you need strict stability contracts today, wait a few releases.
  • Your team has no Ruby experience and the project isn't specifically about AI/MCP. Hiring and ecosystem gravity usually beat framework features.

Whoosh's sweet spot: Ruby shops (or Ruby-curious teams) building AI / LLM / MCP-backed APIs who want typed schemas, OpenAPI, and MCP without wiring three libraries together.

Install

gem install whoosh
whoosh new my_api
cd my_api
whoosh s

Open http://localhost:9292/docs for Swagger UI.

Quick Start

# app.rb
require "whoosh"

app = Whoosh::App.new

app.get "/health" do
  { status: "ok", version: Whoosh::VERSION }
end

class ChatRequest < Whoosh::Schema
  field :message, String, required: true
end

app.post "/chat", request: ChatRequest, mcp: true do |req, llm:|
  stream_llm do |out|
    llm.chat(req.body[:message]).each_chunk { |c| out << c }
    out.finish
  end
end

whoosh s              # Start server
whoosh s --reload     # Auto-reload on file changes
whoosh s -p 3000      # Custom port

Features

Routing

# Inline
app.get("/users/:id") { |req| { id: req.params[:id] } }

# Class-based
class ChatEndpoint < Whoosh::Endpoint
  post "/chat", request: ChatRequest, mcp: true

  def call(req)
    { reply: "Hello!" }
  end
end
app.load_endpoints("endpoints/")

# Groups with shared middleware
app.group "/api/v1", mcp: true do
  get("/status") { { ok: true } }
  post("/analyze", auth: :api_key) { |req| analyze(req) }
end

Schema Validation

class CreateUserRequest < Whoosh::Schema
  field :name,  String,  required: true, desc: "User name"
  field :email, String,  required: true, desc: "Email address"
  field :age,   Integer, min: 0, max: 150
  field :role,  String,  default: "user"
end

# Returns 422 with field-level errors on invalid input
app.post "/users", request: CreateUserRequest do |req|
  { name: req.body[:name], created: true }
end
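Under the hood this is ordinary per-field validation. A plain-Ruby sketch of the idea (the FIELDS table mirrors CreateUserRequest above, but the helper and the exact 422 body shape are illustrative assumptions, not Whoosh internals):

```ruby
# Illustrative per-field validation producing a 422-style error map.
# The exact error body Whoosh returns may differ (assumed shape).
FIELDS = {
  name:  { type: String,  required: true },
  email: { type: String,  required: true },
  age:   { type: Integer, min: 0, max: 150 }
}.freeze

def validate(body)
  errors = {}
  FIELDS.each do |field, rules|
    value = body[field]
    if value.nil?
      errors[field] = "is required" if rules[:required]
    elsif !value.is_a?(rules[:type])
      errors[field] = "must be a #{rules[:type]}"
    elsif rules[:min] && value < rules[:min]
      errors[field] = "must be >= #{rules[:min]}"
    elsif rules[:max] && value > rules[:max]
      errors[field] = "must be <= #{rules[:max]}"
    end
  end
  errors.empty? ? [200, body] : [422, { errors: errors }]
end

validate(name: "Alice", age: -1)
# => [422, { errors: { email: "is required", age: "must be >= 0" } }]
```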

Authentication & Security

app.auth do
  api_key header: "X-Api-Key", keys: {
    "sk-prod-123" => { role: :premium },
    "sk-free-456" => { role: :free }
  }
  jwt secret: ENV["JWT_SECRET"], algorithm: :hs256
end

app.rate_limit do
  default limit: 60, period: 60
  rule "/chat", limit: 10, period: 60
  tier :free,    limit: 100,  period: 3600
  tier :premium, limit: 5000, period: 3600
  on_store_failure :fail_open
end
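The default/rule/tier settings above describe window-based counting. A minimal fixed-window limiter sketch (illustrative only; Whoosh's actual limiter, its storage backend, and the fail-open behavior on store failure are not reproduced here):

```ruby
# Fixed-window rate limiting: bucket requests by (key, window id) and
# reject once the count in the current window exceeds the limit.
class FixedWindow
  def initialize(limit:, period:)
    @limit  = limit
    @period = period
    @counts = Hash.new(0)
  end

  def allow?(key, now: Time.now.to_i)
    window = now / @period                 # integer window id
    (@counts[[key, window]] += 1) <= @limit
  end
end

limiter = FixedWindow.new(limit: 3, period: 60)
4.times.map { limiter.allow?("sk-free-456", now: 100) }
# => [true, true, true, false]
```

A new window starts every period seconds, so a burst straddling a window boundary can briefly exceed the limit; sliding-window or token-bucket variants smooth that out at the cost of more bookkeeping.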

app.access_control do
  role :free,    models: ["claude-haiku"]
  role :premium, models: ["claude-haiku", "claude-sonnet", "claude-opus"]
end

AI / LLM Integration

# Chat with any LLM (auto-detects ruby_llm gem)
app.post "/chat" do |req, llm:|
  { reply: llm.chat(req.body["message"]) }
end

# Structured output — LLM returns validated JSON
app.post "/extract" do |req, llm:|
  data = llm.extract(req.body["text"], schema: InvoiceSchema)
  { invoice: data }
end

# RAG in 3 lines
app.post "/ask" do |req, vectors:, llm:|
  context = vectors.search("knowledge", vector: embed(req.body["q"]), limit: 5)
  { answer: llm.chat(req.body["q"], system: "Context: #{context}") }
end
app.post "/index" do |req, vectors:|
  vectors.insert("docs", id: req.body["id"],
    vector: req.body["embedding"],
    metadata: { title: req.body["title"] })
  { indexed: true }
end

app.post "/search" do |req, vectors:|
  results = vectors.search("docs", vector: req.body["embedding"], limit: 10)
  { results: results }
end

In-memory by default (cosine similarity). Install the zvec gem for a production-grade HNSW index.
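For reference, the default scoring is plain cosine similarity (the normalized dot product of two embedding vectors), which fits in a few lines of Ruby:

```ruby
# Cosine similarity: dot(a, b) / (|a| * |b|). 1.0 = same direction,
# 0.0 = orthogonal, -1.0 = opposite. An HNSW index (zvec) approximates
# the same ranking without comparing against every stored vector.
def cosine_similarity(a, b)
  dot   = a.zip(b).sum { |x, y| x * y }
  mag_a = Math.sqrt(a.sum { |x| x * x })
  mag_b = Math.sqrt(b.sum { |x| x * x })
  dot / (mag_a * mag_b)
end

cosine_similarity([1.0, 0.0], [1.0, 0.0])  # => 1.0
cosine_similarity([1.0, 0.0], [0.0, 1.0])  # => 0.0
```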

LLM Streaming (OpenAI-compatible)

app.post "/chat/stream", auth: :api_key do |req|
  stream_llm do |out|
    # True chunked streaming via SizedQueue — tokens flow in real-time
    out << "Hello "
    out << "World!"
    out.finish  # sends data: [DONE]
  end
end

# SSE events
app.get "/events" do
  stream :sse do |out|
    out.event("status", { connected: true })
    out << { data: "hello" }
  end
end

MCP (Model Context Protocol)

Routes are exposed as MCP tools only when you opt in with mcp: true. This prevents internal or admin endpoints from being callable as tools by accident.

# Opt in per route:
app.post "/summarize", request: SummarizeRequest, mcp: true do |req|
  { summary: llm.summarize(req.body[:text]) }
end

# Or opt in a whole group:
app.group "/tools", mcp: true do
  post "/translate" do |req|
    { result: translate(req.body["text"]) }
  end
end

# Default: not exposed as an MCP tool.
app.get "/internal" do
  { debug: "not exposed" }
end
whoosh mcp              # stdio transport (Claude Desktop, Cursor)
whoosh mcp --list       # list registered MCP tools

Background Jobs

class AnalyzeJob < Whoosh::Job
  inject :db, :llm  # dependency injection

  def perform(document_id:)
    doc = db[:documents].where(id: document_id).first
    result = llm.complete("Analyze: #{doc[:text]}")
    db[:documents].where(id: document_id).update(analysis: result)
    { analyzed: true }
  end
end

# Fire and forget
app.post "/analyze" do |req|
  job_id = AnalyzeJob.perform_async(document_id: req.body["id"])
  { job_id: job_id }
end

# Check status
app.get "/jobs/:id" do |req|
  job = Whoosh::Jobs.find(req.params[:id])
  { status: job[:status], result: job[:result] }
end
whoosh worker           # dedicated worker process
whoosh worker -c 4      # 4 threads

File Upload

app.post "/upload" do |req|
  file = req.files["document"]

  file.filename      # => "report.pdf"
  file.content_type  # => "application/pdf"
  file.size          # => 245760
  file.read_text     # => UTF-8 string (for RAG)
  file.to_base64     # => base64 (for vision APIs)
  file.validate!(types: ["application/pdf"], max_size: 10_000_000)

  path = file.save("documents")
  { path: path }
end

Cache

app.get "/users/:id" do |req, cache:|
  cache.fetch("user:#{req.params[:id]}", ttl: 60) do
    db[:users].where(id: req.params[:id]).first
  end
end
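Conceptually, cache.fetch is read-through caching with a TTL: return the stored value if it has not expired, otherwise run the block and store the result. A minimal in-memory sketch (not Whoosh's implementation):

```ruby
# Read-through TTL cache: serve the stored value until it expires,
# then recompute via the block. The now: parameter exists only to make
# expiry testable; a real store would use the clock (or Redis TTLs).
class TtlCache
  def initialize
    @store = {}
  end

  def fetch(key, ttl:, now: Time.now.to_f)
    entry = @store[key]
    return entry[:value] if entry && now < entry[:expires_at]
    value = yield                                 # miss or expired
    @store[key] = { value: value, expires_at: now + ttl }
    value
  end
end

cache = TtlCache.new
cache.fetch("user:1", ttl: 60, now: 0.0)  { "fresh" }  # => "fresh" (computed)
cache.fetch("user:1", ttl: 60, now: 30.0) { "stale" }  # => "fresh" (cached)
cache.fetch("user:1", ttl: 60, now: 61.0) { "stale" }  # => "stale" (expired)
```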

Pagination

# Offset-based
app.get "/users" do |req|
  paginate(db[:users].order(:id),
    page: req.query_params["page"], per_page: 20)
end

# Cursor-based (recommended for large datasets)
app.get "/messages" do |req|
  paginate_cursor(db[:messages].order(:id),
    cursor: req.query_params["cursor"], limit: 20)
end
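The reason cursor pagination is recommended for large datasets: OFFSET n forces the database to scan and discard n rows, while a cursor (the last-seen id) turns the next page into an indexed range query. A pure-Ruby sketch of the mechanics (this paginate_cursor is an illustrative stand-in, not the framework helper of the same name):

```ruby
# Cursor pagination over an ordered collection: the cursor is the last
# id served, and the next page starts strictly after it. Against a DB
# this becomes WHERE id > cursor ORDER BY id LIMIT n.
ROWS = (1..100).map { |i| { id: i, text: "msg #{i}" } }

def paginate_cursor(rows, cursor: nil, limit: 20)
  start = cursor ? (rows.index { |r| r[:id] > cursor } || rows.length) : 0
  page  = rows[start, limit] || []
  { items: page, next_cursor: page.last && page.last[:id] }
end

page1 = paginate_cursor(ROWS, limit: 3)
page1[:items].map { |r| r[:id] }   # => [1, 2, 3]; next_cursor is 3

page2 = paginate_cursor(ROWS, cursor: page1[:next_cursor], limit: 3)
page2[:items].map { |r| r[:id] }   # => [4, 5, 6]
```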

Plugins (18 AI Gems Auto-Discovered)

# Just add gems to Gemfile — they're auto-discovered from Gemfile.lock
gem "ruby_llm"
gem "lingua-ruby"
gem "ner-ruby"
gem "guardrails-ruby"

# Available as bare method calls in endpoints:
app.post "/analyze" do |req|
  lang     = lingua.detect(req.body["text"])
  entities = ner.recognize(req.body["text"])
  { language: lang, entities: entities }
end

HTTP Client

app.post "/proxy" do |req, http:|
  result = http.post("https://api.example.com/analyze",
    json: req.body,
    headers: { "Authorization" => "Bearer #{ENV["API_KEY"]}" },
    timeout: 30
  )
  result.json  # parsed response
end

Prometheus Metrics

Auto-tracked at /metrics:

whoosh_requests_total{method="GET",path="/health",status="200"} 1234
whoosh_request_duration_seconds_sum{path="/health"} 45.23
whoosh_request_duration_seconds_count{path="/health"} 1234
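Mean latency falls out of the two duration series. In Prometheus itself you would query rate(..._sum[5m]) / rate(..._count[5m]); the plain division below just illustrates the arithmetic on the sample values above:

```ruby
# Mean latency from the histogram's running totals (sample values above).
sum_seconds = 45.23
count       = 1234
avg = sum_seconds / count   # ~0.0367 s, i.e. about 37 ms per request
```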

OpenAPI & Docs

app.openapi do
  title "My AI API"
  version "1.0.0"
end

app.docs enabled: true, redoc: true

  • /docs — Swagger UI
  • /redoc — ReDoc
  • /openapi.json — Machine-readable spec

Client Generator

Generate complete, typed, ready-to-run client apps from your Whoosh API — one command.

whoosh generate client react_spa          # React + Vite + TypeScript
whoosh generate client expo               # Expo + React Native
whoosh generate client ios                # SwiftUI + MVVM
whoosh generate client flutter            # Dart + Riverpod + GoRouter
whoosh generate client htmx               # Plain HTML + htmx, no build step
whoosh generate client telegram_bot       # Ruby Telegram bot
whoosh generate client telegram_mini_app  # React + Telegram WebApp SDK

whoosh generate client react_spa --oauth  # Add Google/GitHub/Apple login

The generator introspects your Whoosh app via OpenAPI — it reads your routes, schemas, and auth config, then produces a typed client with:

  • API client with auth headers and automatic token refresh
  • Model types matching your schemas
  • Auth screens (login, register, logout)
  • CRUD screens for every resource
  • Navigation and routing
  • Starter tests

If no Whoosh app exists yet, it scaffolds a standard backend (JWT auth + tasks CRUD) alongside the client.

| Client | Stack | Token Storage |
|---|---|---|
| react_spa | React 19, Vite, TypeScript, React Router | localStorage |
| expo | Expo SDK 52, Expo Router, TypeScript | SecureStore |
| ios | SwiftUI, async/await, MVVM | Keychain |
| flutter | Dart, Dio, Riverpod, GoRouter | flutter_secure_storage |
| htmx | HTML, htmx 2.x, vanilla JS | localStorage |
| telegram_bot | Ruby, telegram-bot-ruby | In-memory session |
| telegram_mini_app | React, Telegram WebApp SDK | Telegram initData |

Health Checks

app.health_check do
  probe(:database) { db.test_connection }
  probe(:cache)    { cache.get("ping") || true }
end
# GET /healthz → { "status": "ok", "checks": { "database": "ok" } }
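The probe semantics assumed here, sketched in plain Ruby: a probe passes unless it raises or returns a falsy value, and the overall status is "ok" only when every probe passes (the aggregation details are an assumption matching the sample response, not documented framework behavior):

```ruby
# Run each named probe; any exception or falsy return marks it "fail".
# Overall status is "ok" only if every check is "ok".
def run_probes(probes)
  checks = probes.transform_values do |probe|
    begin
      probe.call ? "ok" : "fail"
    rescue StandardError
      "fail"
    end
  end
  { status: checks.values.all?("ok") ? "ok" : "fail", checks: checks }
end

run_probes(database: -> { true }, cache: -> { raise "redis down" })
# => { status: "fail", checks: { database: "ok", cache: "fail" } }
```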

CLI

whoosh new my_api             # scaffold project (with Dockerfile)
whoosh s                      # start server (like rails s)
whoosh s --reload             # hot reload on file changes
whoosh routes                 # list all routes
whoosh describe               # dump app as JSON (AI-friendly)
whoosh check                  # validate config, catch mistakes
whoosh console                # IRB with app loaded
whoosh ci                     # lint + security + audit + tests + coverage
whoosh worker                 # background job worker
whoosh mcp                    # MCP stdio server
whoosh mcp --list             # list all MCP tools

whoosh generate endpoint chat       # endpoint + schema + test
whoosh generate schema User         # schema file
whoosh generate model User name:string email:string
whoosh generate migration add_email_to_users
whoosh generate plugin my_tool      # plugin boilerplate
whoosh generate proto ChatRequest   # .proto file
whoosh generate client react_spa    # full client app (7 types)
whoosh generate client expo --oauth # with OAuth2 social login

whoosh db migrate             # run migrations
whoosh db rollback            # rollback
whoosh db status              # migration status

AI Agent DX

Every whoosh new project includes a CLAUDE.md with all framework patterns, commands, and conventions — so AI agents (Claude Code, Cursor, Copilot) can build with Whoosh immediately.

# Dump your entire app structure as JSON (routes, schemas, config, MCP tools)
whoosh describe

# AI tools can consume this to understand your API
whoosh describe --routes     # routes with request/response schemas
whoosh describe --schemas    # all schema definitions

# Catch mistakes before runtime
whoosh check                 # validates config, auth, dependencies

Performance

Benchmarks: Apple Silicon (arm64, 12 cores), Ruby 3.4 + YJIT. See the repo for the full benchmark suite and reproduction steps.

How to read these numbers. Benchmarks are selective by nature. A GET /health returning {"status":"ok"} tests the router + serializer, not your real app. A Postgres read tests one query pattern. We show single-process (per-core) numbers first because that's the fair cross-language comparison. Multi-worker numbers are included for deployment sizing, but scaling strategies differ per runtime (Node uses cluster, Python uses multiple workers, Ruby uses workers × threads or fibers) and mixing them isn't apples-to-apples.

HTTP micro-benchmark — GET /health

Single process (per-core, fair comparison):

| Framework | Language | Server | Req/sec |
|---|---|---|---|
| Fastify | Node.js 22 | built-in | 69,200 |
| Whoosh | Ruby 3.4 + YJIT | Falcon | 24,400 |
| Whoosh | Ruby 3.4 + YJIT | Puma (5 threads) | 15,500 |
| FastAPI | Python 3.13 | uvicorn | 8,900 |
| Sinatra | Ruby 3.4 | Puma (5 threads) | 7,100 |

On this microbenchmark, Fastify is ~2.8× Whoosh+Falcon per-core; that's the honest picture for trivial JSON. Against other Ruby frameworks and against FastAPI on CPython, Whoosh is competitive.

Multi-worker (sizing reference, not apples-to-apples):

| Framework | Server | Req/sec |
|---|---|---|
| Whoosh | Falcon (4 workers) | 87,400 |
| Fastify | built-in (single thread, no cluster) | 69,200 |
| Whoosh | Puma (4w × 4t) | 52,500 |
| Roda | Puma (4w × 4t) | 14,700 |

Fastify was not run under cluster; don't read this table as "Whoosh beats Fastify." Read it as "Whoosh on 4 cores handles ~87K req/s on trivial JSON."

Real-world benchmark — GET /users/:id from PostgreSQL (1000-row table)

Single process (per-core):

| Framework | Req/sec |
|---|---|
| Fastify + pg | 36,900 |
| Whoosh + Falcon (fiber PG pool) | 13,400 |
| Whoosh + Puma (Sequel) | 8,600 |
| Roda + Puma | 6,700 |
| Sinatra + Puma | 4,400 |
| FastAPI + uvicorn | 2,400 |

On realistic DB-bound work, Whoosh's fiber-aware PG pool closes a lot of the gap vs Fastify (~2.75×) and has a wide lead over FastAPI on CPython and over other Ruby frameworks.

Multi-worker (sizing reference):

| Framework | Req/sec |
|---|---|
| Whoosh + Falcon (4 workers, fiber PG pool) | 45,900 |
| Fastify (single thread) | 36,900 |

Micro-benchmarks

| Component | Throughput |
|---|---|
| Router lookup (static, cached) | 6.1M ops/s |
| JSON encode (Oj) | 5.4M ops/s |
| Framework overhead | ~2.5µs per request |

Optimizations: YJIT auto-enabled, Oj JSON auto-detected, O(1) static route cache, compiled middleware chain, pre-frozen headers.
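To put the ~2.5µs overhead figure in perspective, two back-of-envelope numbers (derived from the tables above, nothing new measured):

```ruby
# 2.5µs of framework time caps framework-only throughput per core:
overhead_s = 2.5e-6
ceiling    = 1.0 / overhead_s     # ~400,000 req/s

# At the measured 24,400 req/s (Falcon, GET /health), framework code
# occupies roughly 24,400 * 2.5µs of every second of wall time:
share = 24_400 * overhead_s       # ~0.061, about 6% of the time
```

That remaining ~94% of per-request cost sits in the server, parsing, and I/O, which suggests why the server choice (Falcon vs Puma) moves the benchmark numbers more than the framework itself does.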

Configuration

# config/app.yml
app:
  name: My API
  port: 9292

database:
  url: <%= ENV.fetch("DATABASE_URL", "sqlite://db/dev.sqlite3") %>
  max_connections: 10

cache:
  store: memory    # memory | redis
  default_ttl: 300

jobs:
  backend: memory  # memory | database | redis
  workers: 2

logging:
  level: info
  format: json

docs:
  enabled: true

.env files loaded automatically (dotenv-compatible).

Testing

require "whoosh/test"

RSpec.describe "My API" do
  include Whoosh::Test

  def app = MyApp.to_rack

  it "creates a user" do
    post_json "/users", { name: "Alice", email: "a@b.com" }
    assert_response 200
    assert_json(name: "Alice")
  end

  it "requires auth" do
    get "/protected"
    assert_response 401
  end

  it "works with auth" do
    get_with_auth "/protected", key: "sk-test"
    assert_response 200
  end
end

License

MIT — see LICENSE.

Contributing

See CONTRIBUTING.md.