sql-chatbot-rails
AI chatbot for any Rails app — auto-discovers schema, indexes code, executes SQL, streams answers via chat widget.
Zero configuration required. Drop the gem in, mount the engine, embed the widget script tag. The chatbot reads your PostgreSQL schema, indexes app/ and lib/, detects routes, and answers natural-language questions about your data.
Quick Start
# Gemfile
gem 'sql-chatbot-rails', '~> 1.0', require: 'sql_chatbot_rails'
bundle install
rails generate sql_chatbot:install
The generator writes config/initializers/sql_chatbot.rb and adds mount SqlChatbot::Engine, at: "/chatbot" to your routes.
Set your API key, then add the widget script tag to app/views/layouts/application.html.erb:
<%= javascript_include_tag "/chatbot/widget.js" %>
Boot the app — a chat bubble appears in the bottom-right corner.
Configuration
config/initializers/sql_chatbot.rb:
SqlChatbot.configure do |c|
# === LLM ===
c.llm_provider = "openai" # or "groq", "ollama"
c.llm_api_key = ENV["OPENAI_API_KEY"]
# c.llm_model = "gpt-4o-mini"
# c.llm_base_url = "https://api.openai.com/v1"
# === Auth ===
c.secret = ENV["CHATBOT_SECRET"] # cookie + Bearer auth
# === Code indexing ===
c.code_paths = ["./app", "./lib"]
# === Domain hints ===
# c.custom_context = "status=3 means Deleted; always exclude deleted rows in lists"
# === Row-level filters injected into every grammar SELECT ===
# c.default_filters = { "*.status" => "!= 3" }
# === Custom entity aliases ===
# c.aliases = { "customer" => "account_user" }
# === Cross-origin (separate frontend repo) ===
# c.allowed_origins = ["https://app.example.com"]
# c.token_lifetime = 900 # JWT TTL in seconds (default: 15 min)
# c.token_secret = ENV["CHATBOT_TOKEN_SECRET"] # auto-generated if nil
# === Grammar engine (deterministic SQL path) ===
# c.grammar_enabled = true
# c.grammar_confidence_threshold = 0.7
end
All configuration keys
| Key | Type | Default | Description |
|---|---|---|---|
llm_provider |
String | "openai" |
"openai", "groq", "ollama" |
llm_api_key |
String | nil | Falls back to LLM_API_KEY / OPENAI_API_KEY / GROQ_API_KEY env |
llm_model |
String | provider preset | gpt-4o-mini, llama-3.3-70b-versatile, llama3.1:8b |
llm_base_url |
String | provider preset | OpenAI-compatible base URL |
secret |
String | nil | Bearer-token / cookie auth secret |
code_paths |
Array<String> | ["./app"] |
Directories to index |
custom_context |
String | nil | Free-form domain notes injected into the SQL prompt |
default_filters |
Hash | {} |
{ "table.column" => "SQL fragment" } injected into every grammar SELECT |
aliases |
Hash | {} |
{ "user-word" => "entity-name" } overrides auto-detected aliases |
allowed_origins |
Array<String> | nil | Cross-origin CORS allowlist (returned + JWT-protected) |
token_lifetime |
Integer | 900 |
JWT TTL in seconds for cross-origin |
token_secret |
String | random | JWT signing secret (set explicitly across replicas) |
grammar_enabled |
Bool | true |
Toggle the deterministic-SQL fast path |
grammar_confidence_threshold |
Float | 0.7 |
Minimum grammar match score before falling back to LLM |
Environment variables
| Var | Purpose |
|---|---|
OPENAI_API_KEY / GROQ_API_KEY / LLM_API_KEY |
API key |
LLM_BASE_URL / LLM_MODEL |
Provider overrides |
CHATBOT_SECRET |
Auth secret |
CHATBOT_TOKEN_SECRET |
JWT signing secret for cross-origin |
Setup by architecture
A. Rails monolith (Rails serves the HTML)
The default. Follow Quick Start above. The widget loads from the same origin as the page, cookie auth works without extra config.
B. Rails API + JS SPA on a different origin
Use cross-origin mode. The widget exchanges a same-origin token from your SPA's API gateway for a short-lived JWT, no third-party cookies needed.
SqlChatbot.configure do |c|
c.llm_api_key = ENV["OPENAI_API_KEY"]
c.secret = ENV["CHATBOT_SECRET"]
c.allowed_origins = ["https://app.example.com"]
c.token_lifetime = 900
c.code_paths = ["./app", "./lib"]
end
In the SPA, load the widget from the API origin:
<script src="https://api.example.com/chatbot/widget.js"></script>
C. Microservices (multiple Rails APIs, one DB each)
Mount the engine in each service. Each instance introspects its own database; the widget script tag in any front-end can target whichever API it asks questions of. Use the same token_secret if you want cross-replica JWT validation.
Domain hints (when auto-detection isn't enough)
custom_context — free-form natural language
c.custom_context = <<~CTX
status=3 means Deleted, always exclude in lists.
is_admin true means staff, false means customer.
invoices.amount is in cents, divide by 100 for dollars.
CTX
The string is injected verbatim into the SQL-generation prompt. Use this for quirks that the schema can't express on its own.
default_filters — structured row-level convention
Applies to every grammar-generated SELECT:
c.default_filters = {
"*.status" => "!= 3", # any table with a status column
"users.deleted_at" => "IS NULL", # specific column
"invoices.cancelled_at" => "IS NULL",
}
Keys are "table.column" or "*.column". Values are SQL fragments placed after the qualified column reference.
aliases — map user words to entity names
When users say a word your tables don't use:
c.aliases = {
"customer" => "account_user", # Saleor calls them account_user
"customers" => "account_user",
"agent" => "user", # Chatwoot agents are users with a role
"agents" => "user",
}
Always wins over auto-detected aliases on conflict.
Code indexing
Powers route detection (for navigation answers) and enum/constant discovery (Rails enum, Active Storage variants, scopes, business-rule constants).
| Field | Default |
|---|---|
code_paths |
["./app"] |
Recommended for Rails: ["./app", "./lib", "./config"].
The indexer skips vendor, node_modules, tmp, log, .git. Maximum 2000 files.
Re-index after deploys: POST /chatbot/api/refresh.
Schema introspection
Auto-detects on boot:
- Soft-delete columns —
deleted_at,discarded_at,archived_at,removed_at(Paranoia, Discard) - Polymorphic associations —
*_type VARCHAR+*_id BIGINTpairs - Lookup tables — small (<50 row) tables referenced by FK
- PG enum types — native
CREATE TYPE … AS ENUM - CHECK-constraint enums —
CHECK(col IN (…))patterns
Rails-level enum declarations are picked up by the code indexer.
Authentication
When c.secret is set:
- The widget bundle (
/chatbot/widget.js) sets achatbot_tokencookie on load. - API endpoints (
/chatbot/api/ask,/chatbot/api/refresh) require the cookie orAuthorization: Bearer <secret>. /chatbot/api/healthstays open.
Without c.secret, the chatbot API is open to anyone — fine for development, not for production.
API endpoints
When mounted at /chatbot:
| Endpoint | Method | Description |
|---|---|---|
/chatbot/widget.js |
GET | Serves the widget bundle |
/chatbot/api/ask |
POST | SSE-streamed chat endpoint |
/chatbot/api/health |
GET | { status, tables, code_files } |
/chatbot/api/refresh |
POST | Re-introspect schema + re-index code |
Security
- Every query runs inside
BEGIN; SET TRANSACTION READ ONLY; …— writes are physically impossible. DROP,DELETE,TRUNCATE,ALTER,INSERT,UPDATE,GRANT,CREATE,COMMENTblocked at parse time.pg_catalogandinformation_schemablocked from generated queries.- Sensitive columns (
password,secret,api_key,ssn,credit_card, etc.) filtered out before the schema is sent to the LLM.
Requirements
License
MIT