llmdb
A Ruby gem for talking to your PostgreSQL, MySQL, or Oracle database in plain language. The agent discovers the schema on demand, writes its own SQL, and executes it safely under a configurable permission model. Works with local LLMs (LM Studio, Ollama) and cloud providers (Anthropic, OpenAI, Gemini, etc.) through ruby_llm.
LLMDB.configure do |c|
c.adapter, c.database, c.username, c.password = :postgresql, "shop", "ro_user", ENV["PG_PASS"]
c.llm_provider = :anthropic
c.llm_model = "claude-opus-4-7"
c.llm_api_key = ENV["ANTHROPIC_API_KEY"]
end
session = LLMDB::Session.new
response = session.ask("How many orders did our top 5 customers place last month?")
puts response.content
Why this gem
The Ruby ecosystem already has Boxcars and Langchain.rb, both of which can talk to a database. llmdb is narrower and makes different trade-offs:
- No ActiveRecord required. Schema and foreign-key relationships are read directly from the database via Sequel — you can point it at any database, including ones owned by other teams or other languages.
- Schema is discovered, not declared. The agent uses tool calls (
list_tables,describe_table,execute_query) to learn the schema as needed. No glossary, no preloaded prompt, no manual mapping. What it learns persists in the conversation context. - Multi-layered query safety. Three permission modes with real teeth —
:read_onlyis enforced at the database engine level (SET TRANSACTION READ ONLY/PRAGMA query_only), not just by checking the first SQL token. - Provider-agnostic LLM. Drop-in compatible with anything
ruby_llmsupports: local models via Ollama or LM Studio, or cloud providers like Anthropic, OpenAI, Gemini.
If you want a full agent framework with vector search, RAG, and multi-step planning — use Boxcars or Langchain.rb. If you just want a focused, local-first natural-language SQL agent for one of three databases, this gem fits.
Installation
Add to your Gemfile:
gem "llmdb"
…and add the database driver you actually use (none are required by default):
gem "pg", ">= 1.0" # PostgreSQL
gem "mysql2", ">= 0.5" # MySQL
gem "ruby-oci8", ">= 2.2" # Oracle (also requires Oracle Instant Client)
Then bundle install.
Quick start
require "llmdb"
LLMDB.configure do |c|
# Database
c.adapter = :postgresql # :postgresql | :mysql | :oracle
c.host = "localhost"
c.port = 5432
c.database = "shop"
c.username = "ro_user"
c.password = ENV["PG_PASS"]
# LLM
c.llm_provider = :ollama # :ollama, :openai, :anthropic, :gemini, ...
c.llm_model = "llama3.2"
c.llm_api_base = "http://localhost:11434"
c.llm_assume_model_exists = true # local/custom models not in RubyLLM's registry
end
session = LLMDB::Session.new
session.ask("Which products had the highest revenue in Q1?")
LLMDB::Session.new accepts:
- nothing — uses the global
LLMDB.configuration - a
Hashof options — built into a freshConfiguration - a
LLMDB::Configurationinstance — used directly
LLM provider examples
LM Studio (OpenAI-compatible local server)
LLMDB.configure do |c|
# ...DB config...
c.llm_provider = :openai
c.llm_api_base = "http://localhost:1234/v1"
c.llm_api_key = "lm-studio" # any non-empty string; LM Studio ignores it
c.llm_model = "qwen2.5-coder-32b-instruct"
c.llm_assume_model_exists = true
end
Ollama
LLMDB.configure do |c|
# ...DB config...
c.llm_provider = :ollama
c.llm_api_base = "http://localhost:11434"
c.llm_model = "llama3.2"
c.llm_assume_model_exists = true
end
Anthropic
LLMDB.configure do |c|
# ...DB config...
c.llm_provider = :anthropic
c.llm_api_key = ENV["ANTHROPIC_API_KEY"]
c.llm_model = "claude-opus-4-7"
end
OpenAI
LLMDB.configure do |c|
# ...DB config...
c.llm_provider = :openai
c.llm_api_key = ENV["OPENAI_API_KEY"]
c.llm_model = "gpt-5"
end
If you'd rather configure RubyLLM yourself (Rails initializer, etc.) and skip the provider plumbing, omit llm_api_base and llm_api_key from the gem config — the gem will not touch RubyLLM.configure and will defer to whatever you set globally.
Permission modes
Three modes for query execution safety:
| Mode | SELECT/WITH/EXPLAIN | INSERT/UPDATE/DELETE/DDL |
|---|---|---|
:read_only (default) |
run | blocked |
:ask |
run | confirmation required |
:full |
run | run |
LLMDB.configure { |c| c. = :ask }
:read_only — defense in depth
Two independent layers protect against writes:
- First-token whitelist. Anything that doesn't start with
SELECT,WITH, orEXPLAINfails fast with a clearSafetyErrorbefore ever reaching the database. - DB-engine enforcement. Even if the token check is bypassed (e.g. a
WITH x AS (DELETE FROM ...) SELECT ...CTE-DML in PostgreSQL, or a side-effect function call), the query runs inside aSET TRANSACTION READ ONLYtransaction (PG/MySQL/Oracle) or aPRAGMA query_only = ONconnection (SQLite). The DB engine itself refuses any write.
This means :read_only is genuinely safe — not just heuristically safe.
For belt-and-suspenders security, also use a read-only database role at the credential level. The gem can't undo what your DB user is allowed to do.
:ask — LLM-judged confirmation
Each query is classified by a stateless LLM judge (LLMDB::WriteClassifier) running in a fresh chat context with one job: decide whether the SQL writes. This catches CTE-wrapped DML, side-effect functions, and other constructs a first-token check would miss, without the cost of a database round-trip per query.
If the judge says "write", the configured confirm_callback is invoked. The default is an interactive TTY prompt:
[LLMDB] Pending non-read-only query:
| UPDATE products SET price = price * 1.1 WHERE category = 'electronics'
[LLMDB] Execute? [y/N]
Override the confirmation flow for non-interactive contexts (web app, job queue, Slack bot, etc.):
LLMDB.configure do |c|
c. = :ask
c.confirm_callback = ->(sql) { MyApprovalService.ask(sql, user: current_user) }
end
The judge defaults to the same model as the agent — fine for local LLMs where calls are essentially free, but you may prefer a smaller dedicated classifier when using a large cloud model:
LLMDB.configure do |c|
# ...main config using claude-opus-4-7...
classifier_config = LLMDB::Configuration.new(
adapter: c.adapter, database: c.database, username: c.username,
llm_provider: :anthropic, llm_model: "claude-haiku-4-5" # smaller, cheaper, faster
)
c.write_classifier = LLMDB::WriteClassifier.new(classifier_config)
end
Or supply any callable that takes a SQL string and returns a boolean — useful for plugging in a deterministic SQL parser like pg_query:
require "pg_query"
c.write_classifier = lambda do |sql|
PgQuery.parse(sql).tree.stmts.any? { |s| s.stmt.node != :select_stmt }
end
The classifier fails closed: if the LLM call errors or times out, write? returns true so the user is asked rather than the query running silently.
:full — no restrictions
Anything goes. Use at your own risk, ideally with a database user whose grants already constrain what the agent can do.
System prompts
Two slots, layered:
LLMDB.configure do |c|
# Optional — replaces the gem's built-in tool/schema-discovery instructions.
# Most users never set this.
c.default_system_prompt = nil
# The prompt you'll typically set: project-specific context appended on top
# of the default. Domain glossary, table conventions, language preference.
c.system_prompt = <<~PROMPT
Business glossary:
- "customer" in business sense = `accounts` table (not `users`)
- tables prefixed `tmp_` are ETL buffers — ignore them
- always filter `accounts.deleted_at IS NULL` (soft delete)
Respond concisely in English.
PROMPT
end
Both are passed to RubyLLM as separate system messages (the second via with_instructions(prompt, append: true)), so the model sees them as distinct layers.
How it works
┌──────────┐ natural-language question ┌────────────────┐
│ user │────────────────────────────▶│ LLMDB::Session│
└──────────┘ └────────┬────────┘
│
┌───────────────▼───────────────┐
│ RubyLLM::Chat │
│ + system prompts │
│ + tools: list_tables, │
│ describe_table, │
│ execute_query │
└───────────────┬───────────────┘
│ tool calls
┌───────────────▼───────────────┐
│ LLMDB::Connection │
│ → permission_mode dispatch │
│ → confirm_callback │
│ → write_classifier (in :ask) │
└───────────────┬───────────────┘
│
┌───────────────▼───────────────┐
│ LLMDB::Adapters::* │
│ (Sequel-backed: PG/MySQL/ │
│ Oracle) │
└───────────────────────────────┘
The agent has no advance knowledge of the database. On each new question:
- It calls
list_tablesto see what's available. - It calls
describe_tablefor tables it intends to query, getting columns, types, and foreign-key relationships. - It writes an SQL query and submits it to
execute_query. - It reasons about the result and answers the user.
Within a single conversation, schema knowledge accumulates in the chat history, so the agent doesn't re-discover tables it has already seen.
Configuration reference
LLMDB.configure do |c|
# --- Database ---
c.adapter = :postgresql # :postgresql | :mysql | :oracle
c.host = "localhost"
c.port = 5432 # default depends on adapter
c.database = "mydb"
c.username = "user"
c.password = "secret" # nil for trust auth or socket connections
# --- LLM (provider-agnostic) ---
c.llm_provider = :openai # any provider RubyLLM supports
c.llm_model = "gpt-5"
c.llm_api_base = nil # custom endpoint (LM Studio, vLLM, LiteLLM, ...)
c.llm_api_key = ENV["OPENAI_API_KEY"]
c.llm_assume_model_exists = false # set true for models outside RubyLLM's registry
# --- Safety ---
c. = :read_only # :read_only | :ask | :full
c.confirm_callback = nil # callable(sql) -> bool, used in :ask mode
# default: interactive TTY prompt
c.write_classifier = nil # callable(sql) -> bool, used in :ask mode
# default: WriteClassifier (LLM-judge)
c.max_rows = 500 # cap on rows returned to the LLM per query
# --- Prompts ---
c.default_system_prompt = nil # replaces built-in agent instructions
c.system_prompt = nil # appended on top of the default
end
Architecture notes
- Top-level constant is
LLMDB(treating "DB" as an initialism). The gem registers a Zeitwerk inflector to make this work with thellmdb.rbfilename. - Errors are loaded eagerly.
lib/llmdb/errors.rbdefines several constants (Error,ConfigurationError,SafetyError,QueryError,ConnectionError) directly underLLMDB, which Zeitwerk can't autoload by convention — the file isloader.ignored and required up front inlib/llmdb.rb. - No mutation of RubyLLM global config unless you ask for it.
apply_llm_credentialsis skipped entirely when bothllm_api_baseandllm_api_keyare nil — useful when you configure RubyLLM elsewhere (e.g. a Rails initializer). - Tools are registered as instances, not classes. Each tool takes the
Connectionas a constructor argument;Sessionwires them up viachat.with_tools(...).
Limitations
- The DB-level read-only enforcement is per-connection (SQLite) or per-transaction (PG/MySQL/Oracle). Other backends fall back to the first-token whitelist with a
STDERRwarning so the gap is visible in CI. - The LLM-judge classifier in
:askmode is non-deterministic. It's meaningfully more robust than a token check (it understands CTE-wrapped DML, side-effect functions, etc.), but ~99% accuracy on a small local model is still ~99%, not 100%. For zero-tolerance environments, use:read_onlyplus a read-only database role. mysql2does not currently compile on Ruby 4.0 (as of writing). Use Ruby 3.2–3.3 for MySQL until upstream catches up.- Oracle support requires
ruby-oci8and the Oracle Instant Client to be installed on the host.
Development
bundle install
bundle exec rspec # 80+ examples, no DB required for unit tests
bundle exec rubocop
The test suite uses an in-memory SQLite database to exercise the adapter base class. Real PostgreSQL/MySQL/Oracle integration tests are not part of the default suite.
License
MIT.