Class: RailsAiContext::Tools::Query
- Defined in:
- lib/rails_ai_context/tools/query.rb
Defined Under Namespace
Classes: ResultProxy
Constant Summary collapse
- BLOCKED_KEYWORDS =
── Layer 1: SQL validation ─────────────────────────────────────
/\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|CREATE|GRANT|REVOKE|SET|COPY|MERGE|REPLACE)\b/i- BLOCKED_CLAUSES =
/\bFOR\s+(UPDATE|SHARE|NO\s+KEY\s+UPDATE)\b/i- BLOCKED_SHOWS =
/\bSHOW\s+(GRANTS|PROCESSLIST|BINLOG|SLAVE|MASTER|REPLICAS)\b/i- SELECT_INTO =
/\bSELECT\b[^;]*\bINTO\b/i- MULTI_STATEMENT =
/;\s*\S/- ALLOWED_PREFIX =
/\A\s*(SELECT|WITH|SHOW|EXPLAIN|DESCRIBE|DESC)\b/i- BLOCKED_FUNCTIONS =
SELECT-callable functions that give the caller a filesystem / network exfiltration primitive even though the query is technically a SELECT. These pass ‘SET TRANSACTION READ ONLY` because they’re reads from the DB engine’s perspective, but they bypass the gem’s ‘sensitive_patterns` file allowlist entirely by pivoting through the database process.
Postgres: pg_read_file / pg_read_binary_file / pg_ls_dir / pg_stat_file
(file read), lo_import / lo_export (large-object I/O), dblink* (cross-db exfiltration), COPY ... FROM/TO PROGRAM (arbitrary command execution when COPY permissions permit).MySQL: LOAD_FILE (scalar file read outside SELECT INTO contexts). SQLite: load_extension (shared-library load — disabled by default
but harden in defense). /\b( pg_read_binary_file | pg_read_file | pg_ls_dir | pg_ls_logdir | pg_ls_tmpdir | pg_ls_waldir | pg_ls_archive_statusdir | pg_stat_file | pg_file_settings | pg_current_logfile | lo_import | lo_export | dblink[a-z_]* | LOAD\s+DATA | load_file | load_extension )\b/ix- BLOCKED_OUTPUT =
MySQL ‘SELECT … INTO OUTFILE ’path’‘ / `INTO DUMPFILE ’path’‘ are caught by SELECT_INTO already, but make an explicit pattern so the error message is accurate.
/\bINTO\s+(OUTFILE|DUMPFILE)\b/i- SENSITIVE_COLUMN_SUFFIXES =
Defense against the column-aliasing redaction bypass:
SELECT password_digest AS x FROM users -- bypasses result.columns redaction SELECT substring(password_digest, 1, 60) ... -- column name becomes "substring" SELECT md5(session_data) FROM sessions -- column name becomes "md5"Post-execution redaction operates on the column names the DB returns, which the caller controls via aliases and expressions. The only defense that works is to reject queries that TEXTUALLY reference any sensitive column, before execution. Users who need to query non-sensitive columns with a similar name can subtract from ‘config.query_redacted_columns` in an initializer.
%w[ password_digest password_hash encrypted_password password_reset_token confirmation_token unlock_token remember_token reset_password_token api_key api_secret access_token refresh_token jti otp_secret session_data secret_key secret private_key ].freeze
- TAUTOLOGY_PATTERNS =
SQL injection tautology patterns: OR 1=1, OR true, OR ”=”, UNION SELECT, etc.
[ /\bOR\s+1\s*=\s*1\b/i, /\bOR\s+true\b/i, /\bOR\s+'[^']*'\s*=\s*'[^']*'/i, /\bOR\s+"[^"]*"\s*=\s*"[^"]*"/i, /\bOR\s+\d+\s*=\s*\d+/i, /\bUNION\s+(ALL\s+)?SELECT\b/i ].freeze
- HARD_ROW_CAP =
1000
Constants inherited from BaseTool
BaseTool::SESSION_CONTEXT, BaseTool::SHARED_CACHE
Class Method Summary collapse
- .call(sql: nil, limit: nil, format: "table", explain: false, server_context: nil, **_extra) ⇒ Object
-
.references_sensitive_column?(cleaned_sql) ⇒ Boolean
Returns the first sensitive column name referenced by the SQL, or nil.
-
.strip_sql_comments(sql) ⇒ Object
── SQL comment stripping ───────────────────────────────────────.
-
.validate_sql(sql) ⇒ Object
── SQL validation (Layer 1) ────────────────────────────────────.
Methods inherited from BaseTool
abstract!, abstract?, cache_key, cached_context, config, extract_method_source_from_file, extract_method_source_from_string, find_closest_match, fuzzy_find_key, inherited, not_found_response, paginate, rails_app, registered_tools, reset_all_caches!, reset_cache!, session_queries, session_record, session_reset!, set_call_params, text_response
Class Method Details
.call(sql: nil, limit: nil, format: "table", explain: false, server_context: nil, **_extra) ⇒ Object
111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 |
# File 'lib/rails_ai_context/tools/query.rb', line 111 def self.call(sql: nil, limit: nil, format: "table", explain: false, server_context: nil, **_extra) set_call_params(sql: sql&.truncate(60)) # ── Environment guard ─────────────────────────────────────── unless config.allow_query_in_production || !Rails.env.production? return text_response( "rails_query is disabled in production for data privacy. " \ "Set config.allow_query_in_production = true to override." ) end # ── ActiveRecord guard (api-only apps) ────────────────────── # Must come BEFORE any code that rescues ActiveRecord::* — Ruby # resolves rescue class constants at raise time, and `rescue # ActiveRecord::ConnectionNotEstablished` crashes with NameError # on apps where ActiveRecord is not loaded (e.g. # `rails new --api --skip-active-record`). unless defined?(ActiveRecord::Base) return text_response( "Database queries are unavailable: ActiveRecord is not loaded in this app. " \ "This happens on API-only apps created with `rails new --api --skip-active-record`. " \ "rails_query requires a database connection to function." ) end # ── Layer 1: SQL validation ───────────────────────────────── valid, error = validate_sql(sql) return text_response(error) unless valid # ── EXPLAIN mode ──────────────────────────────────────────── if explain return execute_explain(sql.strip, config.query_timeout) end # Resolve row limit row_limit = limit ? [ limit.to_i, HARD_ROW_CAP ].min : config.query_row_limit row_limit = [ row_limit, 1 ].max timeout_seconds = config.query_timeout # ── Layers 2-3: Execute with DB-level safety + row limit ──── result = execute_safely(sql.strip, row_limit, timeout_seconds) # ── Layer 4: Redact sensitive columns ─────────────────────── redacted = redact_results(result) # ── Format output ─────────────────────────────────────────── output = case format when "csv" format_csv(redacted) else format_table(redacted) end text_response(output) rescue ActiveRecord::ConnectionNotEstablished, ActiveRecord::NoDatabaseError => e text_response("Database unavailable: #{(e.)}\n\n**Troubleshooting:**\n- Check `config/database.yml` for correct host/port/credentials\n- Try `RAILS_ENV=test` if the development DB is remote\n- Run `bin/rails db:create` if the database doesn't exist yet") rescue ActiveRecord::StatementInvalid => e if e..match?(/timeout|statement_timeout|MAX_EXECUTION_TIME/i) text_response("Query exceeded #{config.query_timeout} second timeout. Simplify the query or add indexes.") elsif e..match?(/could not find|does not exist|Unknown database/i) text_response("Database not found: #{(e.)}\n\n**Troubleshooting:**\n- Run `bin/rails db:create` to create the database\n- Check `config/database.yml` for the correct database name\n- Try `RAILS_ENV=test` if the development DB is remote") else text_response("SQL error: #{(e.)}") end rescue => e text_response("Query failed: #{(e.)}") end |
.references_sensitive_column?(cleaned_sql) ⇒ Boolean
Returns the first sensitive column name referenced by the SQL, or nil. Checks both the user’s configured redacted columns (config.query_redacted_columns) AND the hard-coded suffix list (SENSITIVE_COLUMN_SUFFIXES). The match is case-insensitive and word-bounded so unrelated identifiers containing a sensitive substring are not false-positives.
242 243 244 245 246 247 248 249 250 251 252 |
# File 'lib/rails_ai_context/tools/query.rb', line 242 def self.references_sensitive_column?(cleaned_sql) down = cleaned_sql.downcase # Build the combined list ONCE per call and dedupe. configured = Array(config.query_redacted_columns).map { |c| c.to_s.downcase } suffixed = SENSITIVE_COLUMN_SUFFIXES.map(&:downcase) (configured + suffixed).uniq.each do |col| next if col.empty? return col if down.match?(/\b#{Regexp.escape(col)}\b/) end nil end |
.strip_sql_comments(sql) ⇒ Object
── SQL comment stripping ───────────────────────────────────────
179 180 181 182 183 184 185 |
# File 'lib/rails_ai_context/tools/query.rb', line 179 def self.strip_sql_comments(sql) sql .gsub(/\/\*.*?\*\//m, " ") # Block comments: /* ... */ .gsub(/--[^\n]*/, " ") # Line comments: -- ... .gsub(/^\s*#[^\n]*/m, " ") # MySQL-style comments: # at line start only .squeeze(" ").strip end |
.validate_sql(sql) ⇒ Object
── SQL validation (Layer 1) ────────────────────────────────────
188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 |
# File 'lib/rails_ai_context/tools/query.rb', line 188 def self.validate_sql(sql) return [ false, "SQL query is required." ] if sql.nil? || sql.strip.empty? cleaned = strip_sql_comments(sql) # Check multi-statement and clause patterns first — they provide more # specific error messages than the generic keyword blocker. return [ false, "Blocked: multiple statements (no semicolons)" ] if cleaned.match?(MULTI_STATEMENT) return [ false, "Blocked: FOR UPDATE/SHARE clause" ] if cleaned.match?(BLOCKED_CLAUSES) return [ false, "Blocked: sensitive SHOW command" ] if cleaned.match?(BLOCKED_SHOWS) return [ false, "Blocked: SELECT INTO creates a table" ] if cleaned.match?(SELECT_INTO) return [ false, "Blocked: SELECT INTO OUTFILE / DUMPFILE writes to disk" ] if cleaned.match?(BLOCKED_OUTPUT) # Block database functions that give a filesystem/network primitive — # pg_read_file, lo_import, dblink, LOAD_FILE, load_extension, etc. # These pass SET TRANSACTION READ ONLY but bypass sensitive_patterns. if (m = cleaned.match(BLOCKED_FUNCTIONS)) return [ false, "Blocked: dangerous function #{m[0]} (filesystem/network primitive)" ] end # Check for SQL injection tautology patterns (OR 1=1, UNION SELECT, etc.) tautology = TAUTOLOGY_PATTERNS.find { |p| cleaned.match?(p) } return [ false, "Blocked: SQL injection pattern detected (#{cleaned[tautology]})" ] if tautology # Check blocked keywords before the allowed-prefix fallback so that # INSERT/UPDATE/DELETE/DROP etc. get a specific "Blocked" error # rather than the generic "Only SELECT... allowed" message. if (m = cleaned.match(BLOCKED_KEYWORDS)) return [ false, "Blocked: contains #{m[0]}" ] end return [ false, "Only SELECT, WITH, SHOW, EXPLAIN, DESCRIBE allowed" ] unless cleaned.match?(ALLOWED_PREFIX) # Column-aliasing redaction bypass defense: reject any query that # textually references a sensitive column name. See the comment on # SENSITIVE_COLUMN_SUFFIXES above — post-execution redaction cannot # survive `SELECT password_digest AS x`. if (offending = references_sensitive_column?(cleaned)) return [ false, "Blocked: query references sensitive column `#{offending}`. " \ "Post-execution redaction cannot survive aliases / expressions, so " \ "the entire query is rejected. Remove the reference or subtract " \ "from config.query_redacted_columns in an initializer if this " \ "column is not actually sensitive in your app." ] end [ true, nil ] end |