Class: Woods::Console::CredentialScanner
Inherits: Object
- Object
- Woods::Console::CredentialScanner
Defined in: lib/woods/console/credential_scanner.rb
Overview
Content-shape credential scanner for Console MCP responses.
Walks a serialized response tree (strings, nested Hash, nested Array) and replaces substrings that match known credential formats with `[REDACTED]`. Pattern matching is high-specificity (word-boundary anchored, minimum-length bounded) so false positives against UUIDs, email addresses, and short identifiers stay rare.
This is Layer 2 of the defense-in-depth stack — it runs AFTER the operator-configured column and EAV redaction layers so it catches credentials those layers missed (newly-added EAV keys, secrets stored in JSONB columns, associated records pulled via nested serialization).
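The specificity claim above can be checked with a small sketch. It uses the `aws_access_key_id` regex as listed in `PATTERNS`; the constant names and sample values are made up for illustration and are not part of the library.

```ruby
# Regex copied from the aws_access_key_id entry in PATTERNS.
AWS_ACCESS_KEY_ID = /\b(?:AKIA|ASIA)[0-9A-Z]{16}\b/

# Illustrative inputs only; none of these is a real credential.
SPECIFICITY_SAMPLES = {
  uuid: '123e4567-e89b-12d3-a456-426614174000',
  email: 'akia.ops@example.com',
  short_id: 'AKIA1234',                 # prefix present, body too short
  embedded: "XAKIA#{'A' * 16}",         # no word boundary before AKIA
  well_formed: "AKIA#{'A' * 16}"        # prefix plus exactly 16 characters
}.freeze

# true where the pattern would trigger a redaction, false otherwise.
SPECIFICITY_RESULTS = SPECIFICITY_SAMPLES.transform_values do |sample|
  sample.match?(AWS_ACCESS_KEY_ID)
end

SPECIFICITY_RESULTS.each { |name, hit| puts "#{name}: #{hit}" }
```

Only the well-formed value matches: the length bound rejects the short identifier and the leading `\b` rejects the embedded one.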
Constant Summary
- REDACTED =
'[REDACTED]'
- PATTERNS =
High-specificity credential patterns. Each is word-boundary anchored and bounded by a realistic minimum length so random short strings cannot trigger a match.
Order matters: more-specific patterns appear before less-specific alternatives (e.g., `anthropic_api_key` before `openai_api_key`) so the specific counter increments rather than the generic one.
{
  stripe_secret_key: /\b(?:sk|rk)_(?:live|test)_[A-Za-z0-9]{24,}\b/,
  stripe_publishable_key: /\bpk_(?:live|test)_[A-Za-z0-9]{24,}\b/,
  stripe_webhook_secret: /\bwhsec_[A-Za-z0-9]{24,}\b/,
  # Stripe Connect account IDs are PII per Stripe's ToS even though they
  # are not strictly secret; surfacing one in an MCP response leaks the
  # connected merchant's identity.
  stripe_connect_account_id: /\bacct_[A-Za-z0-9]{16,}\b/,
  # Klaviyo private API keys use a bare `pk_` prefix with no live/test
  # infix, so they evade the Stripe publishable regex and grant full API
  # access to the Klaviyo tenant. Order matters: stripe_publishable_key
  # runs first so its more-specific match wins on Stripe values.
  klaviyo_private_key: /\bpk_[A-Za-z0-9]{34}\b/,
  aws_access_key_id: /\b(?:AKIA|ASIA)[0-9A-Z]{16}\b/,
  github_fine_grained_pat: /\bgithub_pat_[A-Za-z0-9_]{82}\b/,
  github_token: /\bgh[pousr]_[A-Za-z0-9]{36,}\b/,
  google_oauth_token: /\bya29\.[A-Za-z0-9_-]{20,}\b/,
  jwt_token: /\beyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\b/,
  pem_private_key_block: /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/,
  slack_token: /\bxox[abpr]-[A-Za-z0-9-]{10,}\b/,
  sendgrid_api_key: /\bSG\.[A-Za-z0-9_-]{22}\.[A-Za-z0-9_-]{43}\b/,
  mailgun_api_key: /\bkey-[a-f0-9]{32}\b/,
  # Matches both the current `sk-ant-api03-…` / `sk-ant-admin01-…` shape
  # and the legacy `sk-ant-…` format that shipped without the
  # `api|admin` infix. Length floor prevents matching on a bare `sk-ant-`
  # prefix in logs or docs.
  anthropic_api_key: /\bsk-ant-(?:(?:api|admin)\d{2}-)?[A-Za-z0-9_-]{80,}\b/,
  openai_api_key: %r{\bsk-(?:proj-)?[A-Za-z0-9/_-]{40,}\b},
  # `rt`/`ua` extend the existing alternation to cover refresh tokens
  # (`shprt_`) and user-access tokens (`shpua_`); the prefix list
  # before this PR missed both.
  shopify_access_token: /\bshp(?:at|ca|ss|pa|rt|ua)_[a-f0-9]{32}\b/,
  square_access_token: /\bsq0[a-z]{3}-[A-Za-z0-9_-]{22,}\b/,
  paypal_access_token: /\baccess_token\$(?:production|sandbox)\$[A-Za-z0-9]+\$[a-f0-9]+\b/,
  # Distinctive `00D<15-org-id>!<base64 payload>` shape: no FP risk
  # and one of the highest-leverage additions per the research brief.
  salesforce_access_token: /\b00D[A-Za-z0-9]{12}![A-Za-z0-9._]{80,250}\b/,
  launchdarkly_sdk_key: /\bsdk-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/,
  launchdarkly_mobile_key: /\bmob-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/,
  hubspot_private_app_token: Regexp.new(
    '\bpat-(?:na1|na2|eu1|eu2|ap1)-' \
    '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b'
  ),
  brevo_api_key: /\bxkeysib-[a-f0-9]{64}-[A-Za-z0-9]{16}\b/,
  brevo_smtp_key: /\bxsmtpsib-[a-f0-9]{64}-[A-Za-z0-9]{16}\b/,
  kit_api_key: /\bkit_[A-Za-z0-9]{20,}\b/,
  twilio_account_sid: /\bAC[0-9a-fA-F]{32}\b/,
  twilio_api_key_sid: /\bSK[0-9a-fA-F]{32}\b/,
  twilio_verify_service_sid: /\bVA[0-9a-fA-F]{32}\b/,
  # Connection strings with embedded credentials: `postgres://user:pass@host/db`,
  # `mysql2://user:pass@host/db`, `mongodb://…`, `amqp://…`, `redis://…`.
  # Captures the entire URL: the password is part of it, and redacting
  # just the password field while leaving `user@host` visible is not
  # worth the regex complexity when the host may itself be sensitive.
  database_url_with_password: Regexp.new(
    '\b(?:postgres|postgresql|mysql|mysql2|mongodb|mongodb\+srv|amqp|amqps|redis|rediss|' \
    'clickhouse|cockroachdb|mariadb)://[^\s:@/]+:[^\s@/]+@\S+'
  )
}.freeze
- INDEX_HIT =
Counter key emitted when Woods::Console::CredentialIndex substring-matches a value before any shape pattern fires. Distinct from pattern names so observability can tell the two layers apart.
:credential_index
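The ordering note on `PATTERNS` can be illustrated with a sketch. It assumes patterns are applied one after another in hash insertion order, which the documentation implies but does not show. `ORDERED_PATTERNS` and `redact_in_order` are hypothetical names; the two regexes are copied from `PATTERNS`.

```ruby
# Two entries copied from PATTERNS; a Ruby Hash iterates in insertion order.
ORDERED_PATTERNS = {
  anthropic_api_key: /\bsk-ant-(?:(?:api|admin)\d{2}-)?[A-Za-z0-9_-]{80,}\b/,
  openai_api_key: %r{\bsk-(?:proj-)?[A-Za-z0-9/_-]{40,}\b}
}.freeze

# Hypothetical helper: applies each pattern in turn and tallies hits by name.
def redact_in_order(text, patterns)
  counts = Hash.new(0)
  result = patterns.reduce(text) do |current, (name, regex)|
    current.gsub(regex) do
      counts[name] += 1
      '[REDACTED]'
    end
  end
  [result, counts]
end

# Fabricated value with the Anthropic key shape; not a real key.
sample_text = "key=sk-ant-api03-#{'a' * 90}"

SPECIFIC_FIRST = redact_in_order(sample_text, ORDERED_PATTERNS)
GENERIC_FIRST = redact_in_order(sample_text, ORDERED_PATTERNS.to_a.reverse.to_h)

puts SPECIFIC_FIRST.inspect
puts GENERIC_FIRST.inspect
```

Both orderings redact the value, since the generic `sk-` pattern also accepts the `sk-ant-` shape. Only the specific-first ordering attributes the hit to `anthropic_api_key`, which is what the counters need.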
Class Method Summary
-
.patterns ⇒ Array<Symbol>
Every pattern name the scanner knows about.
Instance Method Summary
-
#initialize(disabled_patterns: [], secret_index: nil) ⇒ CredentialScanner
constructor
A new instance of CredentialScanner.
-
#replace_index!(new_index) ⇒ void
Replace the boot-time credential index with a freshly built one.
-
#scan(value) ⇒ Array(Object, Hash{Symbol=>Integer})
Scan a value (String, Hash, Array, or any other object) for credentials.
Constructor Details
#initialize(disabled_patterns: [], secret_index: nil) ⇒ CredentialScanner
Returns a new instance of CredentialScanner.
# File 'lib/woods/console/credential_scanner.rb', line 140
def initialize(disabled_patterns: [], secret_index: nil)
  disabled = Array(disabled_patterns).to_set(&:to_sym)
  @active_patterns = PATTERNS.except(*disabled)
  @secret_index = secret_index unless secret_index.respond_to?(:empty?) && secret_index.empty?
end
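The normalization in the constructor means `disabled_patterns` can be `nil`, a single name, or a mixed list of strings and symbols. The sketch below reproduces those two lines against a three-entry stand-in hash. `CTOR_SAMPLE_PATTERNS` and `active_pattern_names` are illustrative names, and `Hash#except` assumes Ruby 3.0+ (or ActiveSupport).

```ruby
require 'set'

# Stand-in for PATTERNS with three entries copied from it.
CTOR_SAMPLE_PATTERNS = {
  mailgun_api_key: /\bkey-[a-f0-9]{32}\b/,
  kit_api_key: /\bkit_[A-Za-z0-9]{20,}\b/,
  twilio_account_sid: /\bAC[0-9a-fA-F]{32}\b/
}.freeze

# Hypothetical helper mirroring the constructor's filtering step.
def active_pattern_names(disabled_patterns)
  # Array() wraps nil and scalars; to_set(&:to_sym) dedupes and symbolizes.
  disabled = Array(disabled_patterns).to_set(&:to_sym)
  CTOR_SAMPLE_PATTERNS.except(*disabled).keys
end

NONE_DISABLED = active_pattern_names(nil)
ONE_DISABLED = active_pattern_names('kit_api_key')
MIXED_DISABLED = active_pattern_names(['kit_api_key', :kit_api_key, :twilio_account_sid])

puts NONE_DISABLED.inspect
puts ONE_DISABLED.inspect
puts MIXED_DISABLED.inspect
```

Because `Hash#except` ignores keys that are not present, an unrecognized name in the disabled list is dropped silently rather than raising.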
Class Method Details
.patterns ⇒ Array<Symbol>
Returns every pattern name the scanner knows about.
# File 'lib/woods/console/credential_scanner.rb', line 104
def self.patterns
  PATTERNS.keys
end
Instance Method Details
#replace_index!(new_index) ⇒ void
This method returns an undefined value.
Replace the boot-time credential index with a freshly built one.
Called by `Woods::Console::Server.rebuild_credential_index` after a host app rotates its Rails credentials. Thread-safe: the assignment is atomic on MRI (GVL) and the new index is fully constructed before being swapped in, so in-flight scans see either the old or the new index, never a partial one.
# File 'lib/woods/console/credential_scanner.rb', line 118
def replace_index!(new_index)
  @secret_index = if new_index.respond_to?(:empty?) && new_index.empty?
                    nil
                  else
                    new_index
                  end
end
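A minimal sketch of the build-then-swap idea follows. `IndexHolder` and its `hit?` method are hypothetical, and a plain frozen Array stands in for `Woods::Console::CredentialIndex`, whose API is not shown here. The reader copies the instance variable into a local once, so a concurrent swap cannot change the index halfway through a check; that the real scanner reads its index this way is an assumption.

```ruby
# Hypothetical holder illustrating the swap; not the library's class.
class IndexHolder
  def initialize(index = nil)
    replace_index!(index)
  end

  # Same normalization as above: an empty index is stored as nil.
  def replace_index!(new_index)
    @index = (new_index.respond_to?(:empty?) && new_index.empty?) ? nil : new_index
  end

  def hit?(text)
    index = @index # snapshot once; later swaps do not affect this call
    return false if index.nil?

    index.any? { |secret| text.include?(secret) }
  end
end

holder = IndexHolder.new(['old-secret-value'].freeze)
BEFORE_SWAP = holder.hit?('log line with old-secret-value inside')

# The replacement is built completely, then published in one assignment.
fresh_index = ['new-secret-value'].freeze
holder.replace_index!(fresh_index)
AFTER_SWAP_OLD = holder.hit?('log line with old-secret-value inside')
AFTER_SWAP_NEW = holder.hit?('log line with new-secret-value inside')

holder.replace_index!([])
AFTER_EMPTY = holder.hit?('log line with new-secret-value inside')
```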
#scan(value) ⇒ Array(Object, Hash{Symbol=>Integer})
Scan a value (String, Hash, Array, or any other object) for credentials.
Strings are gsub’d against every active pattern. Hash values and Array elements are walked recursively; keys and non-string scalars (Integer, Float, true/false, nil) pass through untouched.
# File 'lib/woods/console/credential_scanner.rb', line 156
def scan(value)
  counts = {}
  scanned = walk(value, counts)
  [scanned, counts]
end
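The private `walk` helper is not shown on this page. The sketch below is one plausible shape for it, written only from the description above: strings are gsub'd, Hash values and Array elements are recursed into, and everything else passes through. `sketch_walk`, `sketch_scan`, and `WALK_SAMPLE_PATTERNS` are illustrative names, and the index lookup layer is left out.

```ruby
WALK_REDACTED = '[REDACTED]'

# Two entries copied from PATTERNS, enough to exercise the traversal.
WALK_SAMPLE_PATTERNS = {
  mailgun_api_key: /\bkey-[a-f0-9]{32}\b/,
  twilio_account_sid: /\bAC[0-9a-fA-F]{32}\b/
}.freeze

# Assumed traversal: returns a redacted copy and mutates counts in place.
def sketch_walk(value, counts)
  case value
  when String
    WALK_SAMPLE_PATTERNS.reduce(value) do |text, (name, regex)|
      text.gsub(regex) do
        counts[name] = counts.fetch(name, 0) + 1
        WALK_REDACTED
      end
    end
  when Hash
    value.transform_values { |child| sketch_walk(child, counts) } # keys untouched
  when Array
    value.map { |child| sketch_walk(child, counts) }
  else
    value # Integer, Float, true/false, nil, and other objects pass through
  end
end

def sketch_scan(value)
  counts = {}
  [sketch_walk(value, counts), counts]
end

# Fabricated response; the credential-shaped strings are synthetic.
response = {
  'id' => 42,
  'note' => "token key-#{'0' * 32} rotated",
  'tags' => ["AC#{'f' * 32}", 'plain'],
  'meta' => { 'active' => true, 'sid' => nil }
}

SCANNED, COUNTS = sketch_scan(response)
puts SCANNED.inspect
puts COUNTS.inspect
```

The returned counts hash is keyed by pattern name, matching the `Hash{Symbol=>Integer}` half of the documented return type.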