Class: Parse::Middleware::BodyBuilder

Inherits:
Faraday::Middleware
  • Object
show all
Includes:
Protocol
Defined in:
lib/parse/client/body_builder.rb

Overview

This middleware takes an incoming Parse response, after an outgoing request, and creates a Parse::Response object.

Constant Summary collapse

HTTP_METHOD_OVERRIDE =

Header sent when a GET requests exceeds the limit.

"X-Http-Method-Override"
MAX_URL_LENGTH =

Maximum url length for most server requests before HTTP Method Override is used.

2_000.freeze
SENSITIVE_FIELDS =

Fields that should be redacted from log output.

%w[
  password token sessionToken session_token access_token authData
  masterKey master_key apiKey api_key clientKey client_key
  javascriptKey javascript_key refreshToken refresh_token
].freeze
SENSITIVE_PATTERN =
/(#{SENSITIVE_FIELDS.join("|")})(["']?\s*[=:>]\s*["']?)([^"&\s,}\]]+)/i
SENSITIVE_FIELDS_SET =

Lookup set of sensitive field names for structural (JSON) redaction — case-insensitive match on the key, not the value. Walks the parsed structure so nested objects like “password”:{“nested”:“value”} and escaped-quote payloads (which the regex misses) are scrubbed.

SENSITIVE_FIELDS.map(&:downcase).to_set.freeze
REDACTED_PLACEHOLDER =

Placeholder used in place of redacted values.

"[FILTERED]"
LOG_VECTOR_COMPACT_THRESHOLD =

Minimum length at which a numeric-only Array in a logged JSON body is compacted to a single placeholder string instead of printed verbatim. Two concerns drive this:

  1. Noise. A 1536-float OpenAI embedding inlines as ~25 KB of JSON per logged row. Aggregation pipelines with ‘$vectorSearch.queryVector` and any save/fetch carrying a `:vector` field would otherwise drown operator logs.

  2. Sensitivity. Embeddings are reversible-by-similarity: an attacker who scrapes operator logs can reconstruct high-level features of the source text (topic, sentiment, sometimes near-verbatim phrases for short inputs) by nearest-neighbor lookup against a public model.

Threshold rationale: 32 is well below every common embedding width (BGE-small 384, Cohere 1024, OpenAI small 1536, OpenAI large 3072) and well above any normal Parse Array property (tags, role lists, etc.). Numeric-only check additionally protects normal long arrays of strings/objects.

32
REDACTED_HEADERS =

Request headers that must never be printed verbatim in debug logs. Matched case-insensitively against Faraday header keys.

[
  Parse::Protocol::MASTER_KEY,
  Parse::Protocol::API_KEY,
  Parse::Protocol::SESSION_TOKEN,
  "X-Parse-JavaScript-Key",
  "Authorization",
  "Cookie",
  # Embedding-provider credentials (Parse::Embeddings::OpenAI and
  # forthcoming Cohere/Voyage adapters). These never touch Parse
  # Server itself, but they share the same Faraday log path when a
  # caller mounts the embeddings connection through Parse logging.
  # OpenAI's official auth header is `Authorization: Bearer …`
  # (already covered above); Organization/Project are listed here
  # since they're account-identifying metadata operators may not
  # want to publish. `X-Api-Key` and `Anthropic-Api-Key` are
  # reserved for forthcoming non-OpenAI providers.
  "X-Api-Key",
  "OpenAI-Organization",
  "OpenAI-Project",
  "Anthropic-Api-Key",
  # Cohere, Voyage, Jina, and DashScope (Qwen) use Bearer auth
  # (covered by "Authorization" above), but some operators front
  # them with a proxy that rewrites to a vendor-specific header.
  # These are listed defensively so a future header-form switch
  # doesn't silently leak keys into Faraday logs. `Api-Key` is the
  # bare form some vendor SDKs and proxies use; covered for parity.
  "Cohere-Api-Key",
  "Voyage-Api-Key",
  "Jina-Api-Key",
  "Api-Key",
  "X-DashScope-Api-Key",
  "DashScope-Api-Key",
].map(&:downcase).freeze

Constants included from Protocol

Protocol::API_KEY, Protocol::APP_ID, Protocol::CONTENT_TYPE, Protocol::CONTENT_TYPE_FORMAT, Protocol::EMAIL, Protocol::INSTALLATION_ID, Protocol::MASTER_KEY, Protocol::PASSWORD, Protocol::READ_PREFERENCE, Protocol::READ_PREFERENCES, Protocol::REVOCABLE_SESSION, Protocol::SERVER_URL, Protocol::SESSION_TOKEN

Class Attribute Summary collapse

Class Method Summary collapse

Class Attribute Details

.loggingBoolean

Allows logging. Set to ‘true` to enable logging, `false` to disable. You may specify `:debug` for additional verbosity.

Returns:

  • (Boolean)


111
112
113
# File 'lib/parse/client/body_builder.rb', line 111

def logging
  @logging
end

Class Method Details

.redact(str) ⇒ String

Redacts sensitive fields from a string for safe logging.

Two passes run in sequence so that no payload shape leaks secrets:

  1. **Structural pass.** If the body (after whitespace trim) parses as JSON, the parsed structure is walked recursively. Any value whose key matches SENSITIVE_FIELDS_SET (case-insensitive) is replaced. String values that themselves look like JSON are recursively parsed and scrubbed — catches {“body”:“{"password":"x"}”} payloads.

  2. **Regex pass.** The result of the structural pass (or the original string if parsing failed) is always also run through the SENSITIVE_PATTERN regex as defense-in-depth. This catches form- encoded bodies, partial JSON, escaped-quote payloads, and string array elements like [“password=hunter2”] that the structural walker can’t redact in-place.

Parameters:

  • str (String)

    the string to redact.

Returns:

  • (String)

    the redacted string.



133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
# File 'lib/parse/client/body_builder.rb', line 133

def self.redact(str)
  s = str.to_s
  return s if s.empty?
  after_structural = s
  if (parsed = try_parse_json(s))
    scrubbed = scrub_sensitive!(parsed)
    compact_vectors!(scrubbed)
    begin
      after_structural = scrubbed.to_json
    rescue StandardError
      after_structural = s
    end
  end
  after_structural.gsub(SENSITIVE_PATTERN) do
    key_part = $1
    sep_part = $2
    val_part = $3
    # Skip values that the structural pass already redacted —
    # otherwise the regex value-class +[^"&\s,}\]]+ stops at the
    # bracket and we end up with +[FILTERED]]+ from the trailing
    # close-bracket left over from +"[FILTERED]"+.
    if val_part == "[FILTERED" || val_part == REDACTED_PLACEHOLDER
      "#{key_part}#{sep_part}#{val_part}"
    else
      "#{key_part}#{sep_part}#{REDACTED_PLACEHOLDER}"
    end
  end
end