Module: Parse::Agent::PromptHardening

Extended by:: PromptHardening

Included in:: PromptHardening

Defined in:: lib/parse/agent/prompt_hardening.rb

Overview

Sanitization primitives for prompt-injection hardening (NEW-PROMPT-6). A single home for the transforms applied to data that flows toward an LLM: schema descriptions surfaced by the schema tools, untrusted tool result content, and canary scanning of tool results.

All functions are pure (module_function via extend self) and have no dependency on a live client.

Constant Summary collapse

FIELD_NAME_RE = Identifier shape for LLM-surfaced field names: ASCII letter/underscore start, then up to 127 more identifier chars. NOT the secret-field boundary — it permits a leading underscore; _rperm/_hashed_password are stopped by field_allowlist / validate_keys!, untouched here. This only drops non-identifier names (spaces, punctuation, >128 chars, leading digit) that could carry injection payloads in a field name. The length is an injection-safety cap, not a Parse limit — it is set well above any realistic field name so valid identifiers aren't silently dropped from the schema surfaced to the LLM.

/\A[a-zA-Z_][a-zA-Z0-9_]{0,127}\z/

DESCRIPTION_CAP = Max characters retained from any LLM-surfaced description.

SCHEMA_DESC_OPEN =

"<schema_description>"

SCHEMA_DESC_CLOSE =

"</schema_description>"

CONTROL_CHARS_RE = C0 (0x00-0x1F except \t\n) + DEL + C1 (0x7F-0x9F) + zero-width (200B-200D, 2060, FEFF). Stripped from descriptions so invisible control/format characters can't smuggle instructions past a human reviewer or confuse the model.

/[\u0000-\u0008\u000B-\u001F\u007F-\u009F\u200B-\u200D\u2060\uFEFF]/

Instance Method Summary collapse

#sanitize_description(str) ⇒ String
Scrub control chars, cap length, and wrap a description in markers.
#sanitize_schema_for_llm(schema) ⇒ Hash
Sub-part 1 — sanitize an enriched schema hash before it is serialized toward the LLM.
#scan_for_canaries(text) ⇒ String^?
Sub-part 3 — scan text for any operator-registered canary phrase.
#scrub_marker_injection(content) ⇒ String
Sub-part 2 — neutralize wrapper/marker strings embedded in untrusted content so a stored value cannot impersonate or close the tool-result wrapper.
#valid_field_name?(name) ⇒ Boolean
Whether name is a safe LLM-surfaceable identifier.

Instance Method Details

#sanitize_description(str) ⇒ `String`

Scrub control chars, cap length, and wrap a description in markers. Markers in the RAW text are neutralized FIRST (so a stored </schema_description> can't close the wrapper).

Parameters:

str (String)

Returns:

(String)

# File 'lib/parse/agent/prompt_hardening.rb', line 111

def sanitize_description(str)
  return str unless str.is_a?(String)
  cleaned = scrub_marker_injection(str)
  cleaned = cleaned.gsub(CONTROL_CHARS_RE, "")
  cleaned = cleaned[0, DESCRIPTION_CAP] if cleaned.length > DESCRIPTION_CAP
  "#{SCHEMA_DESC_OPEN}#{cleaned}#{SCHEMA_DESC_CLOSE}"
end

#sanitize_schema_for_llm(schema) ⇒ `Hash`

Sub-part 1 — sanitize an enriched schema hash before it is serialized toward the LLM. Returns a sanitized deep copy (input is not mutated). Drops fields whose names fail FIELD_NAME_RE (with a [Parse::Agent:PROMPT] warning), and scrubs + caps + marker-wraps every description / usage string (class-level, per-field, and enum value descriptions).

Parameters:

schema (Hash)

Returns:

(Hash)

# File 'lib/parse/agent/prompt_hardening.rb', line 48

def sanitize_schema_for_llm(schema)
  return schema unless schema.is_a?(Hash)
  out = deep_dup(schema)
  class_name = out["className"] || out[:className]

  %w[description usage].each do |k|
    out[k] = sanitize_description(out[k]) if out[k].is_a?(String)
  end

  fields = out["fields"] || out[:fields]
  if fields.is_a?(Hash)
    fields.keys.each do |fname|
      unless valid_field_name?(fname)
        fields.delete(fname)
        warn "[Parse::Agent:PROMPT] dropped field #{fname.inspect} on " \
             "#{class_name.inspect}: invalid identifier"
        next
      end
      cfg = fields[fname]
      next unless cfg.is_a?(Hash)
      %w[description usage].each do |k|
        cfg[k] = sanitize_description(cfg[k]) if cfg[k].is_a?(String)
      end
      allowed = cfg["allowed_values"] || cfg[:allowed_values]
      if allowed.is_a?(Array)
        allowed.each do |v|
          next unless v.is_a?(Hash)
          v["description"] = sanitize_description(v["description"]) if v["description"].is_a?(String)
          v[:description]  = sanitize_description(v[:description])  if v[:description].is_a?(String)
        end
      end
    end
  end

  # agent_methods entries are surfaced to the LLM by format_schema exactly
  # like field descriptions, and their :description / per-parameter
  # description strings come from the same developer-authored DSL — so they
  # get the same marker-neutralization / control-char strip / length cap.
  # (format_methods emits symbol-keyed hashes; tolerate both forms.)
  methods = out["agent_methods"] || out[:agent_methods]
  if methods.is_a?(Array)
    methods.each do |m|
      next unless m.is_a?(Hash)
      m["description"] = sanitize_description(m["description"]) if m["description"].is_a?(String)
      m[:description]  = sanitize_description(m[:description])  if m[:description].is_a?(String)
      sanitize_nested_descriptions!(m["parameters"] || m[:parameters])
    end
  end

  out
end

#scan_for_canaries(text) ⇒ `String`^?

Sub-part 3 — scan text for any operator-registered canary phrase.

Parameters:

text (String)

Returns:

(String, nil) —
the matched phrase/pattern source, or nil.

# File 'lib/parse/agent/prompt_hardening.rb', line 147

def scan_for_canaries(text)
  canaries = Parse::Agent.prompt_injection_canaries
  return nil if canaries.nil? || canaries.empty?
  s = text.to_s
  return nil if s.empty?
  down = s.downcase
  canaries.each do |c|
    case c
    when Regexp
      return c.source if c.match?(s)
    else
      phrase = c.to_s
      return phrase if !phrase.empty? && down.include?(phrase.downcase)
    end
  end
  nil
end

#scrub_marker_injection(content) ⇒ `String`

Sub-part 2 — neutralize wrapper/marker strings embedded in untrusted content so a stored value cannot impersonate or close the tool-result wrapper. Idempotent: the escaped form no longer contains the original literal, so re-application is a no-op (content is re-serialized into history every turn).

When Parse::Agent.prompt_marker_strict is true, raises instead of escaping (fail-closed for high-assurance deployments).

Parameters:

content (String, #to_s)

Returns:

(String)

# File 'lib/parse/agent/prompt_hardening.rb', line 130

def scrub_marker_injection(content)
  s = content.to_s
  strict = Parse::Agent.prompt_marker_strict
  injection_markers.each do |marker|
    next unless s.include?(marker)
    if strict
      raise Parse::Agent::SecurityError,
            "prompt_marker_strict: untrusted content contains a reserved marker"
    end
    s = s.gsub(marker, escape_marker(marker))
  end
  s
end

#valid_field_name?(name) ⇒ `Boolean`

Returns whether name is a safe LLM-surfaceable identifier.

Returns:

(Boolean) —
whether name is a safe LLM-surfaceable identifier.



101
102
103

# File 'lib/parse/agent/prompt_hardening.rb', line 101

def valid_field_name?(name)
  FIELD_NAME_RE.match?(name.to_s)
end

Module: Parse::Agent::PromptHardening

Overview

Constant Summary collapse

Instance Method Summary collapse

Instance Method Details

#sanitize_description(str) ⇒ String

#sanitize_schema_for_llm(schema) ⇒ Hash

#scan_for_canaries(text) ⇒ String?

#scrub_marker_injection(content) ⇒ String

#valid_field_name?(name) ⇒ Boolean

#sanitize_description(str) ⇒ `String`

#sanitize_schema_for_llm(schema) ⇒ `Hash`

#scan_for_canaries(text) ⇒ `String`^?

#scrub_marker_injection(content) ⇒ `String`

#valid_field_name?(name) ⇒ `Boolean`