Class: Phronomy::Guardrail::Builtin::PromptInjectionDetector
- Inherits:
-
InputGuardrail
- Object
- Phronomy::Guardrail::Base
- InputGuardrail
- Phronomy::Guardrail::Builtin::PromptInjectionDetector
- Defined in:
- lib/phronomy/guardrail/builtin/prompt_injection_detector.rb
Overview
Input guardrail that detects common prompt injection attempts.
Matches a built-in list of injection patterns (case-insensitive) and raises Phronomy::GuardrailError when any pattern is found in the input string. Additional patterns can be supplied via the +additional_patterns:+ argument.
Constant Summary collapse
- DEFAULT_PATTERNS =
Default patterns that signal a prompt injection attempt.
[ /ignore\s+(all\s+)?(previous|prior|above)\s+(instructions?|rules?|prompts?)/i, /disregard\s+(all\s+)?(previous|prior|above)\s+(instructions?|rules?|prompts?)/i, /forget\s+(all\s+)?(previous|prior|above)\s+(instructions?|rules?|prompts?)/i, /\bsystem\s*prompt\s*:/i, /\byou\s+are\s+now\s+(?:a|an)\b/i, /\bact\s+as\s+(?:a|an)\b/i, /\bpretend\s+(?:you\s+are|to\s+be)\b/i, /\bjailbreak\b/i, /\bdan\s*mode\b/i, /\bdev(?:eloper)?\s*mode\b/i ].freeze
Instance Method Summary collapse
- #check(value) ⇒ Object
-
#initialize(additional_patterns: []) ⇒ PromptInjectionDetector
constructor
A new instance of PromptInjectionDetector.
Methods inherited from Phronomy::Guardrail::Base
Constructor Details
#initialize(additional_patterns: []) ⇒ PromptInjectionDetector
Returns a new instance of PromptInjectionDetector.
38 39 40 |
# File 'lib/phronomy/guardrail/builtin/prompt_injection_detector.rb', line 38 def initialize(additional_patterns: []) @patterns = DEFAULT_PATTERNS + Array(additional_patterns) end |
Instance Method Details
#check(value) ⇒ Object
44 45 46 47 48 49 |
# File 'lib/phronomy/guardrail/builtin/prompt_injection_detector.rb', line 44 def check(value) text = value.to_s @patterns.each do |pattern| fail!("Potential prompt injection detected") if text.match?(pattern) end end |