Module: Rubino::Attachments::Defang
Defined in: lib/rubino/attachments/defang.rb
Overview
Structural prompt-injection defense for inlined untrusted file content. There is no blocklist of phrases (that arms race is unwinnable); instead we strip the Unicode tricks that let attacker text visually escape our framing: bidi/RTL overrides that reorder what the model reads, zero-width characters that hide payloads, and control chars that could fake a delimiter. NFKC normalization runs first and folds compatibility forms (fullwidth letters, ligatures, and the like) into their plain equivalents, so those variants can't slip past the strip. Pure stdlib (String#unicode_normalize), no gem.
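To make the NFKC step concrete, here is a minimal sketch that uses only the stdlib String#unicode_normalize. The constant names, the nfkc helper, and the sample strings are illustrative assumptions and are not part of the module.

```ruby
# Illustrative samples only; none of these names exist in Rubino.
FULLWIDTH = "\uFF46\uFF55\uFF4C\uFF4C" # fullwidth letters spelling "full"
LIGATURE  = "\uFB01le"                 # "fi" ligature followed by "le"
CYRILLIC  = "\u0430"                   # Cyrillic small a, looks like Latin "a"

# Thin wrapper around the stdlib call the overview refers to.
def nfkc(str)
  str.unicode_normalize(:nfkc)
end

puts nfkc(FULLWIDTH)            # compatibility form folds to ASCII "full"
puts nfkc(LIGATURE)             # ligature expands to "file"
puts nfkc(CYRILLIC) == CYRILLIC # true: cross-script lookalikes are untouched
```

NFKC handles compatibility forms only, not every visual lookalike, so it complements the stripping steps rather than replacing them.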
Constant Summary
- BIDI_AND_ZERO_WIDTH =
Bidi controls + zero-width chars + BOM. Built from escapes so the source stays ASCII-clean (no raw invisibles in the repo).
Regexp.union(
  "\u200B", "\u200C", "\u200D", "\u200E", "\u200F", # ZWSP/ZWNJ/ZWJ/LRM/RLM
  "\u202A", "\u202B", "\u202C", "\u202D", "\u202E", # LRE/RLE/PDF/LRO/RLO
  "\u2066", "\u2067", "\u2068", "\u2069",           # LRI/RLI/FSI/PDI
  "\u2060", "\uFEFF"                                # WJ / BOM
).freeze
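As a rough illustration of what this pattern removes, the sketch below builds a stand-in character class from the code points named in the inline comments (ZWSP through RLM, LRE through RLO, LRI through PDI, WJ, BOM). INVISIBLES_SKETCH and strip_invisibles are hypothetical names, and the class is an assumed equivalent rather than the module's own constant.

```ruby
# Hypothetical stand-in for BIDI_AND_ZERO_WIDTH, assembled from the code
# points named in the constant's comments (an assumption, not the source).
INVISIBLES_SKETCH = /[\u200B-\u200F\u202A-\u202E\u2060\u2066-\u2069\uFEFF]/

def strip_invisibles(str)
  str.gsub(INVISIBLES_SKETCH, "")
end

hidden   = "ig\u200Bnore"        # zero-width space splits the word invisibly
reversed = "report\u202Etxt.exe" # RLO changes how the tail is displayed

puts strip_invisibles(hidden)   # prints "ignore"
puts strip_invisibles(reversed) # prints "reporttxt.exe"
```

After stripping, the text that is displayed and the text that is processed are the same sequence of visible characters.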
Class Method Summary
- .call(text) ⇒ Object
NFKC-normalize, strip bidi/zero-width, drop C0/C1 control chars except \n and \t (legitimate in text/code).
- .strip_control(str) ⇒ Object
Class Method Details
.call(text) ⇒ Object
NFKC-normalize, strip bidi/zero-width, drop C0/C1 control chars except \n and \t (legitimate in text/code). Returns a clean String safe to wrap in the nonce frame.
# File 'lib/rubino/attachments/defang.rb', line 27

def call(text)
  s = text.to_s
  s = s.scrub("") unless s.valid_encoding?
  s = s.unicode_normalize(:nfkc)
  s = s.gsub(BIDI_AND_ZERO_WIDTH, "")
  strip_control(s)
rescue ArgumentError, Encoding::CompatibilityError
  # unicode_normalize can choke on pathological input; fall back to a
  # raw strip so we never inline un-defanged bytes.
  strip_control(text.to_s.scrub("").gsub(BIDI_AND_ZERO_WIDTH, ""))
end
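The first guard in .call deals with invalid byte sequences before any normalization or pattern matching runs; regexp-based operations in Ruby generally raise ArgumentError on strings with broken encoding. A small sketch of that guard in isolation, where BROKEN and scrubbed are hypothetical names used only for illustration:

```ruby
# Sample input with a stray 0xFF byte, which is never valid in UTF-8.
BROKEN = "ok\xFF!"

# Mirrors the guard in .call: scrub only when the encoding is invalid,
# replacing each bad byte sequence with the empty string.
def scrubbed(str)
  str.valid_encoding? ? str : str.scrub("")
end

puts BROKEN.valid_encoding?          # false
puts scrubbed(BROKEN)                # prints "ok!"
puts scrubbed(BROKEN).valid_encoding? # true
```

Passing an empty replacement to scrub drops the bad bytes outright instead of substituting U+FFFD, so nothing attacker-controlled survives from the invalid sequence.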
.strip_control(str) ⇒ Object
# File 'lib/rubino/attachments/defang.rb', line 39

def strip_control(str)
  str.each_char.reject do |c|
    o = c.ord
    (o < 0x20 && o != 0x09 && o != 0x0A) || (o >= 0x7F && o <= 0x9F)
  end.join
end
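The predicate above keeps only tab and newline from the C0 range and also drops DEL (0x7F) along with the C1 range, so carriage returns and ESC bytes are removed as well. The following sketch restates the same predicate with ranges to show the effect on a mixed sample; control_char? and strip_control_sketch are hypothetical names, not part of the module.

```ruby
# Hypothetical restatement of the predicate, for illustration only.
def control_char?(char)
  o = char.ord
  return false if o == 0x09 || o == 0x0A # keep \t and \n
  o < 0x20 || (0x7F..0x9F).cover?(o)      # C0, DEL, and C1
end

def strip_control_sketch(str)
  str.each_char.reject { |c| control_char?(c) }.join
end

# CRLF line ending, an ANSI color escape, and a C1 NEL (U+0085).
sample = "a\tb\r\nc\e[31md\u0085e"
puts strip_control_sketch(sample).inspect # prints "a\tb\nc[31mde"
```

One consequence worth noting is that CRLF input comes out as LF-only, and ANSI escape sequences lose their ESC byte and remain as inert literal text.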