Module: Pikuri::Sanitizer
- Defined in:
- lib/pikuri/sanitizer.rb
Overview
Renders attacker-controlled text safe to display, and reports why it was unsafe.
Every string an LLM composes is untrusted: a bash command, a tool observation echoed back to the user, a description it wrote for a confirmation prompt. A model that is broken — or, far more likely, being driven by a prompt injection — can embed bytes that a terminal acts on rather than prints: a carriage return that overwrites the line the user just read, an ESC that recolors or repositions, a backspace that erases, a bidirectional override that reorders text so it reads differently than it runs, a zero-width character that hides in plain sight, or a Cyrillic а masquerading as a Latin a. The whole point of a confirmation prompt collapses if the bytes the user approves are not the bytes that execute.
Sanitizer.sanitize is the one chrome-independent primitive every renderer (terminal, TUI, web) routes through. It does two things and returns both as a Result:
- Neutralize: make the dangerous bytes visible without changing structure. Control bytes become \xNN, bidi/zero-width codepoints become \u{NNNN}, and tab becomes \t. Newlines are preserved (multi-line commands are normal). This is *faithful, not beautifying*: it never collapses runs of whitespace or rewrites a tab to a space, because the user must see exactly what they are approving; a Makefile’s leading tab stays visibly a tab. A web chrome composes html_escape(sanitize(s).text); the HTML layer is the caller’s, not ours.
- Warn: return a Warning per category detected, each a semantic record (kind + offending tokens + a plain-English explanation). Presentation is the chrome’s: a terminal renders these bold yellow, a web client a banner. The Warning carries no color or markup.
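The split between a semantic record and its presentation can be sketched as follows. `DemoWarning` and both renderers are hypothetical stand-ins, and the field names (`kind`, `tokens`, `explanation`) are assumed from the description above rather than taken from the library. The only real API used is Ruby's `ERB::Util.html_escape`.

```ruby
require 'erb'

# Hypothetical stand-in for the library's Warning record (field names assumed).
DemoWarning = Struct.new(:kind, :tokens, :explanation, keyword_init: true)

# A terminal chrome might wrap the record in bold yellow (SGR 1;33).
def render_terminal(warning)
  "\e[1;33m#{warning.explanation} (#{warning.tokens.join(', ')})\e[0m"
end

# A web chrome might emit a banner, adding its own HTML-escaping layer.
def render_html(warning)
  body = ERB::Util.html_escape("#{warning.explanation} (#{warning.tokens.join(', ')})")
  %(<div class="warning warning-#{warning.kind}">#{body}</div>)
end

warning = DemoWarning.new(
  kind: :bidi,
  tokens: ['\u{202e}'], # already neutralized, so safe to print
  explanation: 'Text contains a bidirectional override <RLO>'
)

puts render_terminal(warning)
puts render_html(warning)
```

The record itself stays free of color and markup; each chrome decides how loud the warning should be.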
Scope (deliberately closed)
Detection covers the *invisibility / cursor-control / reordering* attack classes completely, because each is a finite, enumerable set of codepoints: C0 controls, C1 controls (a second ANSI introducer on some emulators), DEL, the bidi overrides, and the zero-width characters. On top of that, Sanitizer.sanitize flags *mixed-script tokens*: a single word combining letters from two or more of Latin, Cyrillic, and Greek. This is the signature of a homoglyph spoof and has near-zero false positives on real text (humans do not weld two alphabets inside one word; café is all-Latin, Москва all-Cyrillic, only Pаypal mixes).
Two confusable classes are explicitly *out of scope*, because detecting them needs Unicode confusables tables and produces heavy false positives on legitimate multilingual text:
- Whole-script homoglyphs: an entirely-Cyrillic string that merely looks Latin (no mixing to detect).
- Single-symbol confusables: the Greek question mark ; (U+037E) that looks like a semicolon, full-width forms, the division slash.
“Solid” here means complete on the classes above, not exhaustive over all of Unicode.
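The boundary between the flagged class and the out-of-scope whole-script class can be illustrated with a small sketch. `CONFUSABLE_SCRIPTS` is copied from the constant summary so the snippet runs standalone; `scripts_in` is a hypothetical helper, not part of the module.

```ruby
# Copied from the constant summary so the sketch runs standalone.
CONFUSABLE_SCRIPTS = {
  'Latin' => /\p{Latin}/,
  'Cyrillic' => /\p{Cyrillic}/,
  'Greek' => /\p{Greek}/
}.freeze

# Hypothetical helper: names of the confusable scripts present in a token.
def scripts_in(token)
  CONFUSABLE_SCRIPTS.select { |_name, re| token.match?(re) }.keys
end

samples = {
  "caf\u00e9" => 'all Latin',
  "\u041c\u043e\u0441\u043a\u0432\u0430" => 'all Cyrillic',
  "P\u0430ypal" => 'Latin with a Cyrillic U+0430',
  "\u0441\u0430\u0440" => 'all Cyrillic, resembles Latin "cap"'
}

samples.each do |token, note|
  found = scripts_in(token)
  verdict = found.size >= 2 ? 'flagged' : 'not flagged'
  puts "#{note}: #{found.join(' + ')} -> #{verdict}"
end
```

The last sample is the whole-script case: it looks Latin but contains only Cyrillic letters, so there is no mixing for the check to see.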
Defined Under Namespace
Constant Summary
- BIDI_OVERRIDES =
Bidirectional-override codepoints: the explicit LRO/RLO/PDF/LRE/RLE set plus the isolate set (LRI/RLI/FSI/PDI). These enable reordering attacks.
[*0x202a..0x202e, *0x2066..0x2069].freeze
- ZERO_WIDTH =
Zero-width and invisible codepoints: ZWSP, ZWNJ, ZWJ, and the BOM / zero-width no-break space.
[0x200b, 0x200c, 0x200d, 0xfeff].freeze
- SUSPECT =
Codepoints sanitize rewrites: C0 controls including tab (U+0009) but excluding newline (U+000A, which passes through untouched), C1 controls + DEL (U+007F–009F), the zero-width set, and the bidi overrides. Newline is the one control character a faithful render must keep, so the C0 range is split around it.
/[\u0000-\u0009\u000b-\u001f\u007f-\u009f\u200b-\u200d\u202a-\u202e\u2066-\u2069\ufeff]/
- CONFUSABLE_SCRIPTS =
The three Latin-confusable scripts whose mixing inside one token signals a homoglyph spoof. Punctuation, digits and spaces are the Common script and match none of these, so they never count toward the “two distinct scripts” threshold.
{ 'Latin' => /\p{Latin}/, 'Cyrillic' => /\p{Cyrillic}/, 'Greek' => /\p{Greek}/ }.freeze
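As a quick consistency check on the constants above, the following sketch confirms that every listed bidi and zero-width codepoint is covered by SUSPECT and that newline is the one C0 control left alone. The constants are copied verbatim; the labels and the loop are illustrative only.

```ruby
BIDI_OVERRIDES = [*0x202a..0x202e, *0x2066..0x2069].freeze
ZERO_WIDTH = [0x200b, 0x200c, 0x200d, 0xfeff].freeze
SUSPECT = /[\u0000-\u0009\u000b-\u001f\u007f-\u009f\u200b-\u200d\u202a-\u202e\u2066-\u2069\ufeff]/

# Every listed bidi and zero-width codepoint should fall inside SUSPECT.
covered = (BIDI_OVERRIDES + ZERO_WIDTH).all? { |cp| [cp].pack('U').match?(SUSPECT) }
puts "bidi + zero-width covered by SUSPECT: #{covered}"

# Newline is the single C0 control that passes through; tab and ESC do not.
probes = { 'newline' => "\n", 'tab' => "\t", 'ESC' => "\e", 'DEL' => "\x7f", 'letter a' => 'a' }
probes.each do |label, ch|
  puts "#{label}: #{ch.match?(SUSPECT) ? 'rewritten' : 'kept'}"
end
```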
Class Method Summary
- .mixed_script_tokens(text) ⇒ Array<String>
Tokens (whitespace-delimited runs) that combine letters from two or more of CONFUSABLE_SCRIPTS: the homoglyph-spoof signature.
- .sanitize(text) ⇒ Result
Neutralize text for literal display and report what was flagged.
Class Method Details
.mixed_script_tokens(text) ⇒ Array<String>
Tokens (whitespace-delimited runs) that combine letters from two or more of CONFUSABLE_SCRIPTS — the homoglyph-spoof signature.
# File 'lib/pikuri/sanitizer.rb', line 137
def self.mixed_script_tokens(text)
  # A token is flagged when letters from at least two confusable scripts occur in it.
  text.split(/\s+/).reject(&:empty?).select do |token|
    CONFUSABLE_SCRIPTS.count { |_name, re| token.match?(re) } >= 2
  end.uniq
end
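A usage sketch, with the method body reproduced as a top-level function so it runs without the gem installed. The input string is made up for illustration.

```ruby
CONFUSABLE_SCRIPTS = {
  'Latin' => /\p{Latin}/,
  'Cyrillic' => /\p{Cyrillic}/,
  'Greek' => /\p{Greek}/
}.freeze

# Same logic as the method above, as a top-level function for a standalone run.
def mixed_script_tokens(text)
  text.split(/\s+/).reject(&:empty?).select do |token|
    CONFUSABLE_SCRIPTS.count { |_name, re| token.match?(re) } >= 2
  end.uniq
end

# "P\u0430ypal.com" hides a Cyrillic letter; the Greek word is joined to
# "Moscow" by a slash, so the pair forms a single whitespace-delimited token.
text = "open P\u0430ypal.com then P\u0430ypal.com again, " \
       "caf\u00e9 \u039c\u03bf\u03c3\u03c7\u03b1/Moscow 42"

flagged = mixed_script_tokens(text)
flagged.each { |token| puts token.dump } # dump makes the hidden letters visible
```

Two properties of the implementation show up here: repeated offenders are reported once, and because tokens are split on whitespace only, two legitimate words from different scripts joined by punctuation are reported as one mixed token.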
.sanitize(text) ⇒ Result
Neutralize text for literal display and report what was flagged.
# File 'lib/pikuri/sanitizer.rb', line 107
def self.sanitize(text)
  backspace = false
  control = []
  bidi = []
  zero_width = []
  # Rewrite each suspect character as a visible escape, recording it by category.
  clean = text.gsub(SUSPECT) do |ch|
    cp = ch.ord
    if cp == 0x09
      '\\t'
    elsif cp == 0x08
      backspace = true
      '\\x08'
    elsif BIDI_OVERRIDES.include?(cp)
      format('\\u{%04x}', cp).tap { |t| bidi << t }
    elsif ZERO_WIDTH.include?(cp)
      format('\\u{%04x}', cp).tap { |t| zero_width << t }
    else
      format('\\x%02x', cp).tap { |t| control << t }
    end
  end
  # Mixed-script detection runs on the original text, not the rewritten one.
  Result.new(text: clean,
             warnings: warnings_for(backspace, control, bidi, zero_width,
                                    mixed_script_tokens(text)))
end
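To see what the rewriting step produces, here is a reduced sketch of the gsub block above. `neutralize` is a hypothetical name; it returns only the rewritten string and omits the warning bookkeeping, since Result and warnings_for are not shown on this page. The hostile input is invented for the example.

```ruby
BIDI_OVERRIDES = [*0x202a..0x202e, *0x2066..0x2069].freeze
ZERO_WIDTH = [0x200b, 0x200c, 0x200d, 0xfeff].freeze
SUSPECT = /[\u0000-\u0009\u000b-\u001f\u007f-\u009f\u200b-\u200d\u202a-\u202e\u2066-\u2069\ufeff]/

# Reduced copy of the rewriting logic: same escapes, no warning collection.
def neutralize(text)
  text.gsub(SUSPECT) do |ch|
    cp = ch.ord
    if cp == 0x09
      '\\t'
    elsif BIDI_OVERRIDES.include?(cp) || ZERO_WIDTH.include?(cp)
      format('\\u{%04x}', cp)
    else
      format('\\x%02x', cp)
    end
  end
end

# Carriage return + "erase line" sequence that would hide "ls" on a terminal,
# followed by a real newline, a tab, and a right-to-left override.
hostile = "ls\r\e[2Krm -rf ~\n\tdone\u202e"
puts neutralize(hostile)
# Prints two lines:
#   ls\x0d\x1b[2Krm -rf ~
#   \tdone\u{202e}
```

The newline survives as a real line break while every other suspect character becomes printable text, so what the user reads is what would run.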