Module: Pikuri::Sanitizer

Defined in: lib/pikuri/sanitizer.rb

Overview

Renders attacker-controlled text safe to display, and reports why it was unsafe.

Every string an LLM composes is untrusted: a bash command, a tool observation echoed back to the user, a description it wrote for a confirmation prompt. A model that is broken — or, far more likely, being driven by a prompt injection — can embed bytes that a terminal acts on rather than prints: a carriage return that overwrites the line the user just read, an ESC that recolors or repositions, a backspace that erases, a bidirectional override that reorders text so it reads differently than it runs, a zero-width character that hides in plain sight, or a Cyrillic а masquerading as a Latin a. The whole point of a confirmation prompt collapses if the bytes the user approves are not the bytes that execute.

Sanitizer.sanitize is the one chrome-independent primitive every renderer (terminal, TUI, web) routes through. It does two things and returns both as a Result:

Neutralize: make the dangerous bytes visible without changing structure. Control bytes become \xNN, bidi/zero-width codepoints become \u{NNNN}, tab becomes \t. Newlines are preserved (multi-line commands are normal). This is *faithful, not beautifying*: it never collapses runs of whitespace or rewrites a tab to a space, because the user must see exactly what they are approving; a Makefile’s leading tab stays visibly a tab. A web chrome composes html_escape(sanitize(s).text); the HTML layer is the caller’s, not ours.
Warn: return a Warning per category detected, each a semantic record (kind + offending tokens + a plain-English explanation). Presentation is the chrome’s: a terminal renders these in bold yellow, a web client as a banner. The Warning carries no color or markup.
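The neutralize step described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `neutralize_sketch` and `SUSPECT_SKETCH` are hypothetical names, the pattern mirrors the SUSPECT constant documented on this page, and the uppercase, zero-padded hex formatting is a guess at the exact output shape.

```ruby
# Hypothetical sketch of the neutralize step; not Pikuri's actual code.
# The pattern mirrors the SUSPECT constant documented on this page.
SUSPECT_SKETCH =
  /[\u0000-\u0009\u000b-\u001f\u007f-\u009f\u200b-\u200d\u202a-\u202e\u2066-\u2069\ufeff]/

def neutralize_sketch(text)
  # Block-form gsub inserts the block's return value literally.
  text.gsub(SUSPECT_SKETCH) do |ch|
    cp = ch.ord
    if cp == 0x09
      '\t'                    # tab stays visibly a tab
    elsif cp <= 0x9f
      format('\x%02X', cp)    # C0 controls, DEL and C1 controls
    else
      format('\u{%04X}', cp)  # bidi and zero-width codepoints
    end
  end
end

puts neutralize_sketch("echo ok\rrm -rf ~")  # carriage return made visible
puts neutralize_sketch("a\u202Eb")           # bidi override made visible
```

Newlines never match the pattern, so multi-line input keeps its line structure while every other suspect codepoint is replaced by a printable escape.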

Scope (deliberately closed)

Detection covers the *invisibility / cursor-control / reordering* attack classes completely, because each is a finite, enumerable set of codepoints: C0 controls, C1 controls (a second ANSI introducer on some emulators), DEL, the bidi overrides, and the zero-width characters. On top of that, Sanitizer.sanitize flags *mixed-script tokens*: a single word combining letters from two or more of Latin, Cyrillic, and Greek. This is the signature of a homoglyph spoof and has near-zero false positives on real text (humans do not weld two alphabets inside one word; café is all-Latin, Москва all-Cyrillic, only Pаypal mixes).

Two confusable classes are explicitly *out of scope*, because detecting them needs Unicode confusables tables and produces heavy false positives on legitimate multilingual text:

Whole-script homoglyphs — an entirely-Cyrillic string that merely looks Latin (no mixing to detect).
Single-symbol confusables — the Greek question mark ; (U+037E) that looks like a semicolon, full-width forms, the division slash.

“Solid” here means complete on the classes above, not exhaustive over all of Unicode.

Defined Under Namespace

Classes: Result, Warning

Constant Summary

BIDI_OVERRIDES = Bidirectional-override codepoints: the explicit embedding/override set (LRE, RLE, PDF, LRO, RLO; U+202A–U+202E) plus the isolate set (LRI, RLI, FSI, PDI; U+2066–U+2069). These are the codepoints used in reordering attacks.

[*0x202a..0x202e, *0x2066..0x2069].freeze

ZERO_WIDTH = Zero-width and invisible codepoints: ZWSP, ZWNJ, ZWJ, and the BOM / zero-width no-break space.

[0x200b, 0x200c, 0x200d, 0xfeff].freeze

SUSPECT = Codepoints sanitize rewrites: C0 controls including tab (U+0009) but excluding newline (U+000A, which passes through untouched), DEL and the C1 controls (U+007F–U+009F), the zero-width set, and the bidi overrides. Newline is the one control character a faithful render must keep, so the C0 range is split around it.

/[\u0000-\u0009\u000b-\u001f\u007f-\u009f\u200b-\u200d\u202a-\u202e\u2066-\u2069\ufeff]/

CONFUSABLE_SCRIPTS = The three mutually confusable scripts (Latin, Cyrillic, Greek) whose mixing inside one token signals a homoglyph spoof. Punctuation, digits and spaces belong to the Common script and match none of these, so they never count toward the “two distinct scripts” threshold.

{ 'Latin' => /\p{Latin}/, 'Cyrillic' => /\p{Cyrillic}/, 'Greek' => /\p{Greek}/ }.freeze
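One way the constants above could drive per-category warnings is sketched below. Everything named here is hypothetical: `WarningSketch`, its field names, the kind symbols, and `detect_sketch` are illustrative, since the real Warning class's attributes are not shown on this page. Whether a tab should produce a warning is also not stated; this sketch treats every codepoint that sanitize rewrites as reportable.

```ruby
# Hypothetical sketch: group offending codepoints into warning categories.
# Field names and kind symbols are assumptions, not Pikuri's API.
WarningSketch = Struct.new(:kind, :tokens, :explanation)

BIDI_SKETCH       = [*0x202a..0x202e, *0x2066..0x2069].freeze
ZERO_WIDTH_SKETCH = [0x200b, 0x200c, 0x200d, 0xfeff].freeze

def category_of(cp)
  return :bidi       if BIDI_SKETCH.include?(cp)
  return :zero_width if ZERO_WIDTH_SKETCH.include?(cp)
  # C0 controls except newline, plus DEL and the C1 range.
  return :control    if (cp <= 0x1f && cp != 0x0a) || (0x7f..0x9f).cover?(cp)

  nil
end

def detect_sketch(text)
  hits = Hash.new { |h, k| h[k] = [] }
  text.each_char do |ch|
    kind = category_of(ch.ord)
    hits[kind] << format('U+%04X', ch.ord) if kind
  end
  # One record per category, in first-seen order, with distinct tokens.
  hits.map do |kind, cps|
    WarningSketch.new(kind, cps.uniq, "#{kind} codepoints present")
  end
end

detect_sketch("ok\rbad\u200b").each do |w|
  puts "#{w.kind}: #{w.tokens.join(', ')}"
end
```

The records carry only data, so a terminal or web chrome remains free to decide how to present them.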

Class Method Summary

.mixed_script_tokens(text) ⇒ Array<String>

Tokens (whitespace-delimited runs) that combine letters from two or more of CONFUSABLE_SCRIPTS, the homoglyph-spoof signature.

.sanitize(text) ⇒ Result

Neutralize text for literal display and report what was flagged.

Class Method Details

.mixed_script_tokens(text) ⇒ `Array<String>`

Tokens (whitespace-delimited runs) that combine letters from two or more of CONFUSABLE_SCRIPTS — the homoglyph-spoof signature.

Parameters:

text (String)

Returns:

(Array<String>): distinct offending tokens, in first-seen order

# File 'lib/pikuri/sanitizer.rb', line 137

def self.mixed_script_tokens(text)
  text.split(/\s+/).reject(&:empty?).select do |token|
    CONFUSABLE_SCRIPTS.count { |_name, re| token.match?(re) } >= 2
  end.uniq
end
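A short usage sketch follows. The method body is copied from the listing above into a hypothetical stand-in module, `SanitizerSketch`, so the example runs without the gem. The sample strings use \u escapes so the Cyrillic letters are unambiguous.

```ruby
# Stand-in module so the example runs without the gem; the constant and
# method body are the ones documented on this page.
module SanitizerSketch
  CONFUSABLE_SCRIPTS = {
    'Latin' => /\p{Latin}/, 'Cyrillic' => /\p{Cyrillic}/, 'Greek' => /\p{Greek}/
  }.freeze

  def self.mixed_script_tokens(text)
    text.split(/\s+/).reject(&:empty?).select do |token|
      CONFUSABLE_SCRIPTS.count { |_name, re| token.match?(re) } >= 2
    end.uniq
  end
end

spoof  = "P\u0430ypal"                           # Latin P, Cyrillic U+0430, Latin ypal
moscow = "\u041c\u043e\u0441\u043a\u0432\u0430"  # all-Cyrillic word

# Only the mixed token is reported, and only once despite appearing twice.
p SanitizerSketch.mixed_script_tokens("caf\u00e9 #{moscow} #{spoof} #{spoof}")

# A whole-script homoglyph (four Cyrillic letters resembling "copa") is out
# of scope: there is no mixing to detect, so nothing is flagged.
p SanitizerSketch.mixed_script_tokens("\u0441\u043e\u0440\u0430")
```

Because tokens are split on whitespace only, punctuation and digits stay attached to their word but never contribute a script of their own.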

.sanitize(text) ⇒ `Result`

Neutralize text for literal display and report what was flagged.