Module: Clacky::Tools::Terminal::OutputCleaner

Defined in:
lib/clacky/tools/terminal/output_cleaner.rb

Overview

Output cleaning for raw PTY bytes.

A PTY emits whatever the child writes plus terminal control codes. Since the Terminal tool is targeted at LINE-BASED interactive shells (not full-screen TUIs like vim/top), we aggressively strip visual control sequences rather than maintain a screen model.

Cleaning steps (in order):

1. Strip CSI sequences       (ESC[...letter)   — colors, cursor, SGR
2. Strip OSC sequences       (ESC]...BEL/ST)   — window title, etc.
3. Strip simple 2-byte esc   (ESC= / ESC>)     — keypad modes
4. Collapse \r-overwrites    (spinner/progress)
5. Drop backspace erase      (char + \x08)
6. Normalize CRLF → LF

This is lossy for full-screen apps (you’ll see a pile of text without cursor positioning), but for line-based commands it yields clean, diff-friendly output.

Constant Summary collapse

CSI_REGEX =
/\e\[[\d;?]*[a-zA-Z@]/.freeze
OSC_REGEX =
/\e\].*?(\a|\e\\)/m.freeze
SIMPLE_ESC_REGEX =
/\e[=>\(\)].?/.freeze
BACKSPACE_REGEX =
/[^\x08]\x08/.freeze

Class Method Summary collapse

Class Method Details

.clean(raw) ⇒ String

Clean raw PTY bytes for LLM consumption.

Parameters:

  • raw (String)

    raw PTY bytes

Returns:

  • (String)

    cleaned, UTF-8-safe text



35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
# File 'lib/clacky/tools/terminal/output_cleaner.rb', line 35

def clean(raw)
  return "" if raw.nil? || raw.empty?

  s = raw.dup
  s.force_encoding(Encoding::UTF_8)
  s = s.scrub("?") unless s.valid_encoding?

  s = s.gsub(CSI_REGEX, "")
  s = s.gsub(OSC_REGEX, "")
  s = s.gsub(SIMPLE_ESC_REGEX, "")

  # Handle \r overwrites within each line. "50%\r100%" → "100%".
  # Split on \n KEEPING the terminators (-1 preserves trailing empty),
  # then for each segment keep only the portion after the last \r
  # (which is what would actually be visible).
  s = s.split("\n", -1).map { |line| line.split("\r").last || "" }.join("\n")

  # Erase "X\b" pairs repeatedly (readline rubout).
  s = s.gsub(BACKSPACE_REGEX, "") while s =~ BACKSPACE_REGEX

  # Normalize any leftover isolated \r.
  s = s.gsub(/\r/, "")

  s
end