Module: Canon::DiffFormatter::DiffDetailFormatterHelpers::TextUtils
- Defined in:
- lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb
Overview
Text utility methods for diff formatting
Provides helper methods for text manipulation and visualization.
Class Method Summary
- .ambiguous_text_pair?(text1, text2) ⇒ Boolean
  Whether two text values would be visually indistinguishable when rendered through the standard JSON-quoting path.
- .escape_for_display(text) ⇒ String
  Escape non-ASCII and non-printable characters for display.
- .extract_content_preview(node, max_length = 50) ⇒ String
  Extract a content preview from a node.
- .needs_escaping?(text) ⇒ Boolean
  Check if text contains non-ASCII or non-printable characters.
- .truncate_text(text, max_length) ⇒ String
  Truncate text to a maximum length with ellipsis.
- .visualize_whitespace(text) ⇒ String
  Visualize whitespace characters in text.
Class Method Details
.ambiguous_text_pair?(text1, text2) ⇒ Boolean
Whether two text values would be visually indistinguishable when rendered through the standard JSON-quoting path.
Covers three cases in which both sides collapse to near-identical short strings, such as "" vs. " " or ":" vs. ":":
* both sides are empty (nil counts as empty)
* both sides are whitespace-only (possibly with different whitespace that JSON.generate preserves verbatim but a reader cannot tell apart from plain spaces)
* both sides are equal (the comparator reported a diff based on something the text-only extraction does not surface, e.g. a sibling text node that exists on one side and not the other)
Callers should fall back to rendering parent-element context instead.
    # File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 105

    def self.ambiguous_text_pair?(text1, text2)
      blank_or_whitespace = ->(t) { t.nil? || t.empty? || t.match?(/\A\s+\z/) }
      return true if blank_or_whitespace.call(text1) && blank_or_whitespace.call(text2)

      text1 == text2
    end
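The three cases can be exercised with a small standalone sketch. It copies the method body shown in the source listing into a hypothetical `AmbiguitySketch` module (not part of canon) so that it runs without loading the gem; the sample inputs are illustrative only.

```ruby
# AmbiguitySketch is a hypothetical stand-in that mirrors the documented
# method body, so the example runs without the canon gem installed.
module AmbiguitySketch
  def self.ambiguous_text_pair?(text1, text2)
    blank_or_whitespace = ->(t) { t.nil? || t.empty? || t.match?(/\A\s+\z/) }
    return true if blank_or_whitespace.call(text1) && blank_or_whitespace.call(text2)

    text1 == text2
  end
end

puts AmbiguitySketch.ambiguous_text_pair?("", nil)       # both sides empty
puts AmbiguitySketch.ambiguous_text_pair?(" ", "\t\n")   # both whitespace-only
puts AmbiguitySketch.ambiguous_text_pair?("abc", "abc")  # equal text
puts AmbiguitySketch.ambiguous_text_pair?("abc", " ")    # distinguishable
```

The first three calls return true and the last returns false, so only the last pair would be rendered as a plain text diff.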
.escape_for_display(text) ⇒ String
Escape non-ASCII and non-printable characters for display.
Converts characters outside the printable ASCII range (32-126), plus the double quote (34) and the backslash (92), to \uXXXX escape sequences. This keeps special characters such as the non-breaking space (\u00A0) and the em dash (\u2014) visible in terminal output. Returns an empty string for nil.
    # File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 72

    def self.escape_for_display(text)
      return "" if text.nil?

      text.chars.map do |c|
        codepoint = c.ord
        if codepoint < 32 || codepoint >= 127 || codepoint == 34 || codepoint == 92
          # Escape control characters, non-ASCII, double-quote, and backslash
          "\\u#{codepoint.to_s(16).upcase.rjust(4, '0')}"
        else
          c
        end
      end.join
    end
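To make the escaping rule concrete, the sketch below copies the method body from the source listing into a hypothetical `EscapeSketch` module (an assumption made so the example is runnable on its own) and prints a few escaped strings.

```ruby
# EscapeSketch is a hypothetical stand-in that mirrors the documented
# method body; it is not the canon module itself.
module EscapeSketch
  def self.escape_for_display(text)
    return "" if text.nil?

    text.chars.map do |c|
      codepoint = c.ord
      if codepoint < 32 || codepoint >= 127 || codepoint == 34 || codepoint == 92
        # Four-digit, upper-case hex, zero-padded on the left
        "\\u#{codepoint.to_s(16).upcase.rjust(4, '0')}"
      else
        c
      end
    end.join
  end
end

puts EscapeSketch.escape_for_display("a\u00A0b")     # a\u00A0b
puts EscapeSketch.escape_for_display("x\u2014y")     # x\u2014y
puts EscapeSketch.escape_for_display("say \"hi\"")   # say \u0022hi\u0022
puts EscapeSketch.escape_for_display("tab\there")    # tab\u0009here
```

Because the double quote and the backslash are escaped as well, the output never contains a literal `"` or `\` other than the backslash that introduces each escape sequence.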
.extract_content_preview(node, max_length = 50) ⇒ String
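Extract a content preview from a node.
Reads the node's text (via text, then content, then to_s), strips it, collapses each run of whitespace to a single space, and truncates the result with truncate_text. Returns an empty string for a nil node or empty text.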
    # File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 45

    def self.extract_content_preview(node, max_length = 50)
      return "" unless node

      text = if node.respond_to?(:text)
               node.text
             elsif node.respond_to?(:content)
               node.content
             else
               node.to_s
             end

      return "" if text.nil? || text.empty?

      # Clean up whitespace
      text = text.strip.gsub(/\s+/, " ")

      truncate_text(text, max_length)
    end
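The lookup order (text, then content, then to_s) can be tried out with plain Ruby objects. In this sketch, `PreviewSketch`, `FakeTextNode` and `FakeContentNode` are hypothetical names introduced for illustration; the two method bodies are copied from the source listings so the example runs without canon or an XML library.

```ruby
# PreviewSketch mirrors the documented method bodies; the Struct-based
# "nodes" below are hypothetical stand-ins for real parser nodes.
module PreviewSketch
  def self.truncate_text(text, max_length)
    return "" if text.nil?

    text.length > max_length ? "#{text[0...max_length]}..." : text
  end

  def self.extract_content_preview(node, max_length = 50)
    return "" unless node

    text = if node.respond_to?(:text)
             node.text
           elsif node.respond_to?(:content)
             node.content
           else
             node.to_s
           end

    return "" if text.nil? || text.empty?

    # Trim the ends and collapse inner whitespace runs to one space
    text = text.strip.gsub(/\s+/, " ")

    truncate_text(text, max_length)
  end
end

FakeTextNode = Struct.new(:text)
FakeContentNode = Struct.new(:content)

puts PreviewSketch.extract_content_preview(FakeTextNode.new("  Hello,\n   world  "))  # Hello, world
puts PreviewSketch.extract_content_preview(FakeContentNode.new("from content"))       # from content
puts PreviewSketch.extract_content_preview("plain  string")                           # plain string
puts PreviewSketch.extract_content_preview(FakeTextNode.new("Hello, world"), 5)       # Hello...
```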
.needs_escaping?(text) ⇒ Boolean
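Check if text contains non-ASCII or non-printable characters.
Also returns true for a double quote (34) or a backslash (92), the same character set that escape_for_display escapes. Returns false for nil.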
    # File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 119

    def self.needs_escaping?(text)
      return false if text.nil?

      text.each_char.any? do |c|
        codepoint = c.ord
        codepoint < 32 || codepoint >= 127 || codepoint == 34 || codepoint == 92
      end
    end
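A few sample inputs show which strings trigger the predicate. As before, this is a standalone sketch: `NeedsEscapingSketch` is a hypothetical module that copies the method body from the source listing.

```ruby
# NeedsEscapingSketch is a hypothetical stand-in mirroring the documented
# method body, so the example runs without the canon gem.
module NeedsEscapingSketch
  def self.needs_escaping?(text)
    return false if text.nil?

    text.each_char.any? do |c|
      codepoint = c.ord
      codepoint < 32 || codepoint >= 127 || codepoint == 34 || codepoint == 92
    end
  end
end

puts NeedsEscapingSketch.needs_escaping?("plain ASCII")  # false
puts NeedsEscapingSketch.needs_escaping?("caf\u00E9")    # true  (non-ASCII)
puts NeedsEscapingSketch.needs_escaping?("a\"b")         # true  (double quote)
puts NeedsEscapingSketch.needs_escaping?("line\nbreak")  # true  (control character)
puts NeedsEscapingSketch.needs_escaping?(nil)            # false
```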
.truncate_text(text, max_length) ⇒ String
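Truncate text to a maximum length with ellipsis.
Text longer than max_length is cut to its first max_length characters and "..." is appended, so the result can be up to max_length + 3 characters long. Shorter text is returned unchanged, and nil yields an empty string.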
    # File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 15

    def self.truncate_text(text, max_length)
      return "" if text.nil?

      text.length > max_length ? "#{text[0...max_length]}..." : text
    end
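The boundary behaviour is easiest to see with inputs just below, at, and above the limit. This sketch copies the method body from the source listing into a hypothetical `TruncateSketch` module so it can run on its own.

```ruby
# TruncateSketch is a hypothetical stand-in mirroring the documented
# method body; it is not the canon module itself.
module TruncateSketch
  def self.truncate_text(text, max_length)
    return "" if text.nil?

    text.length > max_length ? "#{text[0...max_length]}..." : text
  end
end

puts TruncateSketch.truncate_text("abc", 5)         # abc       (shorter than the limit)
puts TruncateSketch.truncate_text("abcde", 5)       # abcde     (exactly at the limit)
puts TruncateSketch.truncate_text("abcdefghij", 5)  # abcde...  (8 characters in total)
```

Callers that need a hard upper bound on output width would have to account for the three extra ellipsis characters themselves.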
.visualize_whitespace(text) ⇒ String
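Visualize whitespace characters in text.
Shows spaces as ·, tabs as →, and newlines as ¬. Unicode whitespace is shown as a tag: non-breaking space (U+00A0) as <NBSP>, line separator (U+2028) as <LSEP>, and paragraph separator (U+2029) as <PSEP>. Returns an empty string for nil.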
    # File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 28

    def self.visualize_whitespace(text)
      return "" if text.nil?

      text
        .gsub(" ", "·")
        .gsub("\t", "→")
        .gsub("\n", "¬")
        .gsub("\u00A0", "<NBSP>") # Non-breaking space
        .gsub("\u2028", "<LSEP>") # Line separator
        .gsub("\u2029", "<PSEP>") # Paragraph separator
    end
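The substitutions can be checked with a short standalone sketch. `WhitespaceSketch` is a hypothetical module that copies the method body from the source listing; the sample strings are illustrative.

```ruby
# WhitespaceSketch is a hypothetical stand-in mirroring the documented
# method body, so the example runs without the canon gem.
module WhitespaceSketch
  def self.visualize_whitespace(text)
    return "" if text.nil?

    text
      .gsub(" ", "·")
      .gsub("\t", "→")
      .gsub("\n", "¬")
      .gsub("\u00A0", "<NBSP>") # Non-breaking space
      .gsub("\u2028", "<LSEP>") # Line separator
      .gsub("\u2029", "<PSEP>") # Paragraph separator
  end
end

puts WhitespaceSketch.visualize_whitespace("a b\tc\n")      # a·b→c¬
puts WhitespaceSketch.visualize_whitespace("x\u00A0y")      # x<NBSP>y
puts WhitespaceSketch.visualize_whitespace("p\u2028q\u2029") # p<LSEP>q<PSEP>
```

Characters outside this list, such as a carriage return, pass through unchanged in the listing above.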