Module: Canon::DiffFormatter::DiffDetailFormatterHelpers::TextUtils

Defined in:
lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb

Overview

Text utility methods for diff formatting

Provides helper methods for text manipulation and visualization.

Class Method Details

.ambiguous_text_pair?(text1, text2) ⇒ Boolean

Whether two text values would be visually indistinguishable when rendered through the standard JSON-quoting path.

Covers three cases that collapse to near-identical short strings like “” / “ ” / “:” / “:”:

* both sides empty
* both sides whitespace-only (possibly with different whitespace
  that JSON.generate preserves verbatim but a reader cannot tell
  apart from plain spaces)
* both sides equal (the comparator reported a diff based on
  something the text-only extraction does not surface — e.g. a
  sibling text node that exists on one side and not the other)

Callers should fall back to rendering parent-element context instead.

Parameters:

  • text1 (String, nil)
  • text2 (String, nil)

Returns:

  • (Boolean)


# File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 105

def self.ambiguous_text_pair?(text1, text2)
  blank_or_whitespace = ->(t) {
    t.nil? || t.empty? || t.match?(/\A\s+\z/)
  }
  return true if blank_or_whitespace.call(text1) &&
    blank_or_whitespace.call(text2)

  text1 == text2
end
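
The three cases described above can be checked with a small script. This is a sketch rather than the library's own test suite: `TextUtilsSketch` is a hypothetical standalone module that copies the listing so the example runs without loading the canon gem, and the sample pairs are illustrative.

```ruby
# Hypothetical standalone copy of the listing above (not the real module path).
module TextUtilsSketch
  def self.ambiguous_text_pair?(text1, text2)
    blank_or_whitespace = ->(t) { t.nil? || t.empty? || t.match?(/\A\s+\z/) }
    return true if blank_or_whitespace.call(text1) && blank_or_whitespace.call(text2)

    text1 == text2
  end
end

# Illustrative inputs, one per case plus two counter-examples.
pairs = {
  "both empty"     => ["", ""],
  "nil vs spaces"  => [nil, "   "],
  "tab vs newline" => ["\t", "\n"],
  "identical text" => ["abc", "abc"],
  "different text" => ["abc", "abd"],
  "blank vs text"  => ["", "abc"]
}

results = pairs.transform_values do |(left, right)|
  TextUtilsSketch.ambiguous_text_pair?(left, right)
end

results.each { |label, ambiguous| puts "#{label}: #{ambiguous}" }
```

Only the last two pairs come back as false. A pair where one side is blank and the other carries text is not treated as ambiguous, because the quoted rendering already distinguishes them.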

.escape_for_display(text) ⇒ String

Escape non-ASCII and non-printable characters for display

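Converts control characters (code points below 32), DEL and everything above it (127 and up, which includes all non-ASCII characters), plus the double quote (34) and backslash (92), to \uXXXX escape sequences. This keeps special characters such as the non-breaking space (\u00A0) and the em-dash (\u2014) visible in terminal output.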

Parameters:

  • text (String)

    Text to escape

Returns:

  • (String)

    Escaped text safe for terminal display



# File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 72

def self.escape_for_display(text)
  return "" if text.nil?

  text.chars.map do |c|
    codepoint = c.ord
    if codepoint < 32 || codepoint >= 127 || codepoint == 34 || codepoint == 92
      # Escape control characters, non-ASCII, double-quote, and backslash
      "\\u#{codepoint.to_s(16).upcase.rjust(4, '0')}"
    else
      c
    end
  end.join
end
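
A few sample inputs show which characters the method rewrites. This is a minimal sketch: `EscapeSketch` is a hypothetical standalone module that copies the listing so the example runs on its own, and the sample strings are illustrative.

```ruby
# Hypothetical standalone copy of the listing above.
module EscapeSketch
  def self.escape_for_display(text)
    return "" if text.nil?

    text.chars.map do |c|
      codepoint = c.ord
      if codepoint < 32 || codepoint >= 127 || codepoint == 34 || codepoint == 92
        # Control characters, non-ASCII, double quote, and backslash
        "\\u#{codepoint.to_s(16).upcase.rjust(4, '0')}"
      else
        c
      end
    end.join
  end
end

nbsp_sample  = EscapeSketch.escape_for_display("a\u00A0b")  # non-breaking space
dash_sample  = EscapeSketch.escape_for_display("x\u2014y")  # em-dash
quote_sample = EscapeSketch.escape_for_display('say "hi"')  # double quotes
tab_sample   = EscapeSketch.escape_for_display("a\tb")      # control character
nil_sample   = EscapeSketch.escape_for_display(nil)         # nil becomes ""

puts nbsp_sample   # prints a\u00A0b
puts dash_sample   # prints x\u2014y
puts quote_sample  # prints say \u0022hi\u0022
puts tab_sample    # prints a\u0009b
```

Plain spaces (code point 32) pass through unchanged. Because the hex value is only padded to four digits, a code point above U+FFFF is emitted with five or more hex digits after `\u`.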

.extract_content_preview(node, max_length = 50) ⇒ String

Extract a content preview from a node

Parameters:

  • node (Object)

    Node to extract from

  • max_length (Integer) (defaults to: 50)

    Maximum length of preview

Returns:

  • (String)

    Content preview



# File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 45

def self.extract_content_preview(node, max_length = 50)
  return "" unless node

  text = if node.respond_to?(:text)
           node.text
         elsif node.respond_to?(:content)
           node.content
         else
           node.to_s
         end

  return "" if text.nil? || text.empty?

  # Clean up whitespace
  text = text.strip.gsub(/\s+/, " ")
  truncate_text(text, max_length)
end
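
The lookup order (`text`, then `content`, then `to_s`) can be exercised with duck-typed stand-ins. This is a sketch under stated assumptions: `PreviewSketch`, `TextNode`, and `ContentNode` are hypothetical names introduced for illustration; real callers would presumably pass parsed document nodes.

```ruby
# Hypothetical standalone copy of the two listings this method relies on.
module PreviewSketch
  def self.truncate_text(text, max_length)
    return "" if text.nil?

    text.length > max_length ? "#{text[0...max_length]}..." : text
  end

  def self.extract_content_preview(node, max_length = 50)
    return "" unless node

    text = if node.respond_to?(:text)
             node.text
           elsif node.respond_to?(:content)
             node.content
           else
             node.to_s
           end

    return "" if text.nil? || text.empty?

    # Collapse runs of whitespace into single spaces before truncating
    text = text.strip.gsub(/\s+/, " ")
    truncate_text(text, max_length)
  end
end

# Hypothetical node stand-ins: one exposes #text, the other only #content.
TextNode    = Struct.new(:text)
ContentNode = Struct.new(:content)

text_preview    = PreviewSketch.extract_content_preview(TextNode.new("  Hello,\n   world!  "))
content_preview = PreviewSketch.extract_content_preview(ContentNode.new("alpha beta gamma"), 10)
fallback        = PreviewSketch.extract_content_preview(42)  # falls back to to_s
missing         = PreviewSketch.extract_content_preview(nil)

puts text_preview     # prints Hello, world!
puts content_preview  # prints alpha beta...
puts fallback         # prints 42
```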

.needs_escaping?(text) ⇒ Boolean

Check if text contains any character that escape_for_display would escape: control characters, DEL, non-ASCII characters, double quotes, or backslashes

Parameters:

  • text (String)

    Text to check

Returns:

  • (Boolean)

    true if text needs escaping for display



# File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 119

def self.needs_escaping?(text)
  return false if text.nil?

  text.each_char.any? do |c|
    codepoint = c.ord
    codepoint < 32 || codepoint >= 127 || codepoint == 34 || codepoint == 92
  end
end
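
A short check of the predicate against a few inputs. This is a sketch: `NeedsEscapingSketch` is a hypothetical standalone copy of the listing, and the sample strings are illustrative.

```ruby
# Hypothetical standalone copy of the listing above.
module NeedsEscapingSketch
  def self.needs_escaping?(text)
    return false if text.nil?

    text.each_char.any? do |c|
      codepoint = c.ord
      codepoint < 32 || codepoint >= 127 || codepoint == 34 || codepoint == 92
    end
  end
end

plain     = NeedsEscapingSketch.needs_escaping?("plain ASCII text")  # false
nbsp      = NeedsEscapingSketch.needs_escaping?("a\u00A0b")          # true: non-ASCII
quoted    = NeedsEscapingSketch.needs_escaping?('"quoted"')          # true: double quote
backslash = NeedsEscapingSketch.needs_escaping?('C:\temp')           # true: literal backslash
empty     = NeedsEscapingSketch.needs_escaping?("")                  # false
missing   = NeedsEscapingSketch.needs_escaping?(nil)                 # false

puts [plain, nbsp, quoted, backslash, empty, missing].inspect
```

A caller could use this as a cheap guard and only invoke the escaping routine when the predicate returns true.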

.truncate_text(text, max_length) ⇒ String

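Truncate text to at most max_length characters, appending "..." when anything was cut (a truncated result is therefore max_length + 3 characters long)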

Parameters:

  • text (String)

    Text to truncate

  • max_length (Integer)

    Maximum length

Returns:

  • (String)

    Truncated text



# File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 15

def self.truncate_text(text, max_length)
  return "" if text.nil?

  text.length > max_length ? "#{text[0...max_length]}..." : text
end
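
The boundary behaviour is easiest to see with inputs just under, at, and over the limit. This is a sketch: `TruncateSketch` is a hypothetical standalone copy of the listing, and the sample strings are illustrative.

```ruby
# Hypothetical standalone copy of the listing above.
module TruncateSketch
  def self.truncate_text(text, max_length)
    return "" if text.nil?

    text.length > max_length ? "#{text[0...max_length]}..." : text
  end
end

short_text = TruncateSketch.truncate_text("short", 10)           # under the limit
exact_text = TruncateSketch.truncate_text("exactly 10", 10)      # exactly at the limit
long_text  = TruncateSketch.truncate_text("a long sentence", 6)  # over the limit

puts short_text        # prints short
puts exact_text        # prints exactly 10
puts long_text         # prints a long...
puts long_text.length  # prints 9
```

The ellipsis is added on top of `max_length`, so callers that need a hard width limit would have to pass a smaller value.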

.visualize_whitespace(text) ⇒ String

Visualize whitespace characters in text

Shows spaces as ·, tabs as →, and newlines as ¬, and replaces Unicode whitespace with named tags: non-breaking space as <NBSP>, line separator as <LSEP>, and paragraph separator as <PSEP>.

Parameters:

  • text (String)

    Text to visualize

Returns:

  • (String)

    Text with visible whitespace



# File 'lib/canon/diff_formatter/diff_detail_formatter/text_utils.rb', line 28

def self.visualize_whitespace(text)
  return "" if text.nil?

  text
    .gsub(" ", "·")
    .gsub("\t", "→")
    .gsub("\n", "¬")
    .gsub("\u00A0", "<NBSP>") # Non-breaking space
    .gsub("\u2028", "<LSEP>")    # Line separator
    .gsub("\u2029", "<PSEP>")    # Paragraph separator
end
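
The substitutions can be tried on a couple of short strings. This is a sketch: `WhitespaceSketch` is a hypothetical standalone module, and it assumes tabs map to →, as the method description states.

```ruby
# Hypothetical standalone version; tab replacement follows the description (→).
module WhitespaceSketch
  def self.visualize_whitespace(text)
    return "" if text.nil?

    text
      .gsub(" ", "·")
      .gsub("\t", "→")
      .gsub("\n", "¬")
      .gsub("\u00A0", "<NBSP>") # Non-breaking space
      .gsub("\u2028", "<LSEP>") # Line separator
      .gsub("\u2029", "<PSEP>") # Paragraph separator
  end
end

ascii_sample   = WhitespaceSketch.visualize_whitespace("a b\tc\n")
unicode_sample = WhitespaceSketch.visualize_whitespace("x\u00A0y\u2028z")

puts ascii_sample    # prints a·b→c¬
puts unicode_sample  # prints x<NBSP>y<LSEP>z
```

None of the replacement strings contain whitespace, so the order of the `gsub` calls does not affect the result.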