Class: Canon::Diff::FormattingDetector

Inherits:
Object
Defined in:
lib/canon/diff/formatting_detector.rb

Overview

Detects whether differences between lines are formatting-only (whitespace, line breaks), with no semantic content changes.

Constant Summary

CASE_INSENSITIVE_ATTRS =

Attribute names whose values are compared case-insensitively. Per the XML specification, character encoding names should be matched case-insensitively (e.g., “UTF-8” equals “utf-8”). The value of the standalone declaration is treated as case-insensitive as well.

%w[encoding standalone].freeze
QUOTE_CHARS =
["\"", "'"].freeze
SKIP_CHARS =
[" ", "="].freeze
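
How the detector applies CASE_INSENSITIVE_ATTRS is not shown on this page. As a rough illustration only, the sketch below assumes the values of the listed attributes are downcased before comparison; the module name CaseFoldSketch and its fold helper are hypothetical and are not part of Canon.

```ruby
module CaseFoldSketch
  CASE_INSENSITIVE_ATTRS = %w[encoding standalone].freeze

  # Hypothetical helper, not part of Canon: downcase the quoted value of
  # each listed attribute and leave everything else untouched.
  def self.fold(line)
    CASE_INSENSITIVE_ATTRS.reduce(line) do |text, attr|
      text.gsub(/\b(#{attr}\s*=\s*)(["'])(.*?)\2/) do
        "#{$1}#{$2}#{$3.downcase}#{$2}"
      end
    end
  end
end

upper = %(<?xml version="1.0" encoding="UTF-8"?>)
lower = %(<?xml version="1.0" encoding="utf-8"?>)
p CaseFoldSketch.fold(upper) == CaseFoldSketch.fold(lower) # => true
p CaseFoldSketch.fold(%(<a title="Hi">))                   # => "<a title=\"Hi\">"
```

Attributes that are not in the list, such as title above, keep their original case, so a case change there would still count as a content change under this assumption.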

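Class Method Summary

  • .formatting_block?(old_parts, new_parts) ⇒ Boolean

    Detect if a block of consecutive line changes is formatting-only.

  • .formatting_only?(line1, line2) ⇒ Boolean

    Detect if two lines differ only in formatting.

  • .formatting_prefix(old_parts, new_parts) ⇒ Hash?

    Find the largest formatting-only prefix within old/new parts.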

Class Method Details

.formatting_block?(old_parts, new_parts) ⇒ Boolean

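Detect if a block of consecutive line changes is formatting-only. Joins the old parts and the new parts with single spaces and compares the two joined strings as a whole, which handles multi-line tag wrapping (e.g., a tag split across 2 lines vs 1 line). Returns false if either array is empty.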

Parameters:

  • old_parts (Array<String>)

    Old line contents in the block

  • new_parts (Array<String>)

    New line contents in the block

Returns:

  • (Boolean)

    true if the joined content differs only in formatting



# File 'lib/canon/diff/formatting_detector.rb', line 65

def self.formatting_block?(old_parts, new_parts)
  return false if old_parts.empty? || new_parts.empty?

  formatting_only?(old_parts.join(" "), new_parts.join(" "))
end
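
The join-and-compare behavior described above can be tried with a small stand-in. This is a sketch, not the library itself: the module name BlockSketch is hypothetical, and its formatting_only? assumes that normalization means trimming and collapsing whitespace runs, since the real normalize_for_comparison is not shown on this page.

```ruby
module BlockSketch
  # Assumed comparison: blank input never matches; otherwise compare the
  # strings after trimming and collapsing whitespace runs.
  def self.formatting_only?(a, b)
    return false if a.nil? || b.nil? || a.strip.empty? || b.strip.empty?

    a.strip.gsub(/\s+/, " ") == b.strip.gsub(/\s+/, " ")
  end

  # Same shape as the method documented above.
  def self.formatting_block?(old_parts, new_parts)
    return false if old_parts.empty? || new_parts.empty?

    formatting_only?(old_parts.join(" "), new_parts.join(" "))
  end
end

wrapped = ["<item id=\"1\"", "      lang=\"en\">"]
single  = ["<item id=\"1\" lang=\"en\">"]

p BlockSketch.formatting_block?(wrapped, single)                          # => true
p BlockSketch.formatting_block?(wrapped, [])                              # => false
p BlockSketch.formatting_block?(wrapped, ["<item id=\"2\" lang=\"en\">"]) # => false
```

The first call shows the multi-line wrapping case: the tag split across two old lines joins into the same normalized string as the single new line.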

.formatting_only?(line1, line2) ⇒ Boolean

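Detect if two lines differ only in formatting. Returns false if either line is nil or blank; otherwise compares the normalized forms of both lines.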

Parameters:

  • line1 (String, nil)

    First line to compare

  • line2 (String, nil)

    Second line to compare

Returns:

  • (Boolean)

    true if lines differ only in formatting



# File 'lib/canon/diff/formatting_detector.rb', line 13

def self.formatting_only?(line1, line2)
  # If both are nil or empty, not a formatting diff (no difference)
  return false if blank?(line1) && blank?(line2)

  # If only one is blank, it's not just formatting
  return false if blank?(line1) || blank?(line2)

  # Compare normalized versions
  normalize_for_comparison(line1) == normalize_for_comparison(line2)
end
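
The blank?/normalize_for_comparison helpers used above are private and not documented here. The sketch below shows the overall control flow under assumed helper definitions (nil or whitespace-only counts as blank; normalization trims and collapses whitespace). The module name FormattingSketch is hypothetical, and the real normalization may do more, for example with quotes or case-insensitive attributes.

```ruby
module FormattingSketch
  # Assumption: nil and whitespace-only strings count as blank.
  def self.blank?(line)
    line.nil? || line.strip.empty?
  end

  # Assumption: normalization trims and collapses whitespace runs.
  def self.normalize_for_comparison(line)
    line.strip.gsub(/\s+/, " ")
  end

  def self.formatting_only?(line1, line2)
    # A blank side means there is nothing to compare as formatting.
    return false if blank?(line1) || blank?(line2)

    normalize_for_comparison(line1) == normalize_for_comparison(line2)
  end
end

p FormattingSketch.formatting_only?("<a  href='x'>", "  <a href='x'>") # => true
p FormattingSketch.formatting_only?("<a>", "<b>")                      # => false
p FormattingSketch.formatting_only?(nil, "")                           # => false
p FormattingSketch.formatting_only?("<a>", "   ")                      # => false
```

In this sketch the two guard clauses of the original collapse into one, because the "either side blank" check already covers the "both sides blank" case.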

.formatting_prefix(old_parts, new_parts) ⇒ Hash?

Find the largest formatting-only prefix within old/new parts. Tries all (old_end, new_end) combinations and returns the last match in iteration order, i.e. the one with the largest old_end and, among those, the largest new_end. Handles mixed-element blocks where the leading elements are formatting-only but later elements are not.

Parameters:

  • old_parts (Array<String>)

    Old line contents

  • new_parts (Array<String>)

    New line contents

Returns:

  • (Hash, nil)

    { old_end:, new_end: } giving the number of leading old and new parts that form the prefix, or nil if no prefix matches



# File 'lib/canon/diff/formatting_detector.rb', line 79

def self.formatting_prefix(old_parts, new_parts)
  best = nil

  (1..old_parts.length).each do |old_end|
    (1..new_parts.length).each do |new_end|
      if formatting_only?(old_parts[0...old_end].join(" "),
                          new_parts[0...new_end].join(" "))
        best = { old_end: old_end, new_end: new_end }
      end
    end
  end

  best
end
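
To see which prefix the nested loops select, the sketch below runs the same search against a stand-in comparison. The module name PrefixSketch is hypothetical, and its formatting_only? assumes whitespace-collapsing normalization; only the loop structure mirrors the method above.

```ruby
module PrefixSketch
  # Assumed comparison: blank input never matches; otherwise compare the
  # strings after trimming and collapsing whitespace runs.
  def self.formatting_only?(a, b)
    return false if a.strip.empty? || b.strip.empty?

    a.strip.gsub(/\s+/, " ") == b.strip.gsub(/\s+/, " ")
  end

  # Same search as the method documented above: later matches overwrite
  # earlier ones, so the largest old_end wins.
  def self.formatting_prefix(old_parts, new_parts)
    best = nil

    (1..old_parts.length).each do |old_end|
      (1..new_parts.length).each do |new_end|
        if formatting_only?(old_parts[0...old_end].join(" "),
                            new_parts[0...new_end].join(" "))
          best = { old_end: old_end, new_end: new_end }
        end
      end
    end

    best
  end
end

old_parts = ["<a", "href='x'>", "old text"]
new_parts = ["<a href='x'>", "new text"]

p PrefixSketch.formatting_prefix(old_parts, new_parts) # => {:old_end=>2, :new_end=>1} (Ruby 3.4+ prints {old_end: 2, new_end: 1})
p PrefixSketch.formatting_prefix(["<a>"], ["<b>"])     # => nil
```

Here the first two old lines and the first new line form the formatting-only prefix, while the trailing text lines differ in content and are left out. Because every (old_end, new_end) pair is checked and each check joins its slices again, the search cost grows with the product of the two array lengths.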