Class: Canon::TreeDiff::Matchers::HashMatcher

Inherits:
Object
Defined in:
lib/canon/tree_diff/matchers/hash_matcher.rb

Overview

HashMatcher performs fast, exact subtree matching.

Based on the BULD algorithm from XyDiff (Cobéna et al., 2002, INRIA):

  • Build signature map for tree1

  • Process nodes by weight (heaviest first)

  • Match identical subtrees via signature lookup

  • Propagate matches to ancestors

Complexity: O(n log n), where n is the number of nodes
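
The four steps above can be sketched in a few lines of Ruby. This is a minimal illustration under stated assumptions, not the library's implementation: SketchNode and hash_match are hypothetical names, the signature is assumed to be a SHA-256 digest of label, text and child signatures, the weight is assumed to be a plain node count, and ancestor propagation is assumed to continue for as long as the parent labels agree.

```ruby
require "digest/sha2"
require "set"

# Hypothetical stand-in for a tree node; not the library's TreeNode.
class SketchNode
  attr_reader :label, :text, :children
  attr_accessor :parent

  def initialize(label, text: nil, children: [])
    @label = label
    @text = text
    @children = children
    children.each { |child| child.parent = self }
  end

  # Assumed signature: digest of label, text and (ordered) child signatures.
  def signature
    @signature ||= Digest::SHA256.hexdigest(
      [label, text, *children.map(&:signature)].join("\u0000")
    )
  end

  # Assumed weight: the number of nodes in the subtree.
  def weight
    @weight ||= 1 + children.sum(&:weight)
  end

  # Pre-order traversal; returns an Enumerator when no block is given.
  def each_node(&block)
    return enum_for(:each_node) unless block

    yield self
    children.each { |child| child.each_node(&block) }
  end
end

# Returns a Hash mapping tree1 nodes to their matched tree2 nodes.
def hash_match(tree1, tree2)
  # Step 1: signature map for tree1 (signature => nodes carrying it).
  signature_map = {}
  tree1.each_node { |node| (signature_map[node.signature] ||= []) << node }

  pairs = {}
  matched1 = Set.new
  matched2 = Set.new
  record = lambda do |a, b|
    pairs[a] = b
    matched1 << a
    matched2 << b
  end

  # Step 2: visit tree2 nodes heaviest first.
  tree2.each_node.sort_by { |node| -node.weight }.each do |node2|
    next if matched2.include?(node2)

    # Step 3: constant-time signature lookup for an identical, unmatched subtree.
    candidates = signature_map.fetch(node2.signature, [])
    node1 = candidates.find { |candidate| !matched1.include?(candidate) }
    next unless node1

    # Identical signatures imply identical shape, so pair nodes positionally.
    node1.each_node.to_a.zip(node2.each_node.to_a).each { |a, b| record.call(a, b) }

    # Step 4: propagate the match upward while the parents look alike.
    a = node1.parent
    b = node2.parent
    while a && b && a.label == b.label &&
          !matched1.include?(a) && !matched2.include?(b)
      record.call(a, b)
      a = a.parent
      b = b.parent
    end
  end

  pairs
end

old_tree = SketchNode.new("root", children: [
  SketchNode.new("a", children: [
    SketchNode.new("x", text: "1"),
    SketchNode.new("y", text: "2"),
  ]),
  SketchNode.new("b", text: "t"),
])
new_tree = SketchNode.new("root", children: [
  SketchNode.new("b", text: "t"),
  SketchNode.new("a", children: [
    SketchNode.new("x", text: "1"),
    SketchNode.new("y", text: "2"),
  ]),
  SketchNode.new("c"),
])

pairs = hash_match(old_tree, new_tree)
puts "matched pairs: #{pairs.size}"
```

In this example the two roots have different signatures because their children differ, so they are paired only through ancestor propagation from the matched "a" subtree. The new "c" node has no counterpart and stays unmatched.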

Features:

  • Hash-based exact matching (O(1) lookup)

  • Weight-based prioritization (largest subtrees first)

  • Automatic ancestor propagation

  • Handles both element and text nodes


Constructor Details

#initialize(tree1, tree2, options = {}) ⇒ HashMatcher

Initialize matcher with two trees

Parameters:

  • tree1 (TreeNode)

    First tree root

  • tree2 (TreeNode)

    Second tree root

  • options (Hash) (defaults to: {})

    Match options (includes text_content, attribute_order, etc.); attribute_order falls back to :ignore when unset



# File 'lib/canon/tree_diff/matchers/hash_matcher.rb', line 36

def initialize(tree1, tree2, options = {})
  @tree1 = tree1
  @tree2 = tree2
  @matching = Core::Matching.new
  @signature_map = {}
  @matched_tree1 = Set.new
  @matched_tree2 = Set.new
  @options = options
  @match_options = options # Store full match options for text comparison
  @attribute_comparator = Core::AttributeComparator.new(
    attribute_order: options[:attribute_order] || :ignore,
  )
end
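
The `options[:attribute_order] || :ignore` fallback in the constructor can be illustrated with a small stand-in. This is a sketch only: SketchAttributeComparator, its same? method and the :strict value are assumptions made for illustration, and the real Core::AttributeComparator may behave differently.

```ruby
# Hypothetical stand-in; not the library's Core::AttributeComparator.
class SketchAttributeComparator
  def initialize(attribute_order: :ignore)
    @attribute_order = attribute_order
  end

  # Compares two attribute hashes, optionally taking insertion order into account.
  def same?(attrs1, attrs2)
    if @attribute_order == :strict # assumed option value
      attrs1.to_a == attrs2.to_a   # pairs must appear in the same order
    else
      attrs1 == attrs2             # Hash#== does not consider insertion order
    end
  end
end

options = {} # caller supplied no :attribute_order
lenient = SketchAttributeComparator.new(
  attribute_order: options[:attribute_order] || :ignore,
)
strict = SketchAttributeComparator.new(attribute_order: :strict)

left  = { "id" => "1", "class" => "note" }
right = { "class" => "note", "id" => "1" }

puts lenient.same?(left, right)
puts strict.same?(left, right)
```

Because the fallback uses `||`, an explicit nil for :attribute_order is treated the same way as a missing key.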

Instance Attribute Details

#match_options ⇒ Object (readonly)

Returns the value of attribute match_options.



# File 'lib/canon/tree_diff/matchers/hash_matcher.rb', line 29

def match_options
  @match_options
end

#matching ⇒ Object (readonly)

Returns the value of attribute matching.



# File 'lib/canon/tree_diff/matchers/hash_matcher.rb', line 29

def matching
  @matching
end

#tree1 ⇒ Object (readonly)

Returns the value of attribute tree1.



# File 'lib/canon/tree_diff/matchers/hash_matcher.rb', line 29

def tree1
  @tree1
end

#tree2 ⇒ Object (readonly)

Returns the value of attribute tree2.



# File 'lib/canon/tree_diff/matchers/hash_matcher.rb', line 29

def tree2
  @tree2
end

Instance Method Details

#match ⇒ Core::Matching

Perform hash-based matching

Returns:

  • (Core::Matching)

    The matching accumulated between tree1 and tree2



# File 'lib/canon/tree_diff/matchers/hash_matcher.rb', line 53

def match
  build_signature_map

  tree2_nodes = collect_nodes(tree2).sort_by do |node|
    -Core::NodeWeight.for(node).value
  end

  tree2_nodes.each do |node2|
    next if @matched_tree2.include?(node2)

    match_node(node2)
  end

  @matching
end
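
One property of the heaviest-first ordering is worth noting: Ruby does not guarantee that sort_by is stable, so nodes of equal weight may be visited in an unspecified order. The sketch below shows one way to make ties deterministic by adding the traversal index as a secondary sort key. The (name, weight) pairs are made-up sample data standing in for nodes and their Core::NodeWeight values.

```ruby
# Made-up (name, weight) pairs in document order.
nodes = [["a", 1], ["b", 3], ["c", 1], ["d", 3]]

# Sort by descending weight, breaking ties by original position.
heaviest_first = nodes
  .each_with_index
  .sort_by { |(_name, weight), index| [-weight, index] }
  .map(&:first)

puts heaviest_first.map(&:first).join(", ")
```

Whether tie order matters depends on how match_node picks among candidates with the same signature; the listing above does not show that method, so this is offered as an option rather than a required fix.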