Class: Canon::Diff::PathBuilder

Inherits:
Object
Defined in:
lib/canon/diff/path_builder.rb

Overview

Builds canonical XPath-like paths from TreeNodes or raw nodes. Generates paths with ordinal indices that uniquely identify nodes regardless of the parsing library used (Nokogiri, Moxml, Canon, etc.).

This is library-agnostic because it operates on different node types:

  • TreeNodes (from semantic diff adapters) - uses the `label` attribute

  • Canon::Xml::Node (from DOM diff) - uses the `name` attribute

  • Nokogiri nodes (from HTML DOM diff) - uses the `name` method

Examples:

Build path for a TreeNode

path = PathBuilder.build(tree_node)
# => "/#document-fragment/div[0]/p[1]/span[2]"

Build path for a Canon::Xml::Node

path = PathBuilder.build(canon_node)
# => "/#document/root[0]/body[0]/p[1]"

Build path for a Nokogiri node

path = PathBuilder.build(nokogiri_node)
# => "/#document/div[0]/p[1]/span[2]"

Class Method Details

.build(node, format: :fragment) ⇒ String

Build canonical path from a node (TreeNode, Canon::Xml::Node, or Nokogiri)

Parameters:

  • node (Object)

    Node to build path for

  • format (Symbol) (defaults to: :fragment)

    Format (:document or :fragment); the method body listed here does not read this argument

Returns:

  • (String)

    Canonical path with ordinal indices



# File 'lib/canon/diff/path_builder.rb', line 31

def self.build(node, format: :fragment)
  return "" if node.nil?

  # Build path segments from root to node
  segments = build_segments(node)

  # Join segments with /
  "/#{segments.join('/')}"
end
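
The traverse-and-join behaviour of .build can be tried without any parser installed. The following is a minimal sketch, not Canon's code: SketchNode, sketch_index, and sketch_build are hypothetical names introduced here, and the node class only mimics the name/parent/children interface that the listed methods probe for. In this sketch the root segment also receives an index, because the listed .segment_for_node appends one to every node.

```ruby
# Hypothetical stand-in node; not a Canon, Nokogiri, or Moxml class.
class SketchNode
  attr_reader :name, :parent, :children

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @children = []
    parent.children << self if parent
  end
end

# Index of `node` among the siblings that share its name (0 for a root).
def sketch_index(node)
  return 0 unless node.parent

  node.parent.children.select { |s| s.name == node.name }.index(node) || 0
end

# Same shape as the listed .build: walk up to the root, then join with "/".
def sketch_build(node)
  return "" if node.nil?

  segments = []
  current = node
  while current
    segments.unshift("#{current.name}[#{sketch_index(current)}]")
    current = current.parent
  end
  "/#{segments.join('/')}"
end

root = SketchNode.new("root")
body = SketchNode.new("body", root)
SketchNode.new("p", body)            # first <p>, index 0
second_p = SketchNode.new("p", body) # second <p>, index 1

puts sketch_build(second_p)    # /root[0]/body[0]/p[1]
puts sketch_build(nil).inspect # ""
```

A plain class is used rather than a Struct so that the sibling lookup compares nodes by identity; Struct equality compares member values and could confuse two identical-looking siblings.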

.build_segments(tree_node) ⇒ Array<String>

Build path segments (node names with ordinal indices). Traverses from the node up to the root, prepending each segment so that the result runs from root to node. Handles both TreeNodes and raw nodes (Canon::Xml::Node, Nokogiri).

Parameters:

  • tree_node (Object)

    Node to build segments for

Returns:

  • (Array<String>)

    Path segments from root to node



# File 'lib/canon/diff/path_builder.rb', line 47

def self.build_segments(tree_node)
  segments = []
  current = tree_node
  max_depth = 1000 # Prevent infinite loops
  depth = 0

  # Traverse up to root
  while current && depth < max_depth
    segments.unshift(segment_for_node(current))

    # Move to parent if available
    break unless current.respond_to?(:parent)

    current = current.parent
    depth += 1
  end

  segments
end
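
The depth cap in the loop above matters only when a parent chain never terminates. As a rough illustration, the sketch below feeds the same loop shape a node that is its own parent. LoopNode, SoloNode, and capped_segments are hypothetical names, and the configurable max_depth keyword is an addition for the demonstration (the listed method hard-codes 1000).

```ruby
# Hypothetical node whose parent chain never ends (it is its own parent).
class LoopNode
  def name
    "loop"
  end

  def parent
    self
  end
end

# Hypothetical node with a name but no #parent method at all.
SoloNode = Struct.new(:name)

# Same loop shape as the listed .build_segments, with a configurable cap.
def capped_segments(node, max_depth: 1000)
  segments = []
  current = node
  depth = 0

  while current && depth < max_depth
    segments.unshift(current.name)
    break unless current.respond_to?(:parent)

    current = current.parent
    depth += 1
  end

  segments
end

puts capped_segments(LoopNode.new).length               # 1000
puts capped_segments(LoopNode.new, max_depth: 3).length # 3
puts capped_segments(SoloNode.new("solo")).inspect      # ["solo"]
```

The cap ends the walk silently, so a legitimately deeper tree would yield a truncated path rather than an error.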

.human_path(tree_node) ⇒ String

Build a human-readable path description. An alternative format that may be more useful for error messages. Handles both TreeNodes and raw nodes.

Parameters:

  • tree_node (Object)

    Node (TreeNode, Canon::Xml::Node, or Nokogiri)

Returns:

  • (String)

    Human-readable path



# File 'lib/canon/diff/path_builder.rb', line 137

def self.human_path(tree_node)
  segments = build_segments(tree_node)
  segments.join("")
end
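
As listed, .human_path concatenates the segments with an empty separator, whereas .build joins them with "/". The short sketch below shows the difference on made-up segment values; joined_without_separator is a hypothetical helper that only repeats the join call from the listing.

```ruby
# Hypothetical helper repeating the join used in the listed .human_path.
def joined_without_separator(segments)
  segments.join("")
end

segments = ["#document[0]", "div[0]", "p[1]"] # made-up segment values

puts joined_without_separator(segments) # #document[0]div[0]p[1]
puts "/#{segments.join('/')}"           # /#document[0]/div[0]/p[1]
```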

.ordinal_index(tree_node) ⇒ Integer

Get the ordinal index of a node among its siblings with the same label. Handles both TreeNodes (with Array children) and raw nodes (with NodeSet children).

Parameters:

  • tree_node (Object)

    Node (TreeNode, Canon::Xml::Node, or Nokogiri)

Returns:

  • (Integer)

    Zero-based ordinal index



# File 'lib/canon/diff/path_builder.rb', line 94

def self.ordinal_index(tree_node)
  # Defensive: return 0 if no parent or doesn't respond to parent
  return 0 unless tree_node.respond_to?(:parent)
  return 0 unless tree_node.parent

  # Check if parent has children
  return 0 unless tree_node.parent.respond_to?(:children)

  siblings = tree_node.parent.children
  return 0 unless siblings

  # Convert to array if it's a NodeSet (Nokogiri) or similar
  siblings = siblings.to_a unless siblings.is_a?(Array)

  # Get the label/name for comparison
  my_label = if tree_node.respond_to?(:label)
               tree_node.label
             elsif tree_node.respond_to?(:name)
               tree_node.name
             end

  return 0 unless my_label

  # Count siblings with same label that appear before this node
  same_label_siblings = siblings.select do |s|
    sibling_label = if s.respond_to?(:label)
                      s.label
                    elsif s.respond_to?(:name)
                      s.name
                    end
    sibling_label == my_label
  end

  # Find position in same-label siblings
  same_label_siblings.index(tree_node) || 0
end
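
The index counts only siblings that share the node's label, and it is zero-based (XPath positional predicates, by contrast, start at 1). A minimal sketch of that counting rule follows, assuming TreeNode-like objects that expose label, parent, and children. LabelNode and same_label_index are hypothetical names, not part of Canon.

```ruby
# Hypothetical TreeNode-like class exposing label, parent and children.
class LabelNode
  attr_reader :label, :parent, :children

  def initialize(label, parent = nil)
    @label = label
    @parent = parent
    @children = []
    parent.children << self if parent
  end
end

# Position of `node` among siblings with the same label; 0 when it has no parent.
def same_label_index(node)
  return 0 unless node.respond_to?(:parent) && node.parent

  siblings = node.parent.children.to_a
  siblings.select { |s| s.label == node.label }.index(node) || 0
end

div = LabelNode.new("div")
LabelNode.new("p", div)            # first child, first <p>
span = LabelNode.new("span", div)  # second child, first <span>
last_p = LabelNode.new("p", div)   # third child, second <p>

puts same_label_index(last_p) # 1
puts same_label_index(span)   # 0
puts same_label_index(div)    # 0
```

The third child gets index 1, not 2, because the intervening span has a different label and is not counted.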

.segment_for_node(tree_node) ⇒ String

Build a path segment for a single node. Returns the label with its ordinal index, e.g. "div[0]", "span[1]". Handles both TreeNodes (with label) and raw nodes (with name).

Parameters:

  • tree_node (Object)

    Node (TreeNode, Canon::Xml::Node, or Nokogiri)

Returns:

  • (String)

    Path segment with ordinal index



# File 'lib/canon/diff/path_builder.rb', line 73

def self.segment_for_node(tree_node)
  # Handle both TreeNodes (with label) and raw nodes (with name)
  label = if tree_node.respond_to?(:label)
            tree_node.label
          elsif tree_node.respond_to?(:name)
            tree_node.name
          else
            "unknown"
          end

  # Get ordinal index (position among siblings with same label)
  index = ordinal_index(tree_node)

  "#{label}[#{index}]"
end
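
The label lookup is duck-typed: label is preferred, name is the fallback, and "unknown" is used when the object offers neither. The sketch below isolates that precedence. WithLabel, WithName, WithBoth, and sketch_segment are hypothetical, and the index is passed in directly instead of being computed from siblings.

```ruby
# Hypothetical stand-ins for the node shapes the method distinguishes.
WithLabel = Struct.new(:label)
WithName  = Struct.new(:name)
WithBoth  = Struct.new(:label, :name)

# Same precedence as the listed .segment_for_node: label, then name, then "unknown".
def sketch_segment(node, index = 0)
  label = if node.respond_to?(:label)
            node.label
          elsif node.respond_to?(:name)
            node.name
          else
            "unknown"
          end

  "#{label}[#{index}]"
end

puts sketch_segment(WithLabel.new("div"))         # div[0]
puts sketch_segment(WithName.new("span"), 2)      # span[2]
puts sketch_segment(WithBoth.new("p", "ignored")) # p[0]
puts sketch_segment(Object.new)                   # unknown[0]
puts sketch_segment(WithLabel.new(nil))           # [0]
```

The last call shows an edge case of this precedence: an object that responds to label but returns nil produces an empty label, since the check is on respond_to? and not on the returned value.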