Class: Canon::Diff::PathBuilder

Inherits:
Object
Defined in:
lib/canon/diff/path_builder.rb

Overview

Builds canonical XPath-like paths from TreeNodes or raw nodes. Generates paths with ordinal indices to uniquely identify nodes regardless of the parsing library used (Nokogiri, Moxml, Canon, etc.).

This is library-agnostic because it operates on different node types:

  • TreeNodes (from semantic diff adapters) - uses `label` attribute

  • Canon::Xml::Node (from DOM diff) - uses `name` attribute

  • Nokogiri nodes (from HTML DOM diff) - uses `name` method
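
The label lookup described in the list above can be sketched with duck typing. This is only an illustration: the real class delegates to Canon::Comparison::NodeInspector, whose internals are not shown on this page, and `TreeNodeLike`, `DomNodeLike`, and `sketch_label` are hypothetical names introduced here.

```ruby
# Hypothetical stand-ins for the node types (not the real classes).
TreeNodeLike = Struct.new(:label) # exposes #label, like a TreeNode
DomNodeLike  = Struct.new(:name)  # exposes #name, like a DOM node

# Assumed dispatch: prefer #label, fall back to #name, then "unknown".
def sketch_label(node)
  label =
    if node.respond_to?(:label)
      node.label
    elsif node.respond_to?(:name)
      node.name
    end
  label || "unknown"
end

sketch_label(TreeNodeLike.new("div")) # => "div"
sketch_label(DomNodeLike.new("p"))    # => "p"
sketch_label(Object.new)              # => "unknown"
```

The "unknown" fallback mirrors the one visible in .node_label further down.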

Examples:

Build path for a TreeNode

path = PathBuilder.build(tree_node)
# => "/#document-fragment/div[0]/p[1]/span[2]"

Build path for a Canon::Xml::Node

path = PathBuilder.build(canon_node)
# => "/#document/root[0]/body[0]/p[1]"

Build path for a Nokogiri node

path = PathBuilder.build(nokogiri_node)
# => "/#document/div[0]/p[1]/span[2]"

Class Method Details

.build(node, format: :fragment) ⇒ String

Build canonical path from a node (TreeNode, Canon::Xml::Node, or Nokogiri)

Parameters:

  • node (Object)

    Node to build path for

  • format (Symbol) (defaults to: :fragment)

    Format (:document or :fragment). The method body listed for .build does not read this argument.

Returns:

  • (String)

    Canonical path with ordinal indices



# File 'lib/canon/diff/path_builder.rb', line 31

def self.build(node, format: :fragment)
  return "" if node.nil?

  # Build path segments from root to node
  segments = build_segments(node)

  # Join segments with /
  "/#{segments.join('/')}"
end

.build_segments(tree_node) ⇒ Array<String>

Build path segments (node names with ordinal indices). Traverses from the node up to the root, prepending each segment so the result is ordered root-first. Handles both TreeNodes and raw nodes (Canon::Xml::Node, Nokogiri).

Parameters:

  • tree_node (Object)

    Node to build segments for

Returns:

  • (Array<String>)

    Path segments from root to node



# File 'lib/canon/diff/path_builder.rb', line 47

def self.build_segments(tree_node)
  segments = []
  current = tree_node
  max_depth = 1000 # Prevent infinite loops
  depth = 0

  # Traverse up to root
  while current && depth < max_depth
    segments.unshift(segment_for_node(current))

    parent = node_parent(current)
    break unless parent

    current = parent
    depth += 1
  end

  segments
end
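
The upward traversal can be tried in isolation with a minimal sketch. `ChainNode` and `ancestor_names` are hypothetical names; the sketch reads `name` and `parent` directly, whereas the listing above goes through `segment_for_node` and `node_parent`.

```ruby
# Hypothetical node with only a name and a parent link.
ChainNode = Struct.new(:name, :parent)

# Walk from the node up to the root, prepending each name so the
# result is ordered root-first. The depth cap stops cyclic parent links.
def ancestor_names(node, max_depth: 1000)
  names = []
  current = node
  depth = 0
  while current && depth < max_depth
    names.unshift(current.name)
    current = current.parent
    depth += 1
  end
  names
end

root = ChainNode.new("root", nil)
body = ChainNode.new("body", root)
para = ChainNode.new("p", body)

ancestor_names(para) # => ["root", "body", "p"]
```

Using `unshift` keeps the array root-first as it is built, so no separate reverse step is needed.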

.human_path(tree_node) ⇒ String

Build a human-readable path description. An alternative format that may be more useful for error messages. Handles both TreeNodes and raw nodes.

Parameters:

  • tree_node (Object)

    Node (TreeNode, Canon::Xml::Node, or Nokogiri)

Returns:

  • (String)

    Human-readable path



# File 'lib/canon/diff/path_builder.rb', line 126

def self.human_path(tree_node)
  segments = build_segments(tree_node)
  segments.join("")
end

.node_children(node) ⇒ Object



# File 'lib/canon/diff/path_builder.rb', line 139

def self.node_children(node)
  Canon::Comparison::NodeInspector.children(node)
end

.node_label(node) ⇒ Object



# File 'lib/canon/diff/path_builder.rb', line 131

def self.node_label(node)
  Canon::Comparison::NodeInspector.name(node) || "unknown"
end

.node_parent(node) ⇒ Object



# File 'lib/canon/diff/path_builder.rb', line 135

def self.node_parent(node)
  Canon::Comparison::NodeInspector.parent(node)
end

.ordinal_index(tree_node) ⇒ Integer

Get the ordinal index of a node among its siblings with the same label. Handles both TreeNodes (with Array children) and raw nodes (with NodeSet children).

Parameters:

  • tree_node (Object)

    Node (TreeNode, Canon::Xml::Node, or Nokogiri)

Returns:

  • (Integer)

    Zero-based ordinal index



# File 'lib/canon/diff/path_builder.rb', line 97

def self.ordinal_index(tree_node)
  parent = node_parent(tree_node)
  return 0 unless parent

  siblings = node_children(parent)
  return 0 unless siblings

  # Convert to array if it's a NodeSet (Nokogiri) or similar
  siblings = siblings.to_a unless siblings.is_a?(Array)

  my_label = node_label(tree_node)
  return 0 unless my_label

  # Collect siblings with the same label, in document order
  same_label_siblings = siblings.select do |s|
    sibling_label = node_label(s)
    sibling_label == my_label
  end

  # Find position in same-label siblings
  same_label_siblings.index(tree_node) || 0
end
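
A minimal sketch of the same-label index, assuming a hypothetical `SiblingNode` class that registers itself with its parent on creation. A plain class is used rather than a Struct so that `Array#index`, which compares with `==`, matches by object identity instead of by field values.

```ruby
# Hypothetical node; a plain class, so == is object identity.
class SiblingNode
  attr_reader :name, :parent, :children

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @children = []
    parent.children << self if parent
  end
end

# Zero-based position of the node among siblings sharing its name.
def same_name_index(node)
  return 0 unless node.parent

  same_name = node.parent.children.select { |s| s.name == node.name }
  same_name.index(node) || 0
end

div  = SiblingNode.new("div")
p0   = SiblingNode.new("p", div)
span = SiblingNode.new("span", div)
p1   = SiblingNode.new("p", div)

same_name_index(p0)   # => 0
same_name_index(span) # => 0
same_name_index(p1)   # => 1
same_name_index(div)  # => 0 (no parent)
```

Because the index counts only siblings with a matching name, `span` gets index 0 even though it is the second child of `div`.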

.segment_for_node(tree_node) ⇒ String

Build the path segment for a single node. Returns the label with its ordinal index: "div[0]", "span[2]", etc. Handles both TreeNodes (with label) and raw nodes (with name).

Parameters:

  • tree_node (Object)

    Node (TreeNode, Canon::Xml::Node, or Nokogiri)

Returns:

  • (String)

    Path segment with ordinal index



# File 'lib/canon/diff/path_builder.rb', line 73

def self.segment_for_node(tree_node)
  label = node_label(tree_node)

  # Get ordinal index (position among siblings with same label)
  index = ordinal_index(tree_node)

  # For text nodes, use parent element name for clarity
  # e.g., instead of "/p/#text[0]" use "/p/text()[0]"
  parent = node_parent(tree_node)
  if ["text", "#text"].include?(label) && parent
    parent_name = node_label(parent)
    if parent_name && parent_name != "#document" && parent_name != "#document-fragment"
      return "#{parent_name}/text()[#{index}]"
    end
  end

  "#{label}[#{index}]"
end
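
The text-node special case can be exercised on its own with a small sketch. `sketch_segment` is a hypothetical helper that takes the label, index, and parent label as plain arguments instead of looking them up on a node.

```ruby
# Hypothetical helper mirroring the branching in segment_for_node.
def sketch_segment(label, index, parent_label = nil)
  containers = ["#document", "#document-fragment"]
  text_node = ["text", "#text"].include?(label)

  # Text nodes under a real element are written as "<parent>/text()[i]".
  if text_node && parent_label && !containers.include?(parent_label)
    return "#{parent_label}/text()[#{index}]"
  end

  "#{label}[#{index}]"
end

sketch_segment("span", 2, "p")                   # => "span[2]"
sketch_segment("#text", 0, "p")                  # => "p/text()[0]"
sketch_segment("#text", 0, "#document-fragment") # => "#text[0]"
```

Going by the listings alone, .build_segments also emits a separate segment for the parent element, so a full path to a text node would appear to contain the parent name twice (for example "p[0]/p/text()[0]"). Whether that is intended is not stated in the source.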