Module: Canon::Comparison::NodeInspector

Defined in:
lib/canon/comparison/node_inspector.rb

Overview

Single source of truth for cross-backend node type operations.

The comparison pipeline handles nodes from multiple sources:

  • Canon::Xml::Node (+ RootNode, ElementNode, TextNode, etc.): custom DOM built by the SAX builder and DataModel.

  • Canon::TreeDiff::Core::TreeNode: semantic tree diff nodes.

  • Backend-specific nodes (Nokogiri or Moxml): live parsed nodes.

Architecture: NodeInspector handles Canon-native types (Canon::Xml::Node, TreeNode) directly and delegates backend-specific queries to XmlParsing, which is where knowledge of Moxml and Nokogiri is meant to live. Two methods shown below are exceptions: comment_node? and parse_errors check Nokogiri classes directly when XmlBackend.nokogiri? is true.
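The dispatch shape described above can be sketched in isolation. This is a minimal illustration, not the library's code: SketchNativeNode, SketchBackendNode, SketchAdapter, and SketchInspector are hypothetical stand-ins for Canon::Xml::Node, a backend node, XmlParsing, and NodeInspector respectively.

```ruby
# Hypothetical stand-ins; these are NOT the real Canon classes.
SketchNativeNode  = Struct.new(:name, :node_type)
SketchBackendNode = Struct.new(:tag)

# Plays the role of XmlParsing: the only place that knows backend node shapes.
module SketchAdapter
  def self.name(node)
    node.tag
  end
end

# Plays the role of NodeInspector: native types first, then delegate.
module SketchInspector
  def self.name(node)
    return nil unless node                           # nil guard
    return node.name if node.is_a?(SketchNativeNode) # native type: answer directly

    SketchAdapter.name(node)                         # anything else: delegate
  end
end

SketchInspector.name(SketchNativeNode.new("para", :element)) # native path
SketchInspector.name(SketchBackendNode.new("div"))           # delegated path
SketchInspector.name(nil)                                    # guarded path
```

Callers only ever talk to the inspector, so adding a backend would, under this sketch's assumptions, mean touching the adapter alone.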

Class Method Details

.attribute_value(node, attr_name) ⇒ Object



# File 'lib/canon/comparison/node_inspector.rb', line 130

def self.attribute_value(node, attr_name)
  return nil unless node

  if node.is_a?(Canon::Xml::Nodes::ElementNode)
    attr = node.attribute_nodes.find { |a| a.name == attr_name.to_s }
    attr&.value
  elsif node.is_a?(Canon::Xml::Node)
    nil
  else
    XmlParsing.attribute_value(node, attr_name)
  end
end
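The ElementNode branch above normalizes attr_name with to_s, so symbols and strings should behave the same, and a missing attribute yields nil through the safe-navigation operator. A small sketch of that lookup, using hypothetical SketchAttr and SketchElement structs in place of the real node classes:

```ruby
# Hypothetical stand-ins for an element and its attribute nodes.
SketchAttr    = Struct.new(:name, :value)
SketchElement = Struct.new(:attribute_nodes)

module AttributeLookupSketch
  def self.value_of(element, attr_name)
    # to_s lets callers pass :id or "id" interchangeably.
    attr = element.attribute_nodes.find { |a| a.name == attr_name.to_s }
    attr&.value # nil when no attribute matched
  end
end

el = SketchElement.new([SketchAttr.new("id", "intro"), SketchAttr.new("lang", "en")])
AttributeLookupSketch.value_of(el, :id)      # symbol key
AttributeLookupSketch.value_of(el, "lang")   # string key
AttributeLookupSketch.value_of(el, :missing) # no match
```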

.children(node) ⇒ Object



# File 'lib/canon/comparison/node_inspector.rb', line 107

def self.children(node)
  return [] unless node
  return node.children if node.is_a?(Canon::Xml::Node)
  return node.children || [] if node.is_a?(Canon::TreeDiff::Core::TreeNode)

  XmlParsing.children(node)
end

.comment_node?(node) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/canon/comparison/node_inspector.rb', line 34

def self.comment_node?(node)
  return false unless node
  return node.node_type == :comment if node.is_a?(Canon::Xml::Node)

  if XmlBackend.nokogiri?
    return true if node.is_a?(Nokogiri::XML::Node) && node.comment?

    # HTML comments are parsed as TEXT nodes by Nokogiri
    if node.is_a?(Nokogiri::XML::Node) && node.text?
      text_stripped = text_content(node).to_s.strip.gsub("\\", "")
      return true if text_stripped.start_with?("<!--") && text_stripped.end_with?("-->")
    end
    false
  else
    XmlParsing.comment?(node)
  end
end
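The text-node fallback in the Nokogiri branch is a string heuristic: strip surrounding whitespace, drop backslashes, then check for comment delimiters. The sketch below isolates just that check so its edge cases are easier to see; CommentHeuristicSketch is a hypothetical name and no Nokogiri objects are involved.

```ruby
module CommentHeuristicSketch
  # Mirrors the string test applied to text nodes: true when the text,
  # after stripping whitespace and removing backslashes, is wrapped in
  # comment delimiters.
  def self.comment_like?(text)
    stripped = text.to_s.strip.gsub("\\", "")
    stripped.start_with?("<!--") && stripped.end_with?("-->")
  end
end

CommentHeuristicSketch.comment_like?("  <!-- note -->\n")  # padded comment text
CommentHeuristicSketch.comment_like?("<\\!-- escaped --\\>") # backslashes removed first
CommentHeuristicSketch.comment_like?("<!-- unterminated")  # missing closing delimiter
CommentHeuristicSketch.comment_like?("plain text")
```

Because the check is purely textual, any ordinary text that happens to begin with the opening delimiter and end with the closing one would also be classified as a comment.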

.document?(node) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/canon/comparison/node_inspector.rb', line 52

def self.document?(node)
  return node.node_type == :root if node.is_a?(Canon::Xml::Node)

  XmlParsing.document?(node)
end

.document_fragment?(node) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/canon/comparison/node_inspector.rb', line 58

def self.document_fragment?(node)
  return false unless node
  return false unless node.is_a?(Canon::Xml::Nodes::RootNode)

  node.fragment?
end

.element_node?(node) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/canon/comparison/node_inspector.rb', line 27

def self.element_node?(node)
  return false unless node
  return node.node_type == :element if node.is_a?(Canon::Xml::Node)

  XmlParsing.element?(node)
end

.name(node) ⇒ Object

— Node queries —



# File 'lib/canon/comparison/node_inspector.rb', line 91

def self.name(node)
  return nil unless node
  return node.name if node.is_a?(Canon::Xml::Node)
  return node.label if node.is_a?(Canon::TreeDiff::Core::TreeNode)

  XmlParsing.name(node)
end

.namespace_uri(node) ⇒ Object



# File 'lib/canon/comparison/node_inspector.rb', line 143

def self.namespace_uri(node)
  return nil unless node

  if node.is_a?(Canon::Xml::Node)
    node.is_a?(Canon::Xml::Nodes::ElementNode) ? node.namespace_uri : nil
  else
    XmlParsing.namespace_uri(node)
  end
end

.node_type(node) ⇒ Object



# File 'lib/canon/comparison/node_inspector.rb', line 122

def self.node_type(node)
  return nil unless node
  return node.node_type if node.is_a?(Canon::Xml::Node)
  return node.type&.to_sym if node.is_a?(Canon::TreeDiff::Core::TreeNode)

  XmlParsing.node_type(node)
end

.noise_dimension_for(node) ⇒ Object

— Noise classification —



# File 'lib/canon/comparison/node_inspector.rb', line 77

def self.noise_dimension_for(node)
  if whitespace_only_text?(node)
    :whitespace_adjacency
  elsif comment_node?(node)
    :comments
  end
end

.noise_node?(node) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/canon/comparison/node_inspector.rb', line 85

def self.noise_node?(node)
  !noise_dimension_for(node).nil?
end
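Taken together, noise_dimension_for and noise_node? form a two-step classification: the first returns a dimension symbol or nil (an if/elsif with no else evaluates to nil in Ruby), and the second reduces that to a Boolean. A self-contained sketch of the same shape, where NoiseSketchNode and its kind field are hypothetical simplifications of the real type predicates:

```ruby
# Hypothetical stand-in node: kind is :text, :comment, or :element.
NoiseSketchNode = Struct.new(:kind, :text)

module NoiseSketch
  def self.dimension_for(node)
    if node.kind == :text && !node.text.empty? && node.text.strip.empty?
      :whitespace_adjacency
    elsif node.kind == :comment
      :comments
    end # no else branch: every other node yields nil
  end

  def self.noise?(node)
    !dimension_for(node).nil?
  end
end

indent  = NoiseSketchNode.new(:text, "\n  ")
comment = NoiseSketchNode.new(:comment, "todo")
content = NoiseSketchNode.new(:text, "Hello")

NoiseSketch.dimension_for(indent)  # whitespace-only text
NoiseSketch.dimension_for(comment) # comment
NoiseSketch.noise?(content)        # real content is not noise
```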

.parent(node) ⇒ Object



# File 'lib/canon/comparison/node_inspector.rb', line 99

def self.parent(node)
  return nil unless node
  return node.parent if node.is_a?(Canon::Xml::Node)
  return node.parent if node.is_a?(Canon::TreeDiff::Core::TreeNode)

  XmlParsing.parent(node)
end

.parse_errors(node) ⇒ Object



# File 'lib/canon/comparison/node_inspector.rb', line 153

def self.parse_errors(node)
  return [] if node.nil?
  return Array(node.parse_errors).map(&:to_s) if node.is_a?(Canon::Xml::Node)

  if XmlBackend.nokogiri?
    if node.is_a?(Nokogiri::XML::Document) || node.is_a?(Nokogiri::HTML5::Document)
      Array(node.errors).map(&:to_s)
    else
      []
    end
  else
    []
  end
end
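Both error-returning branches above use the same normalization, Array(...).map(&:to_s), which guarantees an array of strings whether the source is nil, a single object, or a list. A brief sketch of that idiom alone; ErrorNormalizerSketch is a hypothetical name:

```ruby
module ErrorNormalizerSketch
  # Kernel#Array turns nil into [], wraps a single object, and leaves
  # arrays alone; map(&:to_s) then stringifies each entry.
  def self.normalize(errors)
    Array(errors).map(&:to_s)
  end
end

ErrorNormalizerSketch.normalize(nil)
ErrorNormalizerSketch.normalize(StandardError.new("unclosed tag"))
ErrorNormalizerSketch.normalize(["bad entity", :warning])
```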

.text_content(node) ⇒ Object



# File 'lib/canon/comparison/node_inspector.rb', line 115

def self.text_content(node)
  return node.value.to_s if node.is_a?(Canon::Xml::Nodes::TextNode)
  return node.text_content.to_s if node.is_a?(Canon::Xml::Node)

  XmlParsing.text_content(node).to_s
end
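The order of the two is_a? checks in text_content matters if TextNode is a subclass of Canon::Xml::Node, which the ordering suggests but the source shown here does not confirm. Under that assumption, the more specific class has to be tested first or its branch would never run. A sketch with hypothetical classes TextSketchBase and TextSketchLeaf:

```ruby
# Hypothetical classes; the subclass relationship is an assumption.
class TextSketchBase
  def text_content
    "from text_content"
  end
end

class TextSketchLeaf < TextSketchBase
  attr_reader :value

  def initialize(value)
    super()
    @value = value
  end
end

module TextSketch
  def self.text_of(node)
    # Specific subclass first: a leaf is also a base, so the order is significant.
    return node.value.to_s if node.is_a?(TextSketchLeaf)
    return node.text_content.to_s if node.is_a?(TextSketchBase)

    "" # placeholder for the backend-delegated path
  end
end

TextSketch.text_of(TextSketchLeaf.new("hi")) # uses value
TextSketch.text_of(TextSketchBase.new)       # uses text_content
TextSketch.text_of(TextSketchLeaf.new(nil))  # to_s turns nil into ""
```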

.text_node?(node) ⇒ Boolean

— Type predicates —

Returns:

  • (Boolean)


# File 'lib/canon/comparison/node_inspector.rb', line 20

def self.text_node?(node)
  return false unless node
  return node.node_type == :text if node.is_a?(Canon::Xml::Node)

  XmlParsing.text_node?(node)
end

.whitespace_only_text?(node) ⇒ Boolean

True when node is a text node whose content is whitespace-only. Empty-string text nodes return false — those represent genuine empty-vs-content asymmetry, not pretty-print indentation.

Returns:

  • (Boolean)


# File 'lib/canon/comparison/node_inspector.rb', line 68

def self.whitespace_only_text?(node)
  return false unless text_node?(node)

  text = text_content(node)
  !text.empty? && text.strip.empty?
end
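The string half of this predicate has a couple of edge cases worth spelling out. The sketch below applies the same expression to plain strings; WhitespaceSketch is a hypothetical name, and the type check on the node is left out.

```ruby
module WhitespaceSketch
  # Same expression as the method above, applied to a bare string.
  def self.whitespace_only?(text)
    !text.empty? && text.strip.empty?
  end
end

WhitespaceSketch.whitespace_only?("\n    ") # indentation between elements
WhitespaceSketch.whitespace_only?("")       # empty string is excluded on purpose
WhitespaceSketch.whitespace_only?(" a ")    # real content with padding
WhitespaceSketch.whitespace_only?("\u00A0") # non-breaking space
```

String#strip removes only ASCII whitespace and null characters, so a text node containing just a non-breaking space (U+00A0) is not treated as whitespace-only by this check.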