Module: Canon::Comparison::NodeInspector
Defined in: lib/canon/comparison/node_inspector.rb
Overview
Single source of truth for cross-backend node type operations.
The comparison pipeline handles nodes from multiple sources:
- Canon::Xml::Node (plus RootNode, ElementNode, TextNode, etc.): custom DOM built by the SAX builder and DataModel.
- Canon::TreeDiff::Core::TreeNode: semantic tree diff nodes.
- Backend-specific nodes (Nokogiri or Moxml): live parsed nodes.
All type dispatch uses backend branching (`if XmlBackend.nokogiri?`) rather than `case/when` with constant references. This prevents a NameError when Nokogiri constants are undefined under Opal.
Every node query in the codebase should go through this module. Do not create private dispatch methods in consumers.
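The difference between the two dispatch styles can be sketched in plain Ruby. This is a minimal illustration, not the module's code: `BackendSketch`, `OptionalBackend`, `PlainText`, and both helper methods are hypothetical stand-ins, and `OptionalBackend` is deliberately never defined so that it plays the role of an unloaded Nokogiri.

```ruby
# Hypothetical backend switch; OptionalBackend is intentionally never defined.
module BackendSketch
  def self.optional_backend?
    defined?(::OptionalBackend) ? true : false
  end
end

# Hypothetical stand-in for a node class that is always available.
class PlainText; end

# case/when evaluates each `when` constant in order until one matches,
# so the undefined constant raises NameError as soon as the method runs.
def text_via_case?(node)
  case node
  when OptionalBackend::Text then true
  when PlainText then true
  else false
  end
end

# Branching on the backend first keeps the undefined constant in a branch
# that is never executed, so no NameError is raised.
def text_via_branch?(node)
  if BackendSketch.optional_backend?
    node.is_a?(OptionalBackend::Text) || node.is_a?(PlainText)
  else
    node.is_a?(PlainText)
  end
end

text_via_branch?(PlainText.new) # true, no constant lookup on OptionalBackend
```

Calling `text_via_case?(PlainText.new)` in the same setup raises NameError, which is the failure mode the overview describes.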
Constant Summary
- NOKOGIRI_TEXT_TYPE = defined?(Nokogiri::XML::Node::TEXT_NODE) ? Nokogiri::XML::Node::TEXT_NODE : 3
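The `defined?` guard used for this constant can be tried in isolation. In the sketch below, `OptionalLib` and `LoadedLib` are hypothetical libraries invented for illustration; the literal fallback `3` mirrors the DOM node type code for text nodes, which Nokogiri also uses.

```ruby
# OptionalLib is hypothetical and deliberately never loaded. defined? returns
# nil for the unresolved constant instead of raising NameError, so the
# literal fallback is chosen.
FALLBACK_TEXT_TYPE = defined?(OptionalLib::Node::TEXT_NODE) ? OptionalLib::Node::TEXT_NODE : 3

# LoadedLib is hypothetical and present, so its own constant is chosen.
# The value 7 is arbitrary, picked only to make the two cases distinguishable.
module LoadedLib
  module Node
    TEXT_NODE = 7
  end
end
LOADED_TEXT_TYPE = defined?(LoadedLib::Node::TEXT_NODE) ? LoadedLib::Node::TEXT_NODE : 3
```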
Class Method Summary
- .attribute_value(node, attr_name) ⇒ Object
  Unified attribute value access.
- .children(node) ⇒ Object
  Unified children access across all node types.
- .comment_node?(node) ⇒ Boolean
- .document?(node) ⇒ Boolean
- .document_fragment?(node) ⇒ Boolean
- .element_node?(node) ⇒ Boolean
- .name(node) ⇒ Object
  Unified node name extraction across all node types.
- .namespace_uri(node) ⇒ Object
  Unified namespace URI access.
- .node_type(node) ⇒ Object
  Unified node type that always returns a symbol.
- .noise_dimension_for(node) ⇒ Object
  Noise classification: the noise dimension node belongs to, or nil.
- .noise_node?(node) ⇒ Boolean
- .parent(node) ⇒ Object
  Unified parent access across all node types.
- .parent_of(node) ⇒ Object
  Deprecated: use NodeInspector.parent instead.
- .parse_errors(node) ⇒ Object
  Extract parse-time errors carried on a node or its owning document.
- .text_content(node) ⇒ Object
  Extract the text content of node as a String.
- .text_node?(node) ⇒ Boolean
  Type predicate: true when node is a text node.
- .whitespace_only_text?(node) ⇒ Boolean
  True when node is a text node whose content is whitespace-only.
Class Method Details
.attribute_value(node, attr_name) ⇒ Object
Unified attribute value access.
# File 'lib/canon/comparison/node_inspector.rb', line 158
def self.attribute_value(node, attr_name)
  return nil unless node

  if node.is_a?(Canon::Xml::Nodes::ElementNode)
    attr = node.attribute_nodes.find { |a| a.name == attr_name.to_s }
    attr&.value
  elsif node.is_a?(Canon::Xml::Node)
    nil
  else
    XmlParsing.attribute_value(node, attr_name)
  end
end
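The ElementNode branch compares attribute names against `attr_name.to_s`, so symbols and strings behave the same. A minimal sketch of that lookup, using hypothetical `SketchAttr` and `SketchElement` structs in place of the real Canon classes:

```ruby
# Hypothetical stand-ins for an attribute node and an element node.
SketchAttr = Struct.new(:name, :value)
SketchElement = Struct.new(:attribute_nodes)

# Mirrors the lookup in the ElementNode branch: normalise the requested name
# to a String, find the first matching attribute, return its value or nil.
def attribute_value_sketch(element, attr_name)
  return nil unless element

  attr = element.attribute_nodes.find { |a| a.name == attr_name.to_s }
  attr&.value
end

element = SketchElement.new([SketchAttr.new("id", "intro"), SketchAttr.new("class", "note")])

attribute_value_sketch(element, :id)      # "intro"
attribute_value_sketch(element, "class")  # "note"
attribute_value_sketch(element, :missing) # nil
attribute_value_sketch(nil, :id)          # nil
```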
.children(node) ⇒ Object
Unified children access across all node types.
# File 'lib/canon/comparison/node_inspector.rb', line 122
def self.children(node)
  return [] unless node
  return node.children if node.is_a?(Canon::Xml::Node)
  return node.children || [] if node.is_a?(Canon::TreeDiff::Core::TreeNode)

  XmlParsing.children(node)
end
.comment_node?(node) ⇒ Boolean
# File 'lib/canon/comparison/node_inspector.rb', line 46
def self.comment_node?(node)
  return false unless node
  return node.node_type == :comment if node.is_a?(Canon::Xml::Node)

  if XmlBackend.nokogiri?
    return true if node.is_a?(Nokogiri::XML::Node) && node.comment?

    # HTML comments are parsed as TEXT nodes by Nokogiri
    if node.is_a?(Nokogiri::XML::Node) && node.text?
      text_stripped = text_content(node).to_s.strip.gsub("\\", "")
      return true if text_stripped.start_with?("<!--") && text_stripped.end_with?("-->")
    end
    false
  else
    node.is_a?(Moxml::Comment)
  end
end
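The text-node fallback in the Nokogiri branch is a string heuristic, and it can be exercised on plain strings. The helper name `comment_like_text?` is hypothetical; the body follows the same strip, backslash removal, and delimiter checks as the listing above.

```ruby
# Hypothetical helper: true when the text, once trimmed and stripped of
# backslashes, starts with "<!--" and ends with "-->".
def comment_like_text?(text)
  stripped = text.to_s.strip.gsub("\\", "")
  stripped.start_with?("<!--") && stripped.end_with?("-->")
end

comment_like_text?("  <!-- note -->\n")   # true
comment_like_text?("<\\!-- note --\\>")   # true, backslashes are removed first
comment_like_text?("plain text")          # false
comment_like_text?("<!-- unterminated")   # false
```

Because the check looks only at the delimiters, any text that happens to be wrapped in `<!--` and `-->` is treated as a comment.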
.document?(node) ⇒ Boolean
# File 'lib/canon/comparison/node_inspector.rb', line 64
def self.document?(node)
  return node.node_type == :root if node.is_a?(Canon::Xml::Node)

  XmlParsing.document?(node)
end
.document_fragment?(node) ⇒ Boolean
# File 'lib/canon/comparison/node_inspector.rb', line 70
def self.document_fragment?(node)
  return false unless node
  return false unless node.is_a?(Canon::Xml::Nodes::RootNode)

  node.fragment?
end
.element_node?(node) ⇒ Boolean
# File 'lib/canon/comparison/node_inspector.rb', line 35
def self.element_node?(node)
  return false unless node
  return node.node_type == :element if node.is_a?(Canon::Xml::Node)

  if XmlBackend.nokogiri?
    node.is_a?(Nokogiri::XML::Element) || node.is_a?(Moxml::Element)
  else
    node.is_a?(Moxml::Element)
  end
end
.name(node) ⇒ Object
Unified node name extraction across all node types.
# File 'lib/canon/comparison/node_inspector.rb', line 104
def self.name(node)
  return nil unless node
  return node.name if node.is_a?(Canon::Xml::Node)
  return node.label if node.is_a?(Canon::TreeDiff::Core::TreeNode)

  XmlParsing.name(node)
end
.namespace_uri(node) ⇒ Object
Unified namespace URI access.
# File 'lib/canon/comparison/node_inspector.rb', line 172
def self.namespace_uri(node)
  return nil unless node

  if node.is_a?(Canon::Xml::Node)
    node.is_a?(Canon::Xml::Nodes::ElementNode) ? node.namespace_uri : nil
  else
    XmlParsing.namespace_uri(node)
  end
end
.node_type(node) ⇒ Object
Unified node type, normalised to a symbol. Returns nil when node is nil or unrecognised.
# File 'lib/canon/comparison/node_inspector.rb', line 146
def self.node_type(node)
  return nil unless node
  return node.node_type if node.is_a?(Canon::Xml::Node)

  if node.is_a?(Canon::TreeDiff::Core::TreeNode)
    node.type&.to_sym
  else
    XmlParsing.node_type(node)
  end
end
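The TreeNode branch relies on `&.to_sym` to normalise the type while tolerating a missing one. A small sketch of that behaviour, with a hypothetical `SketchTreeNode` struct standing in for Canon::TreeDiff::Core::TreeNode:

```ruby
# Hypothetical stand-in exposing only a `type` attribute.
SketchTreeNode = Struct.new(:type)

# Safe navigation returns nil when type is nil; otherwise String and Symbol
# types both come back as a Symbol.
def type_symbol(node)
  return nil unless node

  node.type&.to_sym
end

type_symbol(SketchTreeNode.new("element")) # :element
type_symbol(SketchTreeNode.new(:text))     # :text
type_symbol(SketchTreeNode.new(nil))       # nil
type_symbol(nil)                           # nil
```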
.noise_dimension_for(node) ⇒ Object
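Noise classification. Returns :whitespace_adjacency for whitespace-only text nodes, :comments for comment nodes, and nil for anything else.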
# File 'lib/canon/comparison/node_inspector.rb', line 89
def self.noise_dimension_for(node)
  if whitespace_only_text?(node)
    :whitespace_adjacency
  elsif comment_node?(node)
    :comments
  end
end
.noise_node?(node) ⇒ Boolean
# File 'lib/canon/comparison/node_inspector.rb', line 97
def self.noise_node?(node)
  !noise_dimension_for(node).nil?
end
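The two noise methods work as a pair: one classifies, the other reduces the classification to a boolean. The sketch below reproduces that shape on a hypothetical `SketchNode` struct with a `kind` and a `text`; the real methods delegate to `whitespace_only_text?` and `comment_node?` instead of inspecting a struct.

```ruby
# Hypothetical node with a kind (:text, :comment, :element) and text content.
SketchNode = Struct.new(:kind, :text)

# Classification: an `if/elsif` with no `else` evaluates to nil when
# neither branch matches, which is what marks a node as "not noise".
def noise_dimension_sketch(node)
  if node.kind == :text && !node.text.empty? && node.text.strip.empty?
    :whitespace_adjacency
  elsif node.kind == :comment
    :comments
  end
end

def noise_sketch?(node)
  !noise_dimension_sketch(node).nil?
end

nodes = [
  SketchNode.new(:text, "\n  "),      # indentation between elements
  SketchNode.new(:comment, " todo "),
  SketchNode.new(:element, ""),
  SketchNode.new(:text, "")           # empty text is not treated as noise
]

dimensions = nodes.map { |n| noise_dimension_sketch(n) }
significant = nodes.reject { |n| noise_sketch?(n) }
```

Here `dimensions` is `[:whitespace_adjacency, :comments, nil, nil]` and `significant` keeps the last two nodes.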
.parent(node) ⇒ Object
Unified parent access across all node types.
# File 'lib/canon/comparison/node_inspector.rb', line 113
def self.parent(node)
  return nil unless node
  return node.parent if node.is_a?(Canon::Xml::Node)
  return node.parent if node.is_a?(Canon::TreeDiff::Core::TreeNode)

  XmlParsing.parent(node)
end
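A nil-safe parent accessor makes it straightforward to walk up a tree until the root is reached. The following sketch shows one way a consumer might collect ancestors; `SketchTreeItem` and `ancestors_of` are hypothetical and call `parent` on the struct directly, where real code would go through NodeInspector.parent.

```ruby
# Hypothetical node that only knows its name and its parent.
SketchTreeItem = Struct.new(:name, :parent)

# Collects ancestors from nearest to farthest, stopping when parent is nil.
def ancestors_of(node)
  chain = []
  current = node&.parent
  while current
    chain << current
    current = current.parent
  end
  chain
end

root = SketchTreeItem.new("root", nil)
section = SketchTreeItem.new("section", root)
para = SketchTreeItem.new("p", section)

ancestors_of(para).map(&:name) # ["section", "root"]
ancestors_of(root)             # []
ancestors_of(nil)              # []
```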
.parent_of(node) ⇒ Object
Deprecated: use NodeInspector.parent instead.
# File 'lib/canon/comparison/node_inspector.rb', line 199
def self.parent_of(node)
  parent(node)
end
.parse_errors(node) ⇒ Object
Extract parse-time errors carried on a node or its owning document.
# File 'lib/canon/comparison/node_inspector.rb', line 183
def self.parse_errors(node)
  return [] if node.nil?
  return Array(node.parse_errors).map(&:to_s) if node.is_a?(Canon::Xml::Node)

  if XmlBackend.nokogiri?
    if node.is_a?(Nokogiri::XML::Document) || node.is_a?(Nokogiri::HTML5::Document)
      Array(node.errors).map(&:to_s)
    else
      []
    end
  else
    []
  end
end
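Both error-collecting branches use `Array(...).map(&:to_s)`, which turns nil, a single object, or a list of error objects into an array of strings. A quick sketch of that normalisation, with `normalize_errors` as a hypothetical helper name:

```ruby
# Kernel#Array maps nil to [], leaves arrays as they are, and wraps a single
# object; map(&:to_s) then reduces each entry to its message text.
def normalize_errors(errors)
  Array(errors).map(&:to_s)
end

normalize_errors(nil)                                  # []
normalize_errors([StandardError.new("bad tag")])       # ["bad tag"]
normalize_errors(StandardError.new("unclosed element")) # ["unclosed element"]
```

Callers therefore always receive an Array of Strings and never need to check for nil.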
.text_content(node) ⇒ Object
Extract the text content of node as a String.
# File 'lib/canon/comparison/node_inspector.rb', line 131
def self.text_content(node)
  case node
  when Canon::Xml::Nodes::TextNode
    node.value.to_s
  when Canon::Xml::Node
    node.text_content.to_s
  when Moxml::Text
    node.content.to_s
  else
    XmlParsing.text_content(node).to_s
  end
end
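The order of the `when` clauses matters if TextNode is a subclass of Canon::Xml::Node, which the listing suggests but this page does not state outright. Under that assumption, the more specific class has to be tested first, as in this sketch with hypothetical `SketchBase` and `SketchText` classes:

```ruby
# Hypothetical base node class.
class SketchBase
  def text_content
    "from text_content"
  end
end

# Hypothetical text node subclass with its own accessor.
class SketchText < SketchBase
  def value
    "from value"
  end
end

# case/when takes the first matching clause, so the subclass must come
# before its parent or the parent clause would capture text nodes too.
def text_of(node)
  case node
  when SketchText then node.value.to_s
  when SketchBase then node.text_content.to_s
  else node.to_s
  end
end

text_of(SketchText.new) # "from value"
text_of(SketchBase.new) # "from text_content"
text_of(nil)            # ""
```

Every branch ends in `to_s`, so the result is a String even for nil input.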
.text_node?(node) ⇒ Boolean
Type predicate. True when node is a text node in any supported backend.
# File 'lib/canon/comparison/node_inspector.rb', line 24
def self.text_node?(node)
  return false unless node
  return node.node_type == :text if node.is_a?(Canon::Xml::Node)

  if XmlBackend.nokogiri?
    node.is_a?(Nokogiri::XML::Text) || node.is_a?(Moxml::Text)
  else
    node.is_a?(Moxml::Text)
  end
end
.whitespace_only_text?(node) ⇒ Boolean
True when node is a text node whose content is whitespace-only. Empty-string text nodes return false: they represent a genuine empty-versus-content asymmetry, not pretty-print indentation.
# File 'lib/canon/comparison/node_inspector.rb', line 80
def self.whitespace_only_text?(node)
  return false unless text_node?(node)

  text = text_content(node)
  !text.empty? && text.strip.empty?
end
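The string test at the end of this method can be checked on its own. The helper name `whitespace_only?` is hypothetical; the expression is the one from the listing. Note that String#strip only removes ASCII whitespace and null characters, so a non-breaking space counts as content.

```ruby
# Same expression as the final line of the listing, applied to a String.
def whitespace_only?(text)
  !text.empty? && text.strip.empty?
end

whitespace_only?("")        # false, empty is not whitespace-only
whitespace_only?("\n  ")    # true, typical pretty-print indentation
whitespace_only?("\t\r\n")  # true
whitespace_only?(" a ")     # false
whitespace_only?("\u00A0")  # false, strip leaves the non-breaking space
```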