Module: Canon::Comparison::NodeInspector
Defined in: lib/canon/comparison/node_inspector.rb
Overview
Single source of truth for cross-backend node type operations.
The comparison pipeline handles nodes from multiple sources:
- Canon::Xml::Node (+ RootNode, ElementNode, TextNode, etc.): custom DOM built by the SAX builder and DataModel.
- Canon::TreeDiff::Core::TreeNode: semantic tree diff nodes.
- Backend-specific nodes (Nokogiri or Moxml): live parsed nodes.
Architecture: NodeInspector handles Canon-native types (Canon::Xml::Node, TreeNode) directly, then delegates backend-specific queries to XmlParsing, which is meant to be the only place that knows about Moxml/Nokogiri. Two methods in the source listed below, .comment_node? and .parse_errors, still reference Nokogiri constants directly when the Nokogiri backend is active.
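The dispatch order described above can be sketched with stand-in classes. This is a minimal illustration, not the gem's code: Sketch::NativeNode, Sketch::TreeNode, Sketch::BackendNode, and Sketch::BackendAdapter are hypothetical names that play the roles of Canon::Xml::Node, TreeNode, a backend node, and XmlParsing respectively.

```ruby
# Hypothetical stand-ins; the real classes live in the canon gem.
module Sketch
  NativeNode  = Struct.new(:name)   # plays the role of Canon::Xml::Node
  TreeNode    = Struct.new(:label)  # plays the role of Canon::TreeDiff::Core::TreeNode
  BackendNode = Struct.new(:tag)    # plays the role of a Nokogiri/Moxml node

  # Plays the role of XmlParsing: the only module that knows backend details.
  module BackendAdapter
    def self.name(node)
      node.tag
    end
  end

  module Inspector
    # Native types are answered directly; everything else is delegated.
    def self.name(node)
      return nil unless node
      return node.name if node.is_a?(NativeNode)
      return node.label if node.is_a?(TreeNode)

      BackendAdapter.name(node)
    end
  end
end

Sketch::Inspector.name(Sketch::NativeNode.new("root"))  # => "root"
Sketch::Inspector.name(Sketch::BackendNode.new("div"))  # => "div"
```

Keeping the fallback in one adapter means a new backend only needs changes in that adapter, assuming every inspector method follows the same shape.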
Class Method Summary
- .attribute_value(node, attr_name) ⇒ Object
- .children(node) ⇒ Object
- .comment_node?(node) ⇒ Boolean
- .document?(node) ⇒ Boolean
- .document_fragment?(node) ⇒ Boolean
- .element_node?(node) ⇒ Boolean
- .name(node) ⇒ Object (node queries)
- .namespace_uri(node) ⇒ Object
- .node_type(node) ⇒ Object
- .noise_dimension_for(node) ⇒ Object (noise classification)
- .noise_node?(node) ⇒ Boolean
- .parent(node) ⇒ Object
- .parse_errors(node) ⇒ Object
- .text_content(node) ⇒ Object
- .text_node?(node) ⇒ Boolean (type predicates)
- .whitespace_only_text?(node) ⇒ Boolean: true when node is a text node whose content is whitespace-only.
Class Method Details
.attribute_value(node, attr_name) ⇒ Object
    # File 'lib/canon/comparison/node_inspector.rb', line 130
    def self.attribute_value(node, attr_name)
      return nil unless node

      if node.is_a?(Canon::Xml::Nodes::ElementNode)
        attr = node.attribute_nodes.find { |a| a.name == attr_name.to_s }
        attr&.value
      elsif node.is_a?(Canon::Xml::Node)
        nil
      else
        XmlParsing.attribute_value(node, attr_name)
      end
    end
.children(node) ⇒ Object
    # File 'lib/canon/comparison/node_inspector.rb', line 107
    def self.children(node)
      return [] unless node
      return node.children if node.is_a?(Canon::Xml::Node)
      return node.children || [] if node.is_a?(Canon::TreeDiff::Core::TreeNode)

      XmlParsing.children(node)
    end
.comment_node?(node) ⇒ Boolean
    # File 'lib/canon/comparison/node_inspector.rb', line 34
    def self.comment_node?(node)
      return false unless node
      return node.node_type == :comment if node.is_a?(Canon::Xml::Node)

      if XmlBackend.nokogiri?
        return true if node.is_a?(Nokogiri::XML::Node) && node.comment?

        # HTML comments are parsed as TEXT nodes by Nokogiri
        if node.is_a?(Nokogiri::XML::Node) && node.text?
          text_stripped = text_content(node).to_s.strip.gsub("\\", "")
          return true if text_stripped.start_with?("<!--") && text_stripped.end_with?("-->")
        end
        false
      else
        XmlParsing.comment?(node)
      end
    end
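The text-node fallback in comment_node? comes down to a string check. The sketch below isolates that check on plain strings so its behavior is easy to see; looks_like_comment? is a hypothetical helper name, not part of the module.

```ruby
# Hypothetical helper reproducing only the string test used for text nodes.
def looks_like_comment?(text)
  # Strip surrounding whitespace and drop backslashes before matching delimiters.
  stripped = text.to_s.strip.gsub("\\", "")
  stripped.start_with?("<!--") && stripped.end_with?("-->")
end

looks_like_comment?("  <!-- note -->\n")  # => true
looks_like_comment?("<\\!-- note --\\>")  # => true (backslashes removed first)
looks_like_comment?("plain text")         # => false
looks_like_comment?(nil)                  # => false
```

Because this is a textual heuristic, any text node whose content happens to begin with `<!--` and end with `-->` is classified as a comment.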
.document?(node) ⇒ Boolean
    # File 'lib/canon/comparison/node_inspector.rb', line 52
    def self.document?(node)
      return node.node_type == :root if node.is_a?(Canon::Xml::Node)

      XmlParsing.document?(node)
    end
.document_fragment?(node) ⇒ Boolean
    # File 'lib/canon/comparison/node_inspector.rb', line 58
    def self.document_fragment?(node)
      return false unless node
      return false unless node.is_a?(Canon::Xml::Nodes::RootNode)

      node.fragment?
    end
.element_node?(node) ⇒ Boolean
    # File 'lib/canon/comparison/node_inspector.rb', line 27
    def self.element_node?(node)
      return false unless node
      return node.node_type == :element if node.is_a?(Canon::Xml::Node)

      XmlParsing.element?(node)
    end
.name(node) ⇒ Object
— Node queries —
    # File 'lib/canon/comparison/node_inspector.rb', line 91
    def self.name(node)
      return nil unless node
      return node.name if node.is_a?(Canon::Xml::Node)
      return node.label if node.is_a?(Canon::TreeDiff::Core::TreeNode)

      XmlParsing.name(node)
    end
.namespace_uri(node) ⇒ Object
    # File 'lib/canon/comparison/node_inspector.rb', line 143
    def self.namespace_uri(node)
      return nil unless node

      if node.is_a?(Canon::Xml::Node)
        node.is_a?(Canon::Xml::Nodes::ElementNode) ? node.namespace_uri : nil
      else
        XmlParsing.namespace_uri(node)
      end
    end
.node_type(node) ⇒ Object
    # File 'lib/canon/comparison/node_inspector.rb', line 122
    def self.node_type(node)
      return nil unless node
      return node.node_type if node.is_a?(Canon::Xml::Node)
      return node.type&.to_sym if node.is_a?(Canon::TreeDiff::Core::TreeNode)

      XmlParsing.node_type(node)
    end
.noise_dimension_for(node) ⇒ Object
— Noise classification —
    # File 'lib/canon/comparison/node_inspector.rb', line 77
    def self.noise_dimension_for(node)
      if whitespace_only_text?(node)
        :whitespace_adjacency
      elsif comment_node?(node)
        :comments
      end
    end
.noise_node?(node) ⇒ Boolean
    # File 'lib/canon/comparison/node_inspector.rb', line 85
    def self.noise_node?(node)
      !noise_dimension_for(node).nil?
    end
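One way to read the two noise methods together: noise_dimension_for returns a symbol naming the kind of noise, or nil, and noise_node? simply asks whether a dimension was found. The sketch below mirrors that shape with a hypothetical NoiseSketch::Node struct in place of real nodes. The :kind and :text fields are assumptions made for illustration only.

```ruby
# Hypothetical stand-in; real nodes are classified via text_node? / comment_node?.
module NoiseSketch
  Node = Struct.new(:kind, :text)

  # Returns the noise dimension for a node, or nil when the node is signal.
  def self.dimension_for(node)
    if node.kind == :text && !node.text.empty? && node.text.strip.empty?
      :whitespace_adjacency
    elsif node.kind == :comment
      :comments
    end
  end

  def self.noise?(node)
    !dimension_for(node).nil?
  end
end

nodes = [
  NoiseSketch::Node.new(:text, "\n  "),        # indentation between elements
  NoiseSketch::Node.new(:comment, " todo "),
  NoiseSketch::Node.new(:element, ""),
  NoiseSketch::Node.new(:text, "")             # empty text is not noise
]

nodes.map { |n| NoiseSketch.dimension_for(n) }
# => [:whitespace_adjacency, :comments, nil, nil]
nodes.reject { |n| NoiseSketch.noise?(n) }.length  # => 2
```

Returning a dimension symbol instead of a bare boolean lets a caller decide per dimension whether that kind of noise should be ignored.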
.parent(node) ⇒ Object
    # File 'lib/canon/comparison/node_inspector.rb', line 99
    def self.parent(node)
      return nil unless node
      return node.parent if node.is_a?(Canon::Xml::Node)
      return node.parent if node.is_a?(Canon::TreeDiff::Core::TreeNode)

      XmlParsing.parent(node)
    end
.parse_errors(node) ⇒ Object
    # File 'lib/canon/comparison/node_inspector.rb', line 153
    def self.parse_errors(node)
      return [] if node.nil?
      return Array(node.parse_errors).map(&:to_s) if node.is_a?(Canon::Xml::Node)

      if XmlBackend.nokogiri?
        if node.is_a?(Nokogiri::XML::Document) || node.is_a?(Nokogiri::HTML5::Document)
          Array(node.errors).map(&:to_s)
        else
          []
        end
      else
        []
      end
    end
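parse_errors always returns an array of strings, whatever the underlying node reports. The normalization step is plain Ruby and can be shown on its own; normalize_errors is a hypothetical name used only for this sketch.

```ruby
# Hypothetical helper showing the Array(...).map(&:to_s) normalization.
def normalize_errors(raw)
  # Kernel#Array turns nil into [], wraps a single object, and leaves arrays as-is.
  Array(raw).map(&:to_s)
end

normalize_errors(nil)                                  # => []
normalize_errors([StandardError.new("unclosed tag")])  # => ["unclosed tag"]
normalize_errors("stray text")                         # => ["stray text"]
```

Callers can therefore iterate over the result or test it with empty? without checking for nil or for error object types.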
.text_content(node) ⇒ Object
    # File 'lib/canon/comparison/node_inspector.rb', line 115
    def self.text_content(node)
      return node.value.to_s if node.is_a?(Canon::Xml::Nodes::TextNode)
      return node.text_content.to_s if node.is_a?(Canon::Xml::Node)

      XmlParsing.text_content(node).to_s
    end
.text_node?(node) ⇒ Boolean
— Type predicates —
    # File 'lib/canon/comparison/node_inspector.rb', line 20
    def self.text_node?(node)
      return false unless node
      return node.node_type == :text if node.is_a?(Canon::Xml::Node)

      XmlParsing.text_node?(node)
    end
.whitespace_only_text?(node) ⇒ Boolean
True when node is a text node whose content is whitespace-only. Empty-string text nodes return false — those represent genuine empty-vs-content asymmetry, not pretty-print indentation.
    # File 'lib/canon/comparison/node_inspector.rb', line 68
    def self.whitespace_only_text?(node)
      return false unless text_node?(node)

      text = text_content(node)
      !text.empty? && text.strip.empty?
    end
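The distinction between empty and whitespace-only text is easiest to see on plain strings. The sketch below applies the same expression as the last line of the method; whitespace_only_string? is a hypothetical name, and the text_node? guard is omitted.

```ruby
# Hypothetical helper: the string half of whitespace_only_text?.
def whitespace_only_string?(text)
  # Non-empty, but nothing is left once surrounding whitespace is stripped.
  !text.empty? && text.strip.empty?
end

whitespace_only_string?("\n    ")  # => true  (pretty-print indentation)
whitespace_only_string?("")        # => false (empty, not whitespace)
whitespace_only_string?(" a ")     # => false (has real content)
```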