Module: Canon::Comparison::XmlNodeComparison
Defined in: lib/canon/comparison/xml_node_comparison.rb
Overview
XML Node Comparison Utilities
Provides public comparison methods for XML/HTML nodes. This module extracts shared comparison logic that was previously accessed via send() from HtmlComparator.
This is a simple utility module with focused responsibilities.
Class Method Summary
- .add_difference(node1, node2, diff1, diff2, dimension, opts, differences) ⇒ Object
  Add a difference to the differences array.
- .comment_node?(node, check_children: false) ⇒ Boolean
  Check if a node is a comment node.
- .comment_vs_non_comment_comparison?(node1, node2) ⇒ Boolean
  Check if this is a comment vs non-comment comparison.
- .compare_document_fragments(node1, node2, opts, child_opts, diff_children, differences) ⇒ Symbol
  Compare document fragments by comparing their children.
- .compare_nodes(node1, node2, opts, child_opts, diff_children, differences) ⇒ Symbol
  Main comparison dispatcher for XML nodes.
- .dispatch_by_node_type(node1, node2, opts, child_opts, diff_children, differences) ⇒ Symbol
  Dispatch comparison based on node type.
- .dispatch_canon_node_type(node1, node2, opts, child_opts, diff_children, differences) ⇒ Object
  Dispatch by Canon::Xml::Node type.
- .dispatch_legacy_node_type(node1, node2, opts, child_opts, diff_children, differences) ⇒ Object
  Dispatch by legacy Nokogiri/Moxml node type.
- .filter_children(children, opts) ⇒ Object
  Filter children; delegates to MarkupComparator.
- .node_excluded?(node, opts) ⇒ Boolean
  Check if a node should be excluded; delegates to MarkupComparator.
- .node_text(node) ⇒ String
  Extract text content from a node.
- .opts_for_side(opts, side) ⇒ Hash
  Build a side-specific opts copy that activates the pretty-print structural-whitespace heuristic for the given side.
- .same_node_type?(node1, node2) ⇒ Boolean
  Check if two nodes are of the same type.
- .text_node?(node) ⇒ Boolean
  Check if a node is a text node.
Class Method Details
.add_difference(node1, node2, diff1, diff2, dimension, opts, differences) ⇒ Object
Add a difference to the differences array
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 323
    def self.add_difference(node1, node2, diff1, diff2, dimension, opts,
                            differences)
      return unless opts[:verbose]

      XmlComparator.add_difference(node1, node2, diff1, diff2, dimension,
                                   opts, differences)
    end
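The listing shows that nothing is recorded unless opts[:verbose] is truthy. A minimal sketch of that guard, assuming a hypothetical record_difference helper that simply appends to the array (the real recording is done by XmlComparator.add_difference, whose internals are not shown here):

```ruby
# Hypothetical stand-in for the verbose guard; not Canon's implementation.
def record_difference(differences, opts, entry)
  return unless opts[:verbose]

  differences << entry
end

quiet = []
record_difference(quiet, { verbose: false }, :text_content)

verbose = []
record_difference(verbose, { verbose: true }, :text_content)

p quiet   # stays empty because verbose is off
p verbose # holds the recorded entry
```

Callers that rely on the differences array therefore need to pass verbose: true in opts.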
.comment_node?(node, check_children: false) ⇒ Boolean
Check if a node is a comment node
For XML/XHTML, this checks the node's comment? method or node_type. For HTML, this also checks TEXT nodes that contain HTML-style comments (Nokogiri parses HTML comments as TEXT nodes with content like "<!-- comment -->" or escaped like "<\!-- comment -->" in full HTML documents).
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 237
    def self.comment_node?(node, check_children: false)
      return true if NodeInspector.comment_node?(node)

      if check_children && Canon::XmlParsing.element?(node) &&
          !Canon::XmlParsing.children(node).empty?
        node.children.any? { |child| NodeInspector.comment_node?(child) }
      else
        false
      end
    end
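The effect of check_children can be sketched without a parser. The example below is an illustration under stated assumptions: SketchNode and comment_like? are hypothetical stand-ins for real nodes and for NodeInspector.comment_node?, and node kinds are reduced to plain symbols.

```ruby
# Hypothetical minimal node; real nodes come from Canon, Nokogiri or Moxml.
SketchNode = Struct.new(:type, :children)

# Mirrors the shape of comment_node?: the node itself, then optionally
# its direct children when the node is an element.
def comment_like?(node, check_children: false)
  return true if node.type == :comment
  return false unless check_children && node.type == :element

  node.children.any? { |child| child.type == :comment }
end

comment = SketchNode.new(:comment, [])
wrapper = SketchNode.new(:element, [comment])
plain   = SketchNode.new(:element, [SketchNode.new(:text, [])])

p comment_like?(comment)
p comment_like?(wrapper)
p comment_like?(wrapper, check_children: true)
p comment_like?(plain, check_children: true)
```

Only direct children are inspected; a comment nested deeper inside the element is not considered by this check.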
.comment_vs_non_comment_comparison?(node1, node2) ⇒ Boolean
Check if this is a comment vs non-comment comparison
This handles the case where zip pairs a comment with a non-comment node because the two children arrays differ in length. In that case the caller records a :comments dimension difference instead of UNEQUAL_NODES_TYPES.
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 199
    def self.comment_vs_non_comment_comparison?(node1, node2)
      node1_comment = comment_node?(node1, check_children: true)
      node2_comment = comment_node?(node2, check_children: true)

      # XOR: exactly one is a comment
      node1_comment ^ node2_comment
    end
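A short sketch of how positional pairing produces a comment vs non-comment pair. PairNode and comment? are hypothetical names used only for this illustration; the XOR itself is Ruby's ^ operator on true/false, as in the listing.

```ruby
# Hypothetical minimal node type for the illustration.
PairNode = Struct.new(:type, :text)

def comment?(node)
  !node.nil? && node.type == :comment
end

expected = [PairNode.new(:comment, "note"), PairNode.new(:element, "p")]
received = [PairNode.new(:element, "p")]

# zip pairs by position, so the leading comment lines up with the element.
pairs = expected.zip(received)

# true only when exactly one side of the pair is a comment.
mismatched = pairs.map { |a, b| comment?(a) ^ comment?(b) }
p mismatched
```

The first pair is flagged because one side is a comment and the other is an element; the second pair (an element against nil) is not.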
.compare_document_fragments(node1, node2, opts, child_opts, diff_children, differences) ⇒ Symbol
Compare document fragments by comparing their children
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 126
    def self.compare_document_fragments(node1, node2, opts, child_opts,
                                        diff_children, differences)
      childrenode1 = node1.children.to_a
      childrenode2 = node2.children.to_a

      # Filter children before comparison to handle ignored nodes (like comments with :ignore).
      # Apply side-specific pretty-print heuristic when the relevant flag is active.
      children1 = filter_children(childrenode1, opts_for_side(opts, :expected))
      children2 = filter_children(childrenode2, opts_for_side(opts, :received))

      if children1.length != children2.length
        add_difference(node1, node2, Comparison::UNEQUAL_ELEMENTS,
                       Comparison::UNEQUAL_ELEMENTS, :text_content, opts,
                       differences)
        # Continue comparing children to find deeper differences like attribute values
        # Use zip to compare up to the shorter length
      end

      if children1.empty? && children2.empty?
        Comparison::EQUIVALENT
      else
        # Compare each pair of children (up to the shorter length)
        result = Comparison::EQUIVALENT
        children1.zip(children2).each do |child1, child2|
          # Skip if one is nil (due to different lengths)
          next if child1.nil? || child2.nil?

          child_result = compare_nodes(child1, child2, opts, child_opts,
                                       diff_children, differences)
          result = child_result unless result == Comparison::EQUIVALENT
        end
        result
      end
    end
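The pairing step in the listing relies on how Array#zip treats arrays of different lengths. A minimal sketch with plain strings standing in for child nodes (an assumption made so the example runs on its own):

```ruby
children1 = %w[a b c]
children2 = %w[a x]

# A length mismatch is detected up front, but pairing continues anyway.
length_mismatch = children1.length != children2.length

# zip is driven by the receiver: extra receiver elements pair with nil,
# and those pairs are skipped, as in the listing.
compared = children1.zip(children2).filter_map do |c1, c2|
  next if c1.nil? || c2.nil?

  [c1, c2, c1 == c2]
end

p length_mismatch
p compared

# When the receiver is the shorter array, zip drops the argument's extras.
p children2.zip(children1)
```

Either way, only the first two positions are compared; the unmatched third child is covered solely by the length check.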
.compare_nodes(node1, node2, opts, child_opts, diff_children, differences) ⇒ Symbol
Main comparison dispatcher for XML nodes
This method handles the high-level comparison logic, delegating to specific comparison methods based on node types.
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 25
    def self.compare_nodes(node1, node2, opts, child_opts, diff_children,
                           differences)
      # Handle DocumentFragment nodes - compare their children instead
      if Canon::XmlParsing.document_fragment?(node1) &&
          Canon::XmlParsing.document_fragment?(node2)
        return compare_document_fragments(node1, node2, opts, child_opts,
                                          diff_children, differences)
      end

      # Check if nodes should be excluded
      return Comparison::EQUIVALENT if node_excluded?(node1, opts) &&
        node_excluded?(node2, opts)

      if node_excluded?(node1, opts) || node_excluded?(node2, opts)
        add_difference(node1, node2, Comparison::MISSING_NODE,
                       Comparison::MISSING_NODE, :text_content, opts,
                       differences)
        return Comparison::MISSING_NODE
      end

      # Handle comment vs non-comment comparisons specially
      # When comparing a comment with a non-comment node (due to zip pairing),
      # create a :comments dimension difference instead of UNEQUAL_NODES_TYPES
      if comment_vs_non_comment_comparison?(node1, node2)
        match_opts = opts[:match_opts]
        comment_behavior = match_opts ? match_opts[:comments] : nil

        # Create a :comments dimension difference
        # The difference will be marked as normative or not based on the HtmlCompareProfile
        add_difference(node1, node2, Comparison::MISSING_NODE,
                       Comparison::MISSING_NODE, :comments, opts,
                       differences)

        # Return EQUIVALENT if comments are ignored, otherwise return UNEQUAL
        if comment_behavior == :ignore
          Comparison::EQUIVALENT
        else
          Comparison::UNEQUAL_COMMENTS
        end
      end

      # Check node types match
      unless same_node_type?(node1, node2)
        add_difference(node1, node2, Comparison::UNEQUAL_NODES_TYPES,
                       Comparison::UNEQUAL_NODES_TYPES, :text_content, opts,
                       differences)
        return Comparison::UNEQUAL_NODES_TYPES
      end

      # Dispatch based on node type
      dispatch_by_node_type(node1, node2, opts, child_opts, diff_children,
                            differences)
    end
.dispatch_by_node_type(node1, node2, opts, child_opts, diff_children, differences) ⇒ Symbol
Dispatch comparison based on node type
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 172
    def self.dispatch_by_node_type(node1, node2, opts, child_opts,
                                   diff_children, differences)
      if node1.is_a?(Canon::Xml::Node) && node2.is_a?(Canon::Xml::Node)
        dispatch_canon_node_type(node1, node2, opts, child_opts,
                                 diff_children, differences)
      else
        dispatch_legacy_node_type(node1, node2, opts, child_opts,
                                  diff_children, differences)
      end
    end
.dispatch_canon_node_type(node1, node2, opts, child_opts, diff_children, differences) ⇒ Object
Dispatch by Canon::Xml::Node type
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 266
    def self.dispatch_canon_node_type(node1, node2, opts, child_opts,
                                      diff_children, differences)
      case node1.node_type
      when :root
        XmlComparator.compare_children(node1, node2, opts, child_opts,
                                       diff_children, differences)
      when :element
        XmlComparator.compare_element_nodes(node1, node2, opts, child_opts,
                                            diff_children, differences)
      when :text
        XmlComparator.compare_text_nodes(node1, node2, opts, differences)
      when :comment
        XmlComparator.compare_comment_nodes(node1, node2, opts, differences)
      when :cdata
        XmlComparator.compare_text_nodes(node1, node2, opts, differences)
      when :processing_instruction
        XmlComparator.compare_processing_instruction_nodes(node1, node2, opts,
                                                           differences)
      else
        Comparison::EQUIVALENT
      end
    end
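The case statement above amounts to a lookup from node_type to a comparator method. A minimal sketch of that mapping, using a hypothetical HANDLERS table and handler_for helper (neither exists in Canon). The method names are the XmlComparator methods shown in the listing, and :equivalent stands in for returning Comparison::EQUIVALENT.

```ruby
# Hypothetical lookup table; the values name the XmlComparator methods
# that the listing calls for each Canon::Xml::Node type.
HANDLERS = {
  root: :compare_children,
  element: :compare_element_nodes,
  text: :compare_text_nodes,
  comment: :compare_comment_nodes,
  cdata: :compare_text_nodes,
  processing_instruction: :compare_processing_instruction_nodes
}.freeze

# Unknown node types fall through to the "equivalent" default.
def handler_for(node_type)
  HANDLERS.fetch(node_type, :equivalent)
end

p handler_for(:cdata)
p handler_for(:attribute)
```

Note that :text and :cdata share one handler, and that only the :root and :element handlers receive child_opts and diff_children in the listing.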
.dispatch_legacy_node_type(node1, node2, opts, child_opts, diff_children, differences) ⇒ Object
Dispatch by legacy Nokogiri/Moxml node type
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 290
    def self.dispatch_legacy_node_type(node1, node2, opts, child_opts,
                                       diff_children, differences)
      if Canon::XmlParsing.document?(node1)
        XmlComparator.compare_document_nodes(node1, node2, opts, child_opts,
                                             diff_children, differences)
      elsif Canon::XmlParsing.xml_node?(node1)
        if Canon::XmlParsing.element?(node1)
          XmlComparator.compare_element_nodes(node1, node2, opts, child_opts,
                                              diff_children, differences)
        elsif Canon::XmlParsing.text_node?(node1) ||
            Canon::XmlParsing.cdata?(node1)
          XmlComparator.compare_text_nodes(node1, node2, opts, differences)
        elsif Canon::XmlParsing.comment?(node1)
          XmlComparator.compare_comment_nodes(node1, node2, opts, differences)
        elsif Canon::XmlParsing.processing_instruction?(node1)
          XmlComparator.compare_processing_instruction_nodes(node1, node2,
                                                             opts, differences)
        else
          Comparison::EQUIVALENT
        end
      else
        Comparison::EQUIVALENT
      end
    end
.filter_children(children, opts) ⇒ Object
Filter children; delegates to MarkupComparator.
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 80
    def self.filter_children(children, opts)
      MarkupComparator.filter_children(children, opts)
    end
.node_excluded?(node, opts) ⇒ Boolean
Check if a node should be excluded; delegates to MarkupComparator.
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 186
    def self.node_excluded?(node, opts)
      MarkupComparator.node_excluded?(node, opts)
    end
.node_text(node) ⇒ String
Extract text content from a node
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 259
    def self.node_text(node)
      return "" unless node

      NodeInspector.text_content(node)
    end
.opts_for_side(opts, side) ⇒ Hash
Build a side-specific opts copy that activates the pretty-print structural-whitespace heuristic for the given side.
When pretty_printed_expected (side :expected) or pretty_printed_received (side :received) is truthy in match_opts, returns a shallow copy of opts with an ephemeral _pretty_print_side_active: true flag merged into :match_opts. Otherwise returns opts unchanged (no allocation overhead).
The flag is consumed by node_excluded? to drop whitespace-only text nodes that start with "\n" in :normalize whitespace elements. It is intentionally NOT propagated to recursive compare_nodes calls: each level of ChildComparison.compare re-evaluates it from the original pretty_printed_* flags.
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 102
    def self.opts_for_side(opts, side)
      match_opts = opts[:match_opts]
      return opts unless match_opts

      active = case side
               when :expected then match_opts[:pretty_printed_expected]
               when :received then match_opts[:pretty_printed_received]
               else false
               end
      return opts unless active

      opts.merge(match_opts: match_opts.merge(_pretty_print_side_active: true))
    end
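The copy-on-demand behavior can be illustrated with plain hashes. This is a minimal sketch, not the library code: side_opts is a hypothetical stand-in that repeats the same merge logic so the example runs without Canon loaded.

```ruby
# Hypothetical stand-in mirroring the merge logic of opts_for_side.
def side_opts(opts, side)
  match_opts = opts[:match_opts]
  return opts unless match_opts

  key = { expected: :pretty_printed_expected,
          received: :pretty_printed_received }[side]
  return opts unless key && match_opts[key]

  opts.merge(match_opts: match_opts.merge(_pretty_print_side_active: true))
end

opts = { verbose: true, match_opts: { pretty_printed_expected: true } }

expected_opts = side_opts(opts, :expected)
received_opts = side_opts(opts, :received)

# The :expected side gets a copy carrying the ephemeral flag.
p expected_opts[:match_opts][:_pretty_print_side_active]
# The :received side gets the very same object back (no new hash).
p received_opts.equal?(opts)
# The caller's hash is never mutated.
p opts[:match_opts].key?(:_pretty_print_side_active)
```

Because Hash#merge returns a new hash, the flag lives only in the per-side copy, which is why deeper levels have to re-derive it from the pretty_printed_* flags.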
.same_node_type?(node1, node2) ⇒ Boolean
Check if two nodes are of the same type
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 212
    def self.same_node_type?(node1, node2)
      return false if node1.class != node2.class

      case node1
      when Canon::Xml::Node
        node1.node_type == node2.node_type
      else
        if Canon::XmlBackend.nokogiri?
          node1.is_a?(Nokogiri::XML::Node) &&
            node1.node_type == node2.node_type
        else
          Canon::XmlParsing.xml_node?(node1) &&
            Canon::XmlParsing.node_type(node1) == Canon::XmlParsing.node_type(node2)
        end
      end
    end
.text_node?(node) ⇒ Boolean
Check if a node is a text node
    # File 'lib/canon/comparison/xml_node_comparison.rb', line 251
    def self.text_node?(node)
      NodeInspector.text_node?(node)
    end