Module: Canon::Comparison::XmlNodeComparison

Defined in:
lib/canon/comparison/xml_node_comparison.rb

Overview

XML Node Comparison Utilities

Provides public comparison methods for XML/HTML nodes. This module collects the shared comparison logic that HtmlComparator previously reached via send(), so it can be called through a public interface instead.

Class Method Details

.add_difference(node1, node2, diff1, diff2, dimension, opts, differences) ⇒ Object

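Append a difference to the differences array. The call is a no-op unless opts[:verbose] is truthy; otherwise it delegates to XmlComparator.add_difference.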

Parameters:

  • node1 (Object)

    First node

  • node2 (Object)

    Second node

  • diff1 (Symbol)

    Difference type for node1

  • diff2 (Symbol)

    Difference type for node2

  • dimension (Symbol)

    The dimension of the difference

  • opts (Hash)

    Comparison options

  • differences (Array)

    Array to append difference to



# File 'lib/canon/comparison/xml_node_comparison.rb', line 271

def self.add_difference(node1, node2, diff1, diff2, dimension, opts,
                        differences)
  return unless opts[:verbose]

  require_relative "xml_comparator"
  XmlComparator.add_difference(node1, node2, diff1, diff2, dimension,
                               opts, differences)
end

.comment_node?(node) ⇒ Boolean

Check if a node is a comment node

Parameters:

  • node (Object)

    Node to check

Returns:

  • (Boolean)

    true if node is a comment



# File 'lib/canon/comparison/xml_node_comparison.rb', line 176

def self.comment_node?(node)
  (node.respond_to?(:comment?) && node.comment?) ||
    (node.respond_to?(:node_type) && node.node_type == :comment)
end
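
The two duck-typing paths can be illustrated with a minimal sketch. LegacyComment, CanonComment, and CommentCheckDemo are hypothetical stand-ins invented for this example, not classes from Canon, Nokogiri, or Moxml; the predicate body is copied from the listing above so the snippet runs on its own.

```ruby
# Hypothetical stand-in for a Nokogiri/Moxml-style node that answers comment?.
LegacyComment = Struct.new(:content) do
  def comment?
    true
  end
end

# Hypothetical stand-in for a Canon::Xml-style node that reports a Symbol node_type.
CanonComment = Struct.new(:node_type, :value)

module CommentCheckDemo
  # Same predicate as comment_node?, parenthesized to make precedence explicit.
  def self.comment_node?(node)
    (node.respond_to?(:comment?) && node.comment?) ||
      (node.respond_to?(:node_type) && node.node_type == :comment)
  end
end

legacy = CommentCheckDemo.comment_node?(LegacyComment.new("note"))
canon  = CommentCheckDemo.comment_node?(CanonComment.new(:comment, "note"))
text   = CommentCheckDemo.comment_node?(CanonComment.new(:text, "note"))
```

Here legacy and canon are both true, while text is false because the node neither answers comment? nor reports :comment as its node_type.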

.compare_document_fragments(node1, node2, opts, child_opts, diff_children, differences) ⇒ Symbol

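Compare two document fragments by comparing their children pairwise. Fragments with different child counts are reported as Comparison::UNEQUAL_ELEMENTS, and two empty fragments are Comparison::EQUIVALENT.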

Parameters:

  • node1 (Nokogiri::XML::DocumentFragment)

    First fragment

  • node2 (Nokogiri::XML::DocumentFragment)

    Second fragment

  • opts (Hash)

    Comparison options

  • child_opts (Hash)

    Options for child comparison

  • diff_children (Boolean)

    Whether to diff children

  • differences (Array)

    Array to append differences to

Returns:

  • (Symbol)

    Comparison result constant



# File 'lib/canon/comparison/xml_node_comparison.rb', line 79

def self.compare_document_fragments(node1, node2, opts, child_opts,
                                    diff_children, differences)
  childrenode1 = node1.children.to_a
  childrenode2 = node2.children.to_a

  if childrenode1.length != childrenode2.length
    add_difference(node1, node2, Comparison::UNEQUAL_ELEMENTS,
                   Comparison::UNEQUAL_ELEMENTS, :text_content, opts,
                   differences)
    Comparison::UNEQUAL_ELEMENTS
  elsif childrenode1.empty?
    Comparison::EQUIVALENT
  else
    # Compare each pair of children
    result = Comparison::EQUIVALENT
    childrenode1.zip(childrenode2).each do |child1, child2|
      child_result = compare_nodes(child1, child2, opts, child_opts,
                                   diff_children, differences)
      # Keep the first non-equivalent child result
      result = child_result if result == Comparison::EQUIVALENT
    end
    result
  end
end
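
The pairwise aggregation can be sketched in isolation as follows. This assumes the intent is to report the first non-equivalent child result; FragmentDemo, its placeholder result symbols, and compare_leaf are hypothetical and only stand in for Canon::Comparison constants and compare_nodes.

```ruby
module FragmentDemo
  # Placeholder result values; the real constants live in Canon::Comparison.
  EQUIVALENT = :equivalent
  UNEQUAL_ELEMENTS = :unequal_elements
  UNEQUAL_TEXT = :unequal_text

  # Hypothetical leaf comparison standing in for compare_nodes.
  def self.compare_leaf(a, b)
    a == b ? EQUIVALENT : UNEQUAL_TEXT
  end

  def self.compare_children(children1, children2)
    return UNEQUAL_ELEMENTS if children1.length != children2.length
    return EQUIVALENT if children1.empty?

    result = EQUIVALENT
    children1.zip(children2).each do |c1, c2|
      child_result = compare_leaf(c1, c2)
      # Every pair is still compared; only the first mismatch is remembered.
      result = child_result if result == EQUIVALENT
    end
    result
  end
end

same      = FragmentDemo.compare_children(%w[a b], %w[a b])
different = FragmentDemo.compare_children(%w[a b], %w[a x])
uneven    = FragmentDemo.compare_children(%w[a], %w[a b])
empty     = FragmentDemo.compare_children([], [])
```

The loop deliberately does not break on the first mismatch, so in the real method every child pair still gets the chance to append to differences.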

.compare_nodes(node1, node2, opts, child_opts, diff_children, differences) ⇒ Symbol

Main comparison dispatcher for XML nodes

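This method handles the high-level comparison logic. It unwraps document fragments, applies the exclusion rules, checks that both nodes have the same type, and then delegates to the comparison method for that node type.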

Parameters:

  • node1 (Object)

    First node

  • node2 (Object)

    Second node

  • opts (Hash)

    Comparison options

  • child_opts (Hash)

    Options for child comparison

  • diff_children (Boolean)

    Whether to diff children

  • differences (Array)

    Array to append differences to

Returns:

  • (Symbol)

    Comparison result constant



# File 'lib/canon/comparison/xml_node_comparison.rb', line 23

def self.compare_nodes(node1, node2, opts, child_opts, diff_children,
                       differences)
  # Handle DocumentFragment nodes - compare their children instead
  if node1.is_a?(Nokogiri::XML::DocumentFragment) &&
      node2.is_a?(Nokogiri::XML::DocumentFragment)
    return compare_document_fragments(node1, node2, opts, child_opts,
                                      diff_children, differences)
  end

  # Check if nodes should be excluded
  return Comparison::EQUIVALENT if node_excluded?(node1, opts) &&
    node_excluded?(node2, opts)

  if node_excluded?(node1, opts) || node_excluded?(node2, opts)
    add_difference(node1, node2, Comparison::MISSING_NODE,
                   Comparison::MISSING_NODE, :text_content, opts,
                   differences)
    return Comparison::MISSING_NODE
  end

  # Check node types match
  unless same_node_type?(node1, node2)
    add_difference(node1, node2, Comparison::UNEQUAL_NODES_TYPES,
                   Comparison::UNEQUAL_NODES_TYPES, :text_content, opts,
                   differences)
    return Comparison::UNEQUAL_NODES_TYPES
  end

  # Dispatch based on node type
  dispatch_by_node_type(node1, node2, opts, child_opts, diff_children,
                        differences)
end
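
The order of the guard clauses matters, and a small sketch may make it easier to follow. PrecheckNode and demo_precheck are hypothetical: the ignored flag stands in for node_excluded?(node, opts), the returned symbols are placeholders for the Comparison constants, and the fragment branch is left out.

```ruby
# Hypothetical node: `ignored` stands in for node_excluded?(node, opts).
PrecheckNode = Struct.new(:node_type, :ignored)

# Returns a placeholder symbol naming the branch compare_nodes would take.
def demo_precheck(node1, node2)
  return :equivalent if node1.ignored && node2.ignored
  return :missing_node if node1.ignored || node2.ignored
  return :unequal_node_types unless node1.node_type == node2.node_type

  :dispatch
end

both_ignored = demo_precheck(PrecheckNode.new(:comment, true), PrecheckNode.new(:comment, true))
one_ignored  = demo_precheck(PrecheckNode.new(:comment, true), PrecheckNode.new(:text, false))
type_clash   = demo_precheck(PrecheckNode.new(:element, false), PrecheckNode.new(:text, false))
comparable   = demo_precheck(PrecheckNode.new(:text, false), PrecheckNode.new(:text, false))
```

Two excluded nodes count as equivalent, a single excluded node is treated as a missing node, and the type check only runs when neither node is excluded.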

.dispatch_by_node_type(node1, node2, opts, child_opts, diff_children, differences) ⇒ Symbol

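Dispatch comparison based on how the nodes report their type. Nodes whose node_type is a Symbol (Canon::Xml::Node) go to dispatch_canon_node_type; everything else, including Nokogiri nodes whose node_type is an Integer, goes to dispatch_legacy_node_type.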

Parameters:

  • node1 (Object)

    First node

  • node2 (Object)

    Second node

  • opts (Hash)

    Comparison options

  • child_opts (Hash)

    Options for child comparison

  • diff_children (Boolean)

    Whether to diff children

  • differences (Array)

    Array to append differences to

Returns:

  • (Symbol)

    Comparison result constant



# File 'lib/canon/comparison/xml_node_comparison.rb', line 112

def self.dispatch_by_node_type(node1, node2, opts, child_opts,
                               diff_children, differences)
  # Canon::Xml::Node types use .node_type method that returns symbols
  # Nokogiri also has .node_type but returns integers, so check for Symbol
  if node1.respond_to?(:node_type) && node2.respond_to?(:node_type) &&
      node1.node_type.is_a?(Symbol) && node2.node_type.is_a?(Symbol)
    dispatch_canon_node_type(node1, node2, opts, child_opts,
                             diff_children, differences)
  # Moxml/Nokogiri types use .element?, .text?, etc. methods
  else
    dispatch_legacy_node_type(node1, node2, opts, child_opts,
                              diff_children, differences)
  end
end
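
The Symbol-versus-Integer test can be sketched on its own. CanonStyleNode, NokogiriStyleNode, and demo_dispatch_path are hypothetical names for this example; the integers 1 and 3 are libxml2's element and text node type codes, which is what Nokogiri's node_type returns.

```ruby
# Hypothetical stand-ins: Canon-style nodes report a Symbol node_type,
# Nokogiri-style nodes report an Integer.
CanonStyleNode = Struct.new(:node_type)
NokogiriStyleNode = Struct.new(:node_type)

# Returns :canon only when both nodes report a Symbol node_type.
def demo_dispatch_path(node1, node2)
  symbolic = [node1, node2].all? do |n|
    n.respond_to?(:node_type) && n.node_type.is_a?(Symbol)
  end
  symbolic ? :canon : :legacy
end

canon_path  = demo_dispatch_path(CanonStyleNode.new(:element), CanonStyleNode.new(:element))
legacy_path = demo_dispatch_path(NokogiriStyleNode.new(1), NokogiriStyleNode.new(1))
mixed_path  = demo_dispatch_path(CanonStyleNode.new(:text), NokogiriStyleNode.new(3))
```

The mixed case falls through to the legacy path in this sketch; in compare_nodes such a pair would normally be rejected earlier by same_node_type?, because the two nodes have different classes.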

.dispatch_canon_node_type(node1, node2, opts, child_opts, diff_children, differences) ⇒ Object

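Dispatch by Canon::Xml::Node type. CDATA nodes are compared as text nodes, and unrecognized node types are treated as Comparison::EQUIVALENT.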



# File 'lib/canon/comparison/xml_node_comparison.rb', line 210

def self.dispatch_canon_node_type(node1, node2, opts, child_opts,
                                  diff_children, differences)
  # Import XmlComparator to use its comparison methods
  require_relative "xml_comparator"

  case node1.node_type
  when :root
    XmlComparator.compare_children(node1, node2, opts, child_opts,
                                   diff_children, differences)
  when :element
    XmlComparator.compare_element_nodes(node1, node2, opts, child_opts,
                                        diff_children, differences)
  when :text
    XmlComparator.compare_text_nodes(node1, node2, opts, differences)
  when :comment
    XmlComparator.compare_comment_nodes(node1, node2, opts, differences)
  when :cdata
    XmlComparator.compare_text_nodes(node1, node2, opts, differences)
  when :processing_instruction
    XmlComparator.compare_processing_instruction_nodes(node1, node2,
                                                       opts, differences)
  else
    Comparison::EQUIVALENT
  end
end

.dispatch_legacy_node_type(node1, node2, opts, child_opts, diff_children, differences) ⇒ Object

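Dispatch by legacy Nokogiri/Moxml node type, using predicate methods such as element? and text? on node1. Nodes that match none of the predicates are treated as Comparison::EQUIVALENT.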



# File 'lib/canon/comparison/xml_node_comparison.rb', line 237

def self.dispatch_legacy_node_type(node1, node2, opts, child_opts,
                                   diff_children, differences)
  # Import XmlComparator to use its comparison methods
  require_relative "xml_comparator"

  if node1.respond_to?(:element?) && node1.element?
    XmlComparator.compare_element_nodes(node1, node2, opts, child_opts,
                                        diff_children, differences)
  elsif node1.respond_to?(:text?) && node1.text?
    XmlComparator.compare_text_nodes(node1, node2, opts, differences)
  elsif node1.respond_to?(:comment?) && node1.comment?
    XmlComparator.compare_comment_nodes(node1, node2, opts, differences)
  elsif node1.respond_to?(:cdata?) && node1.cdata?
    XmlComparator.compare_text_nodes(node1, node2, opts, differences)
  elsif node1.respond_to?(:processing_instruction?) && node1.processing_instruction?
    XmlComparator.compare_processing_instruction_nodes(node1, node2,
                                                       opts, differences)
  elsif node1.respond_to?(:root)
    XmlComparator.compare_document_nodes(node1, node2, opts, child_opts,
                                         diff_children, differences)
  else
    Comparison::EQUIVALENT
  end
end

.filter_children(children, opts) ⇒ Array

Filter children based on options

Removes nodes that node_excluded? rejects, based on the :ignore_nodes, :ignore_comments, and :ignore_text_nodes options and the :structural_whitespace match option.

Parameters:

  • children (Array)

    Array of child nodes

  • opts (Hash)

    Comparison options

Returns:

  • (Array)

    Filtered array of children



# File 'lib/canon/comparison/xml_node_comparison.rb', line 64

def self.filter_children(children, opts)
  children.reject do |child|
    node_excluded?(child, opts)
  end
end
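
A minimal sketch of how the options shape the filtered list, assuming simplified exclusion rules. DemoChild, demo_excluded?, and demo_filter_children are hypothetical; the exclusion check only covers :ignore_nodes and :ignore_comments and is not the full node_excluded? logic.

```ruby
# Hypothetical child node with a Symbol node_type, in the Canon::Xml style.
DemoChild = Struct.new(:node_type, :value)

# Simplified stand-in for node_excluded?: two of its rules only.
def demo_excluded?(node, opts)
  return true if opts[:ignore_nodes]&.include?(node)
  return true if opts[:ignore_comments] && node.node_type == :comment

  false
end

def demo_filter_children(children, opts)
  children.reject { |child| demo_excluded?(child, opts) }
end

title    = DemoChild.new(:element, "title")
note     = DemoChild.new(:comment, "todo")
footer   = DemoChild.new(:element, "footer")
children = [title, note, footer]

without_comments = demo_filter_children(children, { ignore_comments: true })
without_footer   = demo_filter_children(children, { ignore_nodes: [footer] })
untouched        = demo_filter_children(children, {})
```

With no options set, the list comes back unchanged, which matches the early return paths in node_excluded?.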

.node_excluded?(node, opts) ⇒ Boolean

Check if a node should be excluded from comparison

Parameters:

  • node (Object)

    Node to check

  • opts (Hash)

    Comparison options

Returns:

  • (Boolean)

    true if node should be excluded



# File 'lib/canon/comparison/xml_node_comparison.rb', line 134

def self.node_excluded?(node, opts)
  return false if node.nil?
  return true if opts[:ignore_nodes]&.include?(node)
  return true if opts[:ignore_comments] && comment_node?(node)
  return true if opts[:ignore_text_nodes] && text_node?(node)

  # Check structural_whitespace match option
  match_opts = opts[:match_opts]
  return false unless match_opts

  # Filter out whitespace-only text nodes based on structural_whitespace setting
  # - :ignore or :normalize: Filter all whitespace-only text nodes
  # - :strict: Preserve all whitespace-only text nodes (don't filter any)
  if text_node?(node) &&
      %i[ignore normalize].include?(match_opts[:structural_whitespace])
    text = node_text(node)
    return true if MatchOptions.normalize_text(text).empty?
  end

  false
end
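
The structural-whitespace branch is the least obvious part of this method, so here is a sketch of just that rule. WhitespaceDemoNode and demo_whitespace_excluded? are hypothetical, and String#strip is used as an assumed stand-in for MatchOptions.normalize_text, whose exact behavior is not shown in this page.

```ruby
# Hypothetical text node in the Canon::Xml style.
WhitespaceDemoNode = Struct.new(:node_type, :value)

# Sketch of the whitespace rule only: returns true when the node would be excluded.
def demo_whitespace_excluded?(node, opts)
  match_opts = opts[:match_opts]
  return false unless match_opts
  return false unless node.node_type == :text
  return false unless %i[ignore normalize].include?(match_opts[:structural_whitespace])

  # Assumption: stands in for MatchOptions.normalize_text(text).empty?
  node.value.to_s.strip.empty?
end

indent = WhitespaceDemoNode.new(:text, "\n  ")
word   = WhitespaceDemoNode.new(:text, " hi ")

excluded_ignore  = demo_whitespace_excluded?(indent, { match_opts: { structural_whitespace: :ignore } })
excluded_strict  = demo_whitespace_excluded?(indent, { match_opts: { structural_whitespace: :strict } })
excluded_no_opts = demo_whitespace_excluded?(indent, {})
excluded_word    = demo_whitespace_excluded?(word, { match_opts: { structural_whitespace: :normalize } })
```

Only the indentation node under :ignore is excluded; :strict, a missing :match_opts hash, and text with visible content all keep the node in the comparison.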

.node_text(node) ⇒ String

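Extract text content from a node. Tries content, then text, then value, and returns an empty string for nil or for nodes that respond to none of them.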

Parameters:

  • node (Object)

    Node to extract text from

Returns:

  • (String)

    Text content



# File 'lib/canon/comparison/xml_node_comparison.rb', line 195

def self.node_text(node)
  return "" unless node

  if node.respond_to?(:content)
    node.content.to_s
  elsif node.respond_to?(:text)
    node.text.to_s
  elsif node.respond_to?(:value)
    node.value.to_s
  else
    ""
  end
end
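
The fallback chain can be written more compactly; the following is a sketch of an equivalent formulation, not the library's code. WithContent, WithValue, and demo_node_text are hypothetical names for this example.

```ruby
# Hypothetical nodes exposing different reader methods.
WithContent = Struct.new(:content)
WithValue = Struct.new(:value)

# Tries content, then text, then value; first reader the node responds to wins.
def demo_node_text(node)
  return "" unless node

  %i[content text value].each do |reader|
    return node.public_send(reader).to_s if node.respond_to?(reader)
  end
  ""
end

from_content = demo_node_text(WithContent.new("hello"))
from_value   = demo_node_text(WithValue.new(42))
from_nil     = demo_node_text(nil)
from_unknown = demo_node_text(Object.new)
```

Because of to_s, a reader that returns nil or a non-string still yields a String, so callers never need a nil check on the result.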

.same_node_type?(node1, node2) ⇒ Boolean

Check if two nodes are of the same type

Parameters:

  • node1 (Object)

    First node

  • node2 (Object)

    Second node

Returns:

  • (Boolean)

    true if nodes are same type



# File 'lib/canon/comparison/xml_node_comparison.rb', line 161

def self.same_node_type?(node1, node2)
  return false if node1.class != node2.class

  # For Nokogiri/Canon::Xml nodes, check node type
  if node1.respond_to?(:node_type) && node2.respond_to?(:node_type)
    node1.node_type == node2.node_type
  else
    true
  end
end
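
The class check runs before node_type is consulted, which the following sketch illustrates. CanonLikeText and OtherLibText are hypothetical structs; the method body is copied from the listing above under a demo name.

```ruby
# Two hypothetical node classes that both report a node_type.
CanonLikeText = Struct.new(:node_type)
OtherLibText = Struct.new(:node_type)

def demo_same_node_type?(node1, node2)
  return false if node1.class != node2.class

  if node1.respond_to?(:node_type) && node2.respond_to?(:node_type)
    node1.node_type == node2.node_type
  else
    true
  end
end

same_class_same_type  = demo_same_node_type?(CanonLikeText.new(:text), CanonLikeText.new(:text))
same_class_other_type = demo_same_node_type?(CanonLikeText.new(:text), CanonLikeText.new(:comment))
other_class_same_type = demo_same_node_type?(CanonLikeText.new(:text), OtherLibText.new(:text))
plain_objects         = demo_same_node_type?("a", "b")
```

Nodes from different classes never match, even when both report :text, and objects without node_type are considered the same type whenever their classes agree.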

.serialize_node_to_xml(node) ⇒ String

Serialize a Canon::Xml::Node to XML string

This utility method handles serialization of different node types to their string representation for display and debugging purposes.

Parameters:

  • node (Canon::Xml::Node, Object)

    Node to serialize

Returns:

  • (String)

    XML string representation



# File 'lib/canon/comparison/xml_node_comparison.rb', line 287

def self.serialize_node_to_xml(node)
  if node.is_a?(Canon::Xml::Nodes::RootNode)
    # Serialize all children of root
    node.children.map { |child| serialize_node_to_xml(child) }.join
  elsif node.is_a?(Canon::Xml::Nodes::ElementNode)
    # Serialize element with attributes and children
    attrs = node.attribute_nodes.map do |a|
      " #{a.name}=\"#{a.value}\""
    end.join
    children_xml = node.children.map do |c|
      serialize_node_to_xml(c)
    end.join

    if children_xml.empty?
      "<#{node.name}#{attrs}/>"
    else
      "<#{node.name}#{attrs}>#{children_xml}</#{node.name}>"
    end
  elsif node.is_a?(Canon::Xml::Nodes::TextNode)
    node.value
  elsif node.is_a?(Canon::Xml::Nodes::CommentNode)
    "<!--#{node.value}-->"
  elsif node.is_a?(Canon::Xml::Nodes::ProcessingInstructionNode)
    "<?#{node.target} #{node.data}?>"
  elsif node.respond_to?(:to_xml)
    node.to_xml
  else
    node.to_s
  end
end
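
The recursive shape of the serializer can be sketched with plain structs. SerDemoElement, SerDemoText, SerDemoComment, and demo_serialize are hypothetical and only mirror the element, text, and comment branches; attributes are modeled as a Hash here, which is an assumption and differs from the attribute_nodes collection used above.

```ruby
# Hypothetical node types standing in for Canon::Xml::Nodes classes.
SerDemoElement = Struct.new(:name, :attributes, :children)
SerDemoText = Struct.new(:value)
SerDemoComment = Struct.new(:value)

def demo_serialize(node)
  case node
  when SerDemoElement
    attrs = node.attributes.map { |k, v| " #{k}=\"#{v}\"" }.join
    inner = node.children.map { |c| demo_serialize(c) }.join
    # Elements without serialized children collapse to a self-closing tag.
    inner.empty? ? "<#{node.name}#{attrs}/>" : "<#{node.name}#{attrs}>#{inner}</#{node.name}>"
  when SerDemoText then node.value
  when SerDemoComment then "<!--#{node.value}-->"
  else node.to_s
  end
end

tree = SerDemoElement.new("p", { "class" => "x" }, [
  SerDemoText.new("hi"),
  SerDemoComment.new("c"),
  SerDemoElement.new("br", {}, [])
])
xml = demo_serialize(tree)
# xml is the string: <p class="x">hi<!--c--><br/></p>
```

As in the listing above, attribute values and text are interpolated without escaping, which suits display and debugging output but would not be safe for producing XML that must be parsed again.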

.text_node?(node) ⇒ Boolean

Check if a node is a text node

Parameters:

  • node (Object)

    Node to check

Returns:

  • (Boolean)

    true if node is a text node



# File 'lib/canon/comparison/xml_node_comparison.rb', line 185

def self.text_node?(node)
  (node.respond_to?(:text?) && node.text? &&
    !node.respond_to?(:element?)) ||
    (node.respond_to?(:node_type) && node.node_type == :text)
end