Class: Canon::Comparison::XmlComparatorHelpers::NodeParser

Inherits:
Object
Defined in:
lib/canon/comparison/xml_comparator/node_parser.rb

Overview

Node parser with preprocessing support. Handles conversion of strings and various node types to Canon::Xml::Node.

Class Method Details

.apply_preprocessing(xml_string, preprocessing) ⇒ String

Apply preprocessing transformation to XML string

Parameters:

  • xml_string (String)

    XML string to preprocess

  • preprocessing (Symbol)

    Preprocessing mode (:normalize, :c14n, or :format); any other value, including :none, returns the string unchanged

Returns:

  • (String)

    Preprocessed XML string



# File 'lib/canon/comparison/xml_comparator/node_parser.rb', line 44

def self.apply_preprocessing(xml_string, preprocessing)
  case preprocessing
  when :normalize
    # Normalize whitespace: strip each line and drop blank lines
    # (whitespace inside a line is left untouched)
    xml_string.lines.map(&:strip).reject(&:empty?).join("\n")
  when :c14n
    # Canonicalize the XML
    Canon::Xml::C14n.canonicalize(xml_string, with_comments: false)
  when :format
    # Pretty format the XML
    Canon.format(xml_string, :xml)
  else
    # :none or unrecognized - use as-is
    xml_string
  end
end
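
To make the effect of the :normalize mode concrete, the following is a minimal stand-alone sketch that copies only that branch into a hypothetical helper named normalize_lines. It does not load Canon; the :c14n and :format branches are left out because they depend on the library. The sample input is invented for illustration.

```ruby
# Hypothetical helper: a stand-alone copy of the :normalize branch.
# It is not part of Canon's API.
def normalize_lines(xml_string)
  # Strip each line, discard lines that end up empty, rejoin with "\n".
  xml_string.lines.map(&:strip).reject(&:empty?).join("\n")
end

sample = "<root>\n    <a>1   2</a>\n\n  <b/>  \n</root>\n"
normalized = normalize_lines(sample)

puts normalized
# Indentation, trailing spaces and the blank line are gone, while the
# run of spaces inside <a> is preserved.
```

Note that the result has no trailing newline, because the lines are rejoined with "\n" rather than terminated by it.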

.convert_from_node(node, preserve_whitespace: false) ⇒ Canon::Xml::Node

Convert from Nokogiri/Moxml node to Canon::Xml::Node

Parameters:

  • node (Object)

    Nokogiri or Moxml node

  • preserve_whitespace (Boolean) (defaults to: false)

    Whether to preserve whitespace-only text nodes

Returns:

  • (Canon::Xml::Node)


# File 'lib/canon/comparison/xml_comparator/node_parser.rb', line 66

def self.convert_from_node(node, preserve_whitespace: false)
  # Convert to XML string then parse through DataModel
  xml_str = if node.respond_to?(:to_xml)
              node.to_xml
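            elsif node.respond_to?(:to_s)
              # Nearly every Ruby object responds to to_s, so the raise
              # below is reached only by objects that undefine it
              node.to_s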
            else
              raise Canon::Error,
                    "Unable to convert node to string: #{node.class}"
            end
  Canon::Xml::DataModel.from_xml(xml_str,
                                 preserve_whitespace: preserve_whitespace)
end
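
The serialization step above relies on duck typing rather than on a specific node class. A minimal sketch of that choice, using a hypothetical FakeNode struct in place of a real Nokogiri or Moxml node and a hypothetical serialize_node helper, might look like this. It stops before the Canon::Xml::DataModel.from_xml call, which requires the library.

```ruby
# Hypothetical stand-in for a Nokogiri/Moxml node; only #to_xml matters here.
FakeNode = Struct.new(:name) do
  def to_xml
    "<#{name}/>"
  end
end

# Hypothetical helper mirroring the serialization step:
# prefer #to_xml when available, otherwise fall back to #to_s.
def serialize_node(node)
  if node.respond_to?(:to_xml)
    node.to_xml
  else
    node.to_s
  end
end

from_node  = serialize_node(FakeNode.new("item")) # uses #to_xml
from_other = serialize_node(:plain)               # falls back to #to_s

puts from_node
puts from_other
```

Because the node is round-tripped through a string, anything the source node's serializer drops or rewrites will not reach the resulting Canon::Xml::Node.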

.parse(node, preprocessing = :none, preserve_whitespace: false) ⇒ Canon::Xml::Node

Parse a node from a string, convert a Nokogiri/Moxml node, or return an existing Canon::Xml::Node as-is. For string input, the preprocessing transformation is applied before parsing.

Parameters:

  • node (String, Object)

    Node to parse

  • preprocessing (Symbol) (defaults to: :none)

    Preprocessing mode (:none, :normalize, :c14n, :format)

  • preserve_whitespace (Boolean) (defaults to: false)

    Whether to preserve whitespace-only text nodes

Returns:

  • (Canon::Xml::Node)


# File 'lib/canon/comparison/xml_comparator/node_parser.rb', line 18

def self.parse(node, preprocessing = :none, preserve_whitespace: false)
  # If already a Canon::Xml::Node, return as-is
  return node if node.is_a?(Canon::Xml::Node)

  # If it's a Nokogiri or Moxml node, convert to DataModel
  unless node.is_a?(String)
    return convert_from_node(node,
                             preserve_whitespace: preserve_whitespace)
  end

  # Normalize encoding before preprocessing (UTF-16 strings can't use strip, etc.)
  node = Canon::Xml::DataModel.normalize_encoding(node)

  # Apply preprocessing to XML string before parsing
  xml_string = apply_preprocessing(node, preprocessing).strip

  # Use Canon::Xml::DataModel for parsing to get Canon::Xml::Node instances
  Canon::Xml::DataModel.from_xml(xml_string,
                                 preserve_whitespace: preserve_whitespace)
end
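
The order of the checks in parse determines which inputs are preprocessed: only strings are. The following sketch reproduces just that dispatch order with a hypothetical StubCanonNode class standing in for Canon::Xml::Node and a hypothetical parse_branch helper that names the branch instead of doing the work. It is an illustration under those assumptions, not the library's code.

```ruby
# Hypothetical stand-in for Canon::Xml::Node, used only to show the dispatch.
class StubCanonNode; end

# Hypothetical helper: returns a symbol naming the branch that the
# method above would take for the given input.
def parse_branch(input)
  # 1. Already a Canon node: handed back untouched.
  return :returned_as_is if input.is_a?(StubCanonNode)

  # 2. Any other non-String object: serialized and re-parsed,
  #    with no preprocessing applied.
  return :converted_from_node unless input.is_a?(String)

  # 3. Strings: encoding normalized, preprocessed, stripped, then parsed.
  :preprocessed_then_parsed
end

inputs   = [StubCanonNode.new, Object.new, "<root/>"]
branches = inputs.map { |input| parse_branch(input) }

p branches
```

One consequence of this ordering is that the preprocessing argument has no effect when the caller passes an already-parsed node of any kind.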