Class: Canon::Comparison::XmlComparatorHelpers::NodeParser

Inherits:
Object
Defined in:
lib/canon/comparison/xml_comparator/node_parser.rb

Overview

Node parser with preprocessing support. Handles conversion of strings and various node types to Canon::Xml::Node.

Class Method Details

.apply_preprocessing(xml_string, preprocessing) ⇒ String

Apply preprocessing transformation to XML string

Parameters:

  • xml_string (String)

    XML string to preprocess

  • preprocessing (Symbol)

    Preprocessing mode

Returns:

  • (String)

    Preprocessed XML string



# File 'lib/canon/comparison/xml_comparator/node_parser.rb', line 55

def self.apply_preprocessing(xml_string, preprocessing)
  case preprocessing
  when :normalize
    # Normalize whitespace: strip each line and drop blank lines
    xml_string.lines.map(&:strip).reject(&:empty?).join("\n")
  when :c14n
    # Canonicalize the XML
    Canon::Xml::C14n.canonicalize(xml_string, with_comments: false)
  when :format
    # Pretty format the XML
    Canon.format(xml_string, :xml)
  else
    # :none or unrecognized - use as-is
    xml_string
  end
end
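
To make the effect of the :normalize mode concrete, here is a minimal sketch that applies the same line-based expression to a small document using only core Ruby. The helper name normalize_lines is hypothetical and is not part of Canon; it only mirrors the one-line expression in the listing above.

```ruby
# Hypothetical stand-in for the :normalize branch; not part of Canon's API.
def normalize_lines(xml_string)
  # Strip leading/trailing whitespace per line, then drop lines that are empty.
  xml_string.lines.map(&:strip).reject(&:empty?).join("\n")
end

input = "<root>\n    <a>  x   y  </a>\n\n  <b/>\n</root>\n"
puts normalize_lines(input)
# Prints:
# <root>
# <a>  x   y  </a>
# <b/>
# </root>
```

Note that in this sketch whitespace inside a line (between x and y) is left untouched; only indentation, trailing whitespace, and blank lines are removed.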

.convert_from_node(node, preserve_whitespace: false, parser: nil) ⇒ Canon::Xml::Node

Convert from Nokogiri/Moxml node to Canon::Xml::Node

Parameters:

  • node (Object)

    Nokogiri or Moxml node

  • preserve_whitespace (Boolean) (defaults to: false)

    Whether to preserve whitespace-only text nodes

  • parser (Symbol, nil) (defaults to: nil)

    Parser backend override

Returns:

  • (Canon::Xml::Node)

# File 'lib/canon/comparison/xml_comparator/node_parser.rb', line 78

def self.convert_from_node(node, preserve_whitespace: false,
                           parser: nil)
  # FAST PATH: Convert Nokogiri nodes directly without string round-trip
  if defined?(Nokogiri::XML::Node) && node.is_a?(Nokogiri::XML::Node)
    return Canon::Xml::DataModel.build_from_nokogiri(
      node, preserve_whitespace: preserve_whitespace
    )
  end

  # SLOW PATH: Fall back to string serialization for all other node types
  xml_str = if node.respond_to?(:to_xml)
              node.to_xml
            elsif node.respond_to?(:to_s)
              node.to_s
            else
              raise Canon::Error,
                    "Unable to convert node to string: #{node.class}"
            end

  resolved_parser = parser || resolve_parser_config

  if resolved_parser == :sax
    require_relative "../../xml/sax_builder"
    Canon::Xml::SaxBuilder.parse(xml_str,
                                 preserve_whitespace: preserve_whitespace)
  else
    Canon::Xml::DataModel.from_xml(xml_str,
                                   preserve_whitespace: preserve_whitespace)
  end
end
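
The serialization choice on the slow path can be tried in isolation. The following is a sketch under stated assumptions: serialize_node, XmlLike, and Opaque are hypothetical names defined only for this illustration, and ArgumentError stands in for Canon::Error so the example runs without the gem.

```ruby
# Hypothetical helper mirroring the slow-path serialization choice.
def serialize_node(node)
  if node.respond_to?(:to_xml)
    node.to_xml
  elsif node.respond_to?(:to_s)
    node.to_s
  else
    # ArgumentError is a stand-in for Canon::Error in this sketch.
    raise ArgumentError, "Unable to convert node to string: #{node.class}"
  end
end

# Stand-in node type that offers to_xml, as Nokogiri and Moxml nodes do.
XmlLike = Struct.new(:source) do
  def to_xml
    source
  end
end

# Stand-in type with to_s removed, to reach the error branch.
class Opaque
  undef_method :to_s
end

puts serialize_node(XmlLike.new("<a/>")) # prints "<a/>" (to_xml wins over to_s)
puts serialize_node(42)                  # prints "42" (falls through to to_s)

begin
  serialize_node(Opaque.new)
rescue ArgumentError => e
  puts e.message # prints "Unable to convert node to string: Opaque"
end
```

Because nearly every Ruby object responds to to_s, the error branch in this sketch is only reached by objects that have explicitly removed it.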

.parse(node, preprocessing = :none, preserve_whitespace: false, parser: nil) ⇒ Canon::Xml::Node

Parse a node from a string, or return it as-is if it is already a Canon::Xml::Node. Applies the preprocessing transformation before parsing, if specified.

Parameters:

  • node (String, Object)

    Node to parse

  • preprocessing (Symbol) (defaults to: :none)

    Preprocessing mode (:none, :normalize, :c14n, :format)

  • preserve_whitespace (Boolean) (defaults to: false)

    Whether to preserve whitespace-only text nodes

  • parser (Symbol, nil) (defaults to: nil)

    Parser backend (:sax or :dom, default from config)

Returns:

  • (Canon::Xml::Node)

# File 'lib/canon/comparison/xml_comparator/node_parser.rb', line 19

def self.parse(node, preprocessing = :none, preserve_whitespace: false,
               parser: nil)
  # If already a Canon::Xml::Node, return as-is
  return node if node.is_a?(Canon::Xml::Node)

  # If it's a Nokogiri or Moxml node, convert to DataModel
  unless node.is_a?(String)
    return convert_from_node(node,
                             preserve_whitespace: preserve_whitespace,
                             parser: parser)
  end

  # Normalize encoding before preprocessing (UTF-16 strings can't use strip, etc.)
  node = Canon::Xml::DataModel.normalize_encoding(node)

  # Apply preprocessing to XML string before parsing
  xml_string = apply_preprocessing(node, preprocessing).strip

  # Select parser backend
  resolved_parser = parser || resolve_parser_config

  if resolved_parser == :sax
    require_relative "../../xml/sax_builder"
    Canon::Xml::SaxBuilder.parse(xml_string,
                                 preserve_whitespace: preserve_whitespace)
  else
    Canon::Xml::DataModel.from_xml(xml_string,
                                   preserve_whitespace: preserve_whitespace)
  end
end
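
The backend selection at the end of the listing above follows a simple precedence: an explicit parser argument wins, otherwise the configured default is used, and only the value :sax selects the SAX builder. A minimal sketch of that precedence, assuming a hypothetical helper choose_backend whose config_default argument stands in for the result of resolve_parser_config:

```ruby
# Hypothetical helper; the returned symbols only label the branch taken.
def choose_backend(parser, config_default)
  resolved = parser || config_default
  resolved == :sax ? :sax_builder : :data_model
end

puts choose_backend(nil, :sax)   # prints "sax_builder" (config default applies)
puts choose_backend(:dom, :sax)  # prints "data_model" (explicit argument wins)
puts choose_backend(:typo, :sax) # prints "data_model" (anything but :sax takes the else branch)
```

In this sketch, as in the listing, an unrecognized symbol is not rejected; it takes the same branch as :dom.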

.resolve_parser_config ⇒ Symbol

Resolve the parser backend from the global config, falling back to :sax if the lookup raises

Returns:

  • (Symbol)

    :sax or :dom



# File 'lib/canon/comparison/xml_comparator/node_parser.rb', line 112

def self.resolve_parser_config
  Canon::Config.instance.xml.diff.parser
rescue StandardError
  :sax
end
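
The method-level rescue above can be illustrated without loading Canon. In the sketch below, DiffConfig, XmlConfig, AppConfig, and resolve_parser are hypothetical stand-ins for the real configuration chain; only the rescue-with-default pattern is taken from the listing.

```ruby
# Hypothetical config objects; Canon::Config is not loaded here.
DiffConfig = Struct.new(:parser)
XmlConfig  = Struct.new(:diff)
AppConfig  = Struct.new(:xml)

def resolve_parser(config)
  config.xml.diff.parser
rescue StandardError
  # Any StandardError raised while walking the chain yields the default.
  :sax
end

configured = AppConfig.new(XmlConfig.new(DiffConfig.new(:dom)))
puts resolve_parser(configured) # prints "dom"
puts resolve_parser(nil)        # prints "sax" (NoMethodError on nil is rescued)
```

One consequence of this pattern, at least in the sketch: a chain that resolves successfully to nil returns nil rather than :sax, since no exception is raised.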