Module: Canon::Comparison
- Defined in:
- lib/canon/comparison.rb,
lib/canon/comparison/dimensions.rb,
lib/canon/comparison/xml_parser.rb,
lib/canon/comparison/html_parser.rb,
lib/canon/comparison/json_parser.rb,
lib/canon/comparison/match_options.rb,
lib/canon/comparison/xml_comparator.rb,
lib/canon/comparison/base_comparator.rb,
lib/canon/comparison/compare_profile.rb,
lib/canon/comparison/format_detector.rb,
lib/canon/comparison/html_comparator.rb,
lib/canon/comparison/json_comparator.rb,
lib/canon/comparison/yaml_comparator.rb,
lib/canon/comparison/comparison_result.rb,
lib/canon/comparison/markup_comparator.rb,
lib/canon/comparison/profile_definition.rb,
lib/canon/comparison/dimensions/registry.rb,
lib/canon/comparison/xml_node_comparison.rb,
lib/canon/comparison/html_compare_profile.rb,
lib/canon/comparison/ruby_object_comparator.rb,
lib/canon/comparison/whitespace_sensitivity.rb,
lib/canon/comparison/dimensions/base_dimension.rb,
lib/canon/comparison/match_options/xml_resolver.rb,
lib/canon/comparison/xml_comparator/node_parser.rb,
lib/canon/comparison/match_options/base_resolver.rb,
lib/canon/comparison/match_options/json_resolver.rb,
lib/canon/comparison/match_options/yaml_resolver.rb,
lib/canon/comparison/dimensions/comments_dimension.rb,
lib/canon/comparison/strategies/base_match_strategy.rb,
lib/canon/comparison/xml_comparator/attribute_filter.rb,
lib/canon/comparison/xml_comparator/child_comparison.rb,
lib/canon/comparison/xml_comparator/diff_node_builder.rb,
lib/canon/comparison/dimensions/text_content_dimension.rb,
lib/canon/comparison/strategies/match_strategy_factory.rb,
lib/canon/comparison/xml_comparator/attribute_comparator.rb,
lib/canon/comparison/xml_comparator/namespace_comparator.rb,
lib/canon/comparison/xml_comparator/node_type_comparator.rb,
lib/canon/comparison/dimensions/attribute_order_dimension.rb,
lib/canon/comparison/dimensions/attribute_values_dimension.rb,
lib/canon/comparison/dimensions/element_position_dimension.rb,
lib/canon/comparison/dimensions/attribute_presence_dimension.rb,
lib/canon/comparison/strategies/semantic_tree_match_strategy.rb,
lib/canon/comparison/dimensions/structural_whitespace_dimension.rb
Overview
Comparison module for XML, HTML, JSON, and YAML documents
This module provides a unified comparison API for multiple serialization formats. It auto-detects the format and delegates to specialized comparators while maintaining a CompareXML-compatible API.
Supported Formats
- XML: Uses Moxml for parsing, supports namespaces
- HTML: Uses Nokogiri, handles HTML4/HTML5 differences
- JSON: Direct Ruby object comparison with deep equality
- YAML: Parses to Ruby objects, compares semantically
Format Detection
The module automatically detects format from:
- Object type (Moxml::Node, Nokogiri::HTML::Document, Hash, Array)
- String content (DOCTYPE, opening tags, YAML/JSON syntax)
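The two detection steps above can be sketched roughly as follows. This is a minimal illustration, not Canon's FormatDetector: the helper name `guess_format`, the returned symbols, and the choice to treat a Hash or Array as JSON are all assumptions made for the example, and the Moxml/Nokogiri object checks are left out so the sketch stays dependency-free.

```ruby
# Hypothetical helper (not part of Canon): guess a format from the object
# type first, then from the leading characters of a string.
def guess_format(obj)
  # Assumption: already-parsed Ruby structures are treated as JSON-like data.
  return :json if obj.is_a?(Hash) || obj.is_a?(Array)
  return :unknown unless obj.is_a?(String)

  text = obj.lstrip
  # An HTML DOCTYPE or a root <html> tag marks the input as HTML.
  return :html if text.match?(/\A<!DOCTYPE\s+html/i) || text.match?(/\A<html[\s>]/i)
  # Any other leading "<" (declaration or opening tag) is treated as XML.
  return :xml if text.start_with?("<")
  # JSON documents start with an object or an array.
  return :json if text.start_with?("{", "[")

  # Everything else falls through to YAML in this sketch.
  :yaml
end

puts guess_format(%(<?xml version="1.0"?><a/>))
puts guess_format("name: canon\n")
```

A real detector would need more care, for example YAML flow sequences also start with "[", so the order of the checks is itself a design choice.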
Comparison Options
Common options across all formats:
- profile: Comparison profile (Symbol for preset, Hash for custom)
  - Presets: :strict, :rendered, :html4, :html5, :spec_friendly, :content_only
  - Custom: { text_content: :normalize, comments: :ignore, … }
- diff_algorithm: Algorithm to use (:dom or :semantic, default: :dom)
- verbose: Return detailed diff array (default: false)
Usage Examples
# XML comparison with default profile
Canon::Comparison.equivalent?(xml1, xml2)
# XML comparison with preset profile
Canon::Comparison.equivalent?(xml1, xml2, profile: :strict)
Canon::Comparison.equivalent?(xml1, xml2, profile: :spec_friendly)
# HTML comparison with custom inline profile
Canon::Comparison.equivalent?(html1, html2,
  profile: { text_content: :normalize, comments: :ignore })
# Define and use a custom profile
Canon::Comparison.define_profile(:my_custom) do
  text_content :normalize
  comments :ignore
  preprocessing :rendered
end
Canon::Comparison.equivalent?(doc1, doc2, profile: :my_custom)
# JSON comparison with semantic tree diff
Canon::Comparison.equivalent?(json1, json2,
  diff_algorithm: :semantic, profile: :spec_friendly)
# With detailed output
diffs = Canon::Comparison.equivalent?(doc1, doc2, verbose: true)
diffs.each { |diff| puts diff.inspect }
XML Declaration Handling
XML declarations (`<?xml version="1.0" encoding="UTF-8"?>`) are stripped during preprocessing for semantic comparison. This means:
- Documents with and without declarations are considered equivalent
- Declaration encoding differences are ignored
- Entity declarations within DTD are resolved before comparison
This behavior ensures documents are compared by their content, not their serialization format.
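As a rough illustration of the first two points, the sketch below strips a leading XML declaration with a regular expression before comparing strings. It is an assumption about how such a preprocessing step could look, not Canon's implementation; the constant `XML_DECLARATION` and the helper `strip_xml_declaration` are hypothetical names, and DTD entity resolution is not covered.

```ruby
# Hypothetical preprocessing step (not Canon's code): remove a leading
# XML declaration so only the document content takes part in comparison.
XML_DECLARATION = /\A\s*<\?xml\b[^?]*\?>\s*/

def strip_xml_declaration(xml)
  xml.sub(XML_DECLARATION, "")
end

with_decl      = %(<?xml version="1.0" encoding="UTF-8"?>\n<root><item/></root>)
other_encoding = %(<?xml version="1.0" encoding="ISO-8859-1"?><root><item/></root>)
without_decl   = "<root><item/></root>"

# All three inputs reduce to the same content once the declaration is gone.
puts strip_xml_declaration(with_decl) == without_decl
puts strip_xml_declaration(other_encoding) == without_decl
```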
Return Values
- When verbose: false (default) → Boolean (true if equivalent)
- When verbose: true → Array of difference hashes with details
Difference Hash Format
Each difference contains:
- node1, node2: The nodes being compared (XML/HTML)
- diff1, diff2: Comparison result codes
OR for JSON/YAML:
- path: String path to the difference (e.g., "user.address.city")
- value1, value2: The differing values
- diff_code: Type of difference
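To make the JSON/YAML shape concrete, here is a minimal sketch of a recursive walk that emits hashes with `path`, `value1`, `value2`, and `diff_code`. It is not Canon's JsonComparator: the helper `collect_diffs` is hypothetical, the array index notation in paths is an assumption, and the numeric codes are copied from the constant summary on this page.

```ruby
require "json"

# Codes as read from this page's constant summary.
MISSING_HASH_KEY      = 10
UNEQUAL_ARRAY_LENGTHS = 13
UNEQUAL_TYPES         = 15
UNEQUAL_PRIMITIVES    = 16

# Hypothetical helper: walk two parsed documents and record each difference.
def collect_diffs(a, b, path = "", diffs = [])
  if a.is_a?(Hash) && b.is_a?(Hash)
    (a.keys | b.keys).each do |key|
      child = path.empty? ? key.to_s : "#{path}.#{key}"
      if a.key?(key) && b.key?(key)
        collect_diffs(a[key], b[key], child, diffs)
      else
        diffs << { path: child, value1: a[key], value2: b[key], diff_code: MISSING_HASH_KEY }
      end
    end
  elsif a.is_a?(Array) && b.is_a?(Array)
    if a.length != b.length
      diffs << { path: path, value1: a, value2: b, diff_code: UNEQUAL_ARRAY_LENGTHS }
    else
      a.each_index { |i| collect_diffs(a[i], b[i], "#{path}[#{i}]", diffs) }
    end
  elsif a.class != b.class
    diffs << { path: path, value1: a, value2: b, diff_code: UNEQUAL_TYPES }
  elsif a != b
    diffs << { path: path, value1: a, value2: b, diff_code: UNEQUAL_PRIMITIVES }
  end
  diffs
end

doc1 = JSON.parse('{"user": {"address": {"city": "Oslo"}}}')
doc2 = JSON.parse('{"user": {"address": {"city": "Bergen"}}}')
collect_diffs(doc1, doc2).each { |diff| puts diff.inspect }
```

The single difference reported for these inputs has the path "user.address.city", matching the example path given above.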
Defined Under Namespace
Modules: BaseComparator, Dimensions, MatchOptions, RubyObjectComparator, Strategies, WhitespaceSensitivity, XmlComparatorHelpers, XmlNodeComparison
Classes: CompareProfile, ComparisonResult, DiffNodeBuilder, FormatDetector, HtmlComparator, HtmlCompareProfile, HtmlParser, JsonComparator, JsonParser, MarkupComparator, ProfileDefinition, ProfileError, ResolvedMatchOptions, XmlComparator, XmlParser, YamlComparator
Constant Summary
Comparison result constants
- EQUIVALENT = 1
- MISSING_ATTRIBUTE = 2
- MISSING_NODE = 3
- UNEQUAL_ATTRIBUTES = 4
- UNEQUAL_COMMENTS = 5
- UNEQUAL_DOCUMENTS = 6
- UNEQUAL_ELEMENTS = 7
- UNEQUAL_NODES_TYPES = 8
- UNEQUAL_TEXT_CONTENTS = 9
- MISSING_HASH_KEY = 10
- UNEQUAL_HASH_VALUES = 11
- UNEQUAL_HASH_KEY_ORDER = 12
- UNEQUAL_ARRAY_LENGTHS = 13
- UNEQUAL_ARRAY_ELEMENTS = 14
- UNEQUAL_TYPES = 15
- UNEQUAL_PRIMITIVES = 16
Class Method Summary
- .available_profiles ⇒ Array<Symbol>
  List all available profiles (custom + presets).
- .define_profile(name) {|ProfileDefinition| ... } ⇒ Symbol
  Define a custom comparison profile with DSL syntax.
- .equivalent?(obj1, obj2, opts = {}) ⇒ Boolean, Array
  Auto-detect format and compare two objects.
- .load_profile(name) ⇒ Hash
  Load a profile (custom or preset).
Class Method Details
.available_profiles ⇒ Array<Symbol>
List all available profiles (custom + presets)
# File 'lib/canon/comparison.rb', line 191
def available_profiles
  custom = @custom_profiles&.keys || []
  presets = MatchOptions::Xml::MATCH_PROFILES.keys
  (custom + presets).sort.uniq
end
.define_profile(name) {|ProfileDefinition| ... } ⇒ Symbol
Define a custom comparison profile with DSL syntax
# File 'lib/canon/comparison.rb', line 160
def define_profile(name, &block)
  definition = ProfileDefinition.define(name, &block)
  @custom_profiles ||= {}
  @custom_profiles[name] = definition
  name
end
.equivalent?(obj1, obj2, opts = {}) ⇒ Boolean, Array
Auto-detect format and compare two objects
# File 'lib/canon/comparison.rb', line 136
def equivalent?(obj1, obj2, opts = {})
  # Check if semantic tree diff is requested
  # Support both :semantic and :semantic_tree for backward compatibility
  if %i[semantic semantic_tree].include?(opts[:diff_algorithm])
    return semantic_diff(obj1, obj2, opts)
  end

  # Otherwise use DOM-based comparison (default)
  dom_diff(obj1, obj2, opts)
end
.load_profile(name) ⇒ Hash
Load a profile (custom or preset)
# File 'lib/canon/comparison.rb', line 173
def load_profile(name)
  # Check custom profiles first
  if @custom_profiles&.key?(name)
    return @custom_profiles[name].dup
  end

  # Fall back to presets - try Xml first (most common)
  begin
    MatchOptions::Xml.(name)
  rescue Error
    # Try other formats
    MatchOptions::Json.(name)
  end
end