Module: Canon::Comparison

Defined in:
lib/canon/comparison.rb,
lib/canon/comparison/dimensions.rb,
lib/canon/comparison/xml_parser.rb,
lib/canon/comparison/html_parser.rb,
lib/canon/comparison/json_parser.rb,
lib/canon/comparison/match_options.rb,
lib/canon/comparison/xml_comparator.rb,
lib/canon/comparison/base_comparator.rb,
lib/canon/comparison/compare_profile.rb,
lib/canon/comparison/format_detector.rb,
lib/canon/comparison/html_comparator.rb,
lib/canon/comparison/json_comparator.rb,
lib/canon/comparison/yaml_comparator.rb,
lib/canon/comparison/comparison_result.rb,
lib/canon/comparison/markup_comparator.rb,
lib/canon/comparison/profile_definition.rb,
lib/canon/comparison/dimensions/registry.rb,
lib/canon/comparison/xml_node_comparison.rb,
lib/canon/comparison/html_compare_profile.rb,
lib/canon/comparison/ruby_object_comparator.rb,
lib/canon/comparison/whitespace_sensitivity.rb,
lib/canon/comparison/dimensions/base_dimension.rb,
lib/canon/comparison/match_options/xml_resolver.rb,
lib/canon/comparison/xml_comparator/node_parser.rb,
lib/canon/comparison/match_options/base_resolver.rb,
lib/canon/comparison/match_options/json_resolver.rb,
lib/canon/comparison/match_options/yaml_resolver.rb,
lib/canon/comparison/dimensions/comments_dimension.rb,
lib/canon/comparison/strategies/base_match_strategy.rb,
lib/canon/comparison/xml_comparator/attribute_filter.rb,
lib/canon/comparison/xml_comparator/child_comparison.rb,
lib/canon/comparison/xml_comparator/diff_node_builder.rb,
lib/canon/comparison/dimensions/text_content_dimension.rb,
lib/canon/comparison/strategies/match_strategy_factory.rb,
lib/canon/comparison/xml_comparator/attribute_comparator.rb,
lib/canon/comparison/xml_comparator/namespace_comparator.rb,
lib/canon/comparison/xml_comparator/node_type_comparator.rb,
lib/canon/comparison/dimensions/attribute_order_dimension.rb,
lib/canon/comparison/dimensions/attribute_values_dimension.rb,
lib/canon/comparison/dimensions/element_position_dimension.rb,
lib/canon/comparison/dimensions/attribute_presence_dimension.rb,
lib/canon/comparison/strategies/semantic_tree_match_strategy.rb,
lib/canon/comparison/dimensions/structural_whitespace_dimension.rb

Overview

Comparison module for XML, HTML, JSON, and YAML documents

This module provides a unified comparison API for multiple serialization formats. It auto-detects the format and delegates to specialized comparators while maintaining a CompareXML-compatible API.

Supported Formats

  • XML: Uses Moxml for parsing, supports namespaces

  • HTML: Uses Nokogiri, handles HTML4/HTML5 differences

  • JSON: Direct Ruby object comparison with deep equality

  • YAML: Parses to Ruby objects, compares semantically

Format Detection

The module automatically detects format from:

  • Object type (Moxml::Node, Nokogiri::HTML::Document, Hash, Array)

  • String content (DOCTYPE, opening tags, YAML/JSON syntax)
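The string-content cues above can be illustrated with a rough sketch. `guess_format` is a hypothetical helper, not Canon's FormatDetector, and the rule that anything unparseable as JSON counts as YAML is an assumption made only for this example.

```ruby
require "json"

# Hypothetical helper (not part of Canon): guesses a format from a raw String
# using the kinds of cues listed above (DOCTYPE, opening tags, JSON/YAML syntax).
def guess_format(input)
  text = input.to_s.lstrip

  # An HTML DOCTYPE or a leading <html> tag suggests HTML.
  return :html if text.match?(/\A<!DOCTYPE\s+html/i) || text.match?(/\A<html[\s>]/i)

  # Any other leading angle bracket is treated as XML.
  return :xml if text.start_with?("<")

  begin
    JSON.parse(text)
    :json
  rescue JSON::ParserError
    :yaml # assumption: whatever is not valid JSON is treated as YAML
  end
end

puts guess_format('<?xml version="1.0"?><root/>')
puts guess_format('{"a": 1}')
```

When detection could be ambiguous, the `:format` option documented for `equivalent?` gives the caller a way to state the format explicitly.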

Comparison Options

Common options across all formats:

  • profile: Comparison profile (Symbol for preset, Hash for custom)

    • Presets: :strict, :rendered, :html4, :html5, :spec_friendly, :content_only

    • Custom: { text_content: :normalize, comments: :ignore, … }

  • diff_algorithm: Algorithm to use (:dom or :semantic, default: :dom)

  • verbose: Return detailed diff array (default: false)

Usage Examples

# XML comparison with default profile
Canon::Comparison.equivalent?(xml1, xml2)

# XML comparison with preset profile
Canon::Comparison.equivalent?(xml1, xml2, profile: :strict)
Canon::Comparison.equivalent?(xml1, xml2, profile: :spec_friendly)

# HTML comparison with custom inline profile
Canon::Comparison.equivalent?(html1, html2,
  profile: { text_content: :normalize, comments: :ignore })

# Define and use a custom profile
Canon::Comparison.define_profile(:my_custom) do
  text_content :normalize
  comments :ignore
  preprocessing :rendered
end
Canon::Comparison.equivalent?(doc1, doc2, profile: :my_custom)

# JSON comparison with semantic tree diff
Canon::Comparison.equivalent?(json1, json2,
  diff_algorithm: :semantic, profile: :spec_friendly)

# With detailed output
diffs = Canon::Comparison.equivalent?(doc1, doc2, verbose: true)
diffs.each { |diff| puts diff.inspect }

XML Declaration Handling

XML declarations (`<?xml version="1.0" encoding="UTF-8"?>`) are stripped during preprocessing for semantic comparison. This means:

  • Documents with and without declarations are considered equivalent

  • Declaration encoding differences are ignored

  • Entity declarations within DTD are resolved before comparison

This behavior ensures documents are compared by their content, not their serialization format.
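As a minimal sketch of the idea, the following strips a leading declaration with a regular expression before comparing two strings. `strip_xml_declaration` and `XML_DECLARATION` are hypothetical names; Canon's actual preprocessing is not shown in this page and may work differently.

```ruby
# Hypothetical helper, not Canon's preprocessing code.
# Matches one leading XML declaration plus surrounding whitespace.
XML_DECLARATION = /\A\s*<\?xml\b[^>]*\?>\s*/

def strip_xml_declaration(xml)
  xml.sub(XML_DECLARATION, "")
end

with_decl    = %(<?xml version="1.0" encoding="UTF-8"?>\n<root><a>1</a></root>)
without_decl = "<root><a>1</a></root>"

# Both inputs reduce to the same text once the declaration is removed.
puts strip_xml_declaration(with_decl) == strip_xml_declaration(without_decl)
```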

Return Values

  • When verbose: false (default) → Boolean (true if equivalent)

  • When verbose: true → Array of difference hashes with details

Difference Hash Format

Each difference contains, for XML/HTML:

  • node1, node2: The nodes being compared

  • diff1, diff2: Comparison result codes

or, for JSON/YAML:

  • path: String path to the difference (e.g., "user.address.city")

  • value1, value2: The differing values

  • diff_code: Type of difference
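A short sketch of consuming such a hash follows. The sample data is made up, `describe_diff` is a hypothetical helper, and the use of Symbol keys is an assumption, since this page does not say whether the keys are Symbols or Strings.

```ruby
# Hypothetical sample shaped like the JSON/YAML difference hash described above.
# Assumption: keys are Symbols.
sample_diff = {
  path: "user.address.city",
  value1: "Paris",
  value2: "Lyon",
  diff_code: 11 # UNEQUAL_HASH_VALUES in the constant list
}

# Hypothetical helper: renders either hash shape as a one-line message.
def describe_diff(diff)
  if diff.key?(:path)
    # JSON/YAML shape: path plus the two differing values.
    "#{diff[:path]}: #{diff[:value1].inspect} != #{diff[:value2].inspect} (code #{diff[:diff_code]})"
  else
    # XML/HTML shape: only the result codes are printed here.
    "nodes differ (codes #{diff[:diff1]}/#{diff[:diff2]})"
  end
end

puts describe_diff(sample_diff)
```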

Defined Under Namespace

Modules: BaseComparator, Dimensions, MatchOptions, RubyObjectComparator, Strategies, WhitespaceSensitivity, XmlComparatorHelpers, XmlNodeComparison

Classes: CompareProfile, ComparisonResult, DiffNodeBuilder, FormatDetector, HtmlComparator, HtmlCompareProfile, HtmlParser, JsonComparator, JsonParser, MarkupComparator, ProfileDefinition, ProfileError, ResolvedMatchOptions, XmlComparator, XmlParser, YamlComparator

Constant Summary

Comparison result constants

EQUIVALENT = 1
MISSING_ATTRIBUTE = 2
MISSING_NODE = 3
UNEQUAL_ATTRIBUTES = 4
UNEQUAL_COMMENTS = 5
UNEQUAL_DOCUMENTS = 6
UNEQUAL_ELEMENTS = 7
UNEQUAL_NODES_TYPES = 8
UNEQUAL_TEXT_CONTENTS = 9
MISSING_HASH_KEY = 10
UNEQUAL_HASH_VALUES = 11
UNEQUAL_HASH_KEY_ORDER = 12
UNEQUAL_ARRAY_LENGTHS = 13
UNEQUAL_ARRAY_ELEMENTS = 14
UNEQUAL_TYPES = 15
UNEQUAL_PRIMITIVES = 16
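Because the result codes are plain Integers, a reverse lookup can make diff output easier to read. The sketch below uses a stand-in module, `ResultCodes`, holding a few of the values listed here so that it runs without the gem; `CODE_NAMES` and `code_name` are hypothetical names.

```ruby
# Stand-in module so the sketch runs without the canon gem; the values are
# copied from the constant list in this section.
module ResultCodes
  EQUIVALENT = 1
  MISSING_ATTRIBUTE = 2
  UNEQUAL_TEXT_CONTENTS = 9
  UNEQUAL_HASH_VALUES = 11
end

# Reverse lookup table: Integer code => constant name (Symbol).
CODE_NAMES = ResultCodes.constants.to_h { |name| [ResultCodes.const_get(name), name] }

# Hypothetical helper: returns :UNKNOWN for codes missing from the table.
def code_name(code)
  CODE_NAMES.fetch(code, :UNKNOWN)
end

puts code_name(11)
```

The same table could presumably be built from `Canon::Comparison.constants`, filtered to Integer values so that nested classes and modules are skipped.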

Class Method Summary

Class Method Details

.available_profiles ⇒ Array<Symbol>

List all available profiles (custom + presets)

Returns:

  • (Array<Symbol>)

    Available profile names



# File 'lib/canon/comparison.rb', line 191

def available_profiles
  custom = @custom_profiles&.keys || []
  presets = MatchOptions::Xml::MATCH_PROFILES.keys
  (custom + presets).sort.uniq
end

.define_profile(name) {|ProfileDefinition| ... } ⇒ Symbol

Define a custom comparison profile with DSL syntax

Examples:

Define a custom profile

Canon::Comparison.define_profile(:my_custom) do
  text_content :normalize
  comments :ignore
  preprocessing :rendered
end

Parameters:

  • name (Symbol)

    Profile name

Yields:

  • (ProfileDefinition)

Returns:

  • (Symbol)

    Profile name

Raises:



# File 'lib/canon/comparison.rb', line 160

def define_profile(name, &block)
  definition = ProfileDefinition.define(name, &block)

  @custom_profiles ||= {}
  @custom_profiles[name] = definition

  name
end
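One common way to support a block-based DSL like this is to evaluate the block against a builder object. The sketch below uses a stand-in class, `SketchProfileDefinition`, which is hypothetical; whether Canon's ProfileDefinition uses `instance_eval`, and which settings it accepts, is an assumption rather than something this page confirms.

```ruby
# Stand-in for illustration only; Canon's ProfileDefinition may differ.
class SketchProfileDefinition
  attr_reader :settings

  # Evaluates the block with the new definition as self and returns the
  # collected settings as a Hash.
  def self.define(name, &block)
    definition = new(name)
    definition.instance_eval(&block) if block
    definition.settings
  end

  def initialize(name)
    @name = name
    @settings = {}
  end

  # Each DSL word records one setting, e.g. `comments :ignore`.
  %i[text_content comments preprocessing].each do |dimension|
    define_method(dimension) { |behavior| @settings[dimension] = behavior }
  end
end

profile = SketchProfileDefinition.define(:my_custom) do
  text_content :normalize
  comments :ignore
  preprocessing :rendered
end

p profile
```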

.equivalent?(obj1, obj2, opts = {}) ⇒ Boolean, Array

Auto-detect format and compare two objects

Parameters:

  • obj1 (Object)

    First object to compare

  • obj2 (Object)

    Second object to compare

  • opts (Hash) (defaults to: {})

    Comparison options

    • :profile - Profile to use (Symbol for preset, Hash for custom)

    • :format - Format hint (:xml, :html, :html4, :html5, :json, :yaml, :string)

    • :diff_algorithm - Algorithm to use (:dom or :semantic; :semantic_tree is also accepted for backward compatibility; default: :dom)

    • :verbose - Return detailed diff array (default: false)

Returns:

  • (Boolean, Array)

    true if equivalent, or array of diffs if verbose



# File 'lib/canon/comparison.rb', line 136

def equivalent?(obj1, obj2, opts = {})
  # Check if semantic tree diff is requested
  # Support both :semantic and :semantic_tree for backward compatibility
  if %i[semantic semantic_tree].include?(opts[:diff_algorithm])
    return semantic_diff(obj1, obj2, opts)
  end

  # Otherwise use DOM-based comparison (default)
  dom_diff(obj1, obj2, opts)
end

.load_profile(name) ⇒ Hash

Load a profile (custom or preset)

Parameters:

  • name (Symbol)

    Profile name

Returns:

  • (Hash)

    Profile settings



# File 'lib/canon/comparison.rb', line 173

def load_profile(name)
  # Check custom profiles first
  if @custom_profiles&.key?(name)
    return @custom_profiles[name].dup
  end

  # Fall back to presets - try Xml first (most common)
  begin
    MatchOptions::Xml.get_profile_options(name)
  rescue Error
    # Try other formats
    MatchOptions::Json.get_profile_options(name)
  end
end
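The lookup order in the listing above (custom profiles first, then XML presets, then JSON presets) can be sketched with plain Hashes. Every name and setting below is a hypothetical stand-in (`CUSTOM`, `XML_PRESETS`, `JSON_PRESETS`, `UnknownProfile`, `fetch_preset`, `lookup_profile`); none of them is Canon's real data or API.

```ruby
# Stand-in data for illustration; profile names and settings are hypothetical.
class UnknownProfile < StandardError; end

CUSTOM       = { my_custom: { comments: :ignore } }
XML_PRESETS  = { strict: { comments: :strict } }
JSON_PRESETS = { json_only: { key_order: :ignore } }

# Raises UnknownProfile when the table has no entry for the name.
def fetch_preset(table, name)
  table.fetch(name) { raise UnknownProfile, "unknown profile: #{name}" }
end

def lookup_profile(name)
  # Custom profiles win; dup returns a shallow copy so callers cannot
  # replace keys in the stored profile.
  return CUSTOM[name].dup if CUSTOM.key?(name)

  begin
    fetch_preset(XML_PRESETS, name)
  rescue UnknownProfile
    # Last resort: an unknown name raises from here to the caller.
    fetch_preset(JSON_PRESETS, name)
  end
end

p lookup_profile(:my_custom)
p lookup_profile(:json_only)
```

In this arrangement, an error for an unknown name always comes from the final fallback, so its message describes the JSON lookup rather than the XML one.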