Class: Canon::Comparison::Strategies::SemanticTreeMatchStrategy

Inherits:
BaseMatchStrategy
Defined in:
lib/canon/comparison/strategies/semantic_tree_match_strategy.rb

Overview

Semantic tree matching strategy

Uses TreeDiffIntegrator for intelligent structure-aware matching. This strategy:

  1. Converts documents to tree representation

  2. Performs semantic matching via TreeDiffIntegrator

  3. Converts Operations to DiffNodes via OperationConverter

  4. Returns DiffNodes that flow through the standard rendering pipeline

Key difference from DOM matching: nodes are matched by tree-based structural similarity and edit distance rather than by simple node-by-node comparison.
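
The contrast with node-by-node comparison can be illustrated with a toy sketch. The helpers below (positional_mismatches, similarity, similarity_match) are hypothetical and are not part of Canon; the equal weighting of name and text and the greedy pairing are assumptions made for illustration, not the algorithm TreeDiffIntegrator actually uses.

```ruby
# Toy illustration only: these helpers are hypothetical, not Canon's API.

# Node-by-node comparison: pair children by position and keep every mismatch.
def positional_mismatches(left, right)
  left.zip(right).reject { |a, b| a == b }
end

# Assumed similarity score: element name and text content count equally.
def similarity(a, b)
  score = 0.0
  score += 0.5 if a[:name] == b[:name]
  score += 0.5 if a[:text] == b[:text]
  score
end

# Similarity-based matching: greedily pair each left node with the most
# similar right node that has not been used yet.
def similarity_match(left, right, threshold: 0.5)
  remaining = right.dup
  left.filter_map do |node|
    best = remaining.max_by { |candidate| similarity(node, candidate) }
    next if best.nil? || similarity(node, best) < threshold

    remaining.delete_at(remaining.index(best))
    [node, best]
  end
end

old_children = [{ name: "title", text: "Intro" }, { name: "para", text: "Hello" }]
new_children = [{ name: "para", text: "Hello" }, { name: "title", text: "Intro" }]

# Reordered siblings look like two changes when compared by position...
mismatches = positional_mismatches(old_children, new_children)

# ...but pair up cleanly when matched by similarity.
pairs = similarity_match(old_children, new_children)
```

In this toy input the positional comparison flags both positions, while the similarity-based pairing matches title with title and para with para.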

Examples:

Use semantic tree matching

strategy = SemanticTreeMatchStrategy.new(:xml, match_options)
diff_nodes = strategy.match(doc1, doc2)

Instance Attribute Summary

Attributes inherited from BaseMatchStrategy

#format, #match_options

Instance Method Summary

Methods inherited from BaseMatchStrategy

#algorithm_name, #initialize

Constructor Details

This class inherits a constructor from Canon::Comparison::Strategies::BaseMatchStrategy

Instance Method Details

#match(doc1, doc2) ⇒ Array<Canon::Diff::DiffNode>

Perform semantic tree matching

Parameters:

  • doc1 (Object)

    First document (Nokogiri node, Hash, etc.)

  • doc2 (Object)

    Second document

Returns:

  • (Array<Canon::Diff::DiffNode>)

    DiffNodes converted from the tree diff operations

# File 'lib/canon/comparison/strategies/semantic_tree_match_strategy.rb', line 34

def match(doc1, doc2)
  # Create integrator with format-specific adapter
  integrator = create_integrator

  # Perform tree diff - returns Operations
  result = integrator.diff(doc1, doc2)

  # Store statistics for metadata
  @statistics = result[:statistics]

  # Convert Operations to DiffNodes using OperationConverter
  # This is the KEY FIX - ensures we use proper DiffNodes
  convert_operations_to_diff_nodes(result[:operations])
end
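
The flow of #match can be sketched with stand-in objects. Everything below is hypothetical: StubIntegrator, SketchStrategy, and the Operation and DiffNode structs are not Canon's classes. The only parts taken from the listing above are the :operations and :statistics keys of the result and the order of the steps.

```ruby
# Hypothetical stand-ins; none of these are Canon's real classes.
Operation = Struct.new(:type, :path)
DiffNode  = Struct.new(:kind, :location)

class StubIntegrator
  # Returns the shape that #match reads: :operations and :statistics keys.
  def diff(_doc1, _doc2)
    ops = [Operation.new(:insert, "/root/a"), Operation.new(:delete, "/root/b")]
    { operations: ops, statistics: { total: ops.size } }
  end
end

class SketchStrategy
  attr_reader :statistics

  def initialize(integrator)
    @integrator = integrator
  end

  def match(doc1, doc2)
    result = @integrator.diff(doc1, doc2)   # tree diff yields operations
    @statistics = result[:statistics]       # kept for later metadata lookup
    # Stand-in for the Operation-to-DiffNode conversion step
    result[:operations].map { |op| DiffNode.new(op.type, op.path) }
  end
end

strategy = SketchStrategy.new(StubIntegrator.new)
nodes = strategy.match("<a/>", "<b/>")
```

The point of the sketch is that callers only ever see DiffNode-like objects; the intermediate operations stay internal to the strategy.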

#metadata ⇒ Hash

Include tree diff statistics in metadata

Returns:

  • (Hash)

    Metadata including statistics



# File 'lib/canon/comparison/strategies/semantic_tree_match_strategy.rb', line 75

def metadata
  {
    tree_diff_statistics: @statistics,
    tree_diff_enabled: true,
  }
end
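
Since the listings on this page assign @statistics only inside #match, the statistics entry is presumably nil when #metadata is read before #match has run, unless the inherited constructor (not shown here) initializes it. The sketch below uses a hypothetical MetadataSketch class that mirrors only those two methods; the placeholder statistics hash is invented for illustration.

```ruby
# Hypothetical stand-in mirroring only #match and #metadata from this page.
class MetadataSketch
  def match(_doc1, _doc2)
    @statistics = { operations: 0 } # placeholder for result[:statistics]
    []
  end

  def metadata
    { tree_diff_statistics: @statistics, tree_diff_enabled: true }
  end
end

sketch = MetadataSketch.new
before_match = sketch.metadata  # statistics not populated yet
sketch.match(nil, nil)
after_match = sketch.metadata   # statistics now reflect the last match
```

Under that assumption, callers that need the statistics would read #metadata only after #match.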

#preprocess_for_display(doc1, doc2) ⇒ Array<String>

Preprocess documents for display

IMPORTANT: This must use the SAME format as DomMatchStrategy to ensure consistent diff rendering.

Parameters:

  • doc1 (Object)

    First document

  • doc2 (Object)

    Second document

Returns:

  • (Array<String>)

    Preprocessed [doc1_string, doc2_string]



# File 'lib/canon/comparison/strategies/semantic_tree_match_strategy.rb', line 57

def preprocess_for_display(doc1, doc2)
  case @format
  when :xml
    preprocess_xml(doc1, doc2)
  when :html, :html4, :html5
    preprocess_html(doc1, doc2)
  when :json
    preprocess_json(doc1, doc2)
  when :yaml
    preprocess_yaml(doc1, doc2)
  else
    raise ArgumentError, "Unsupported format: #{@format}"
  end
end
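
The dispatch above routes three HTML variants to the same helper and rejects any other format. A minimal sketch of that routing, using a hypothetical preprocessor_for helper and a lookup table in place of the case statement (neither exists in Canon), might look like this:

```ruby
# Hypothetical helper: returns the name of the preprocess method a format
# symbol would be routed to, mirroring the case statement above.
def preprocessor_for(format)
  helpers = {
    xml: :preprocess_xml,
    html: :preprocess_html,
    html4: :preprocess_html,
    html5: :preprocess_html,
    json: :preprocess_json,
    yaml: :preprocess_yaml,
  }
  helpers.fetch(format) do
    raise ArgumentError, "Unsupported format: #{format}"
  end
end

html5_helper = preprocessor_for(:html5)

begin
  preprocessor_for(:toml)
rescue ArgumentError => e
  unsupported_message = e.message
end
```

Any format outside the listed symbols raises ArgumentError, so a caller passing user-supplied formats may want to rescue it.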