Class: Canon::TreeDiff::TreeDiffIntegrator

Inherits:
Object
Defined in:
lib/canon/tree_diff/tree_diff_integrator.rb

Overview

TreeDiffIntegrator integrates Canon’s DOM diff system with the new semantic tree diff system.

This class orchestrates:

  • Format-specific adapter selection

  • Tree conversion from parsed documents

  • Tree matching via UniversalMatcher

  • Operation detection

  • Results formatting

Examples:

XML tree diff

integrator = TreeDiffIntegrator.new(format: :xml)
result = integrator.diff(doc1, doc2)
result[:operations] # => [Operation(...), ...]

Constructor Details

#initialize(format:, options: {}) ⇒ TreeDiffIntegrator

Initialize integrator for a specific format

Parameters:

  • format (Symbol)

    Format type (:xml, :json, :html, :yaml)

  • options (Hash) (defaults to: {})

    Configuration options (match options from Canon::Comparison)

Options Hash (options:):

  • :similarity_threshold (Float)

    Threshold for similarity matching (default: 0.95)

  • :hash_matching (Boolean)

    Enable hash matching phase (default: true)

  • :similarity_matching (Boolean)

    Enable similarity matching phase (default: true)

  • :propagation (Boolean)

    Enable propagation phase (default: true)

  • :text_content (Symbol)

    How to compare text (:strict, :normalize)

  • :attribute_order (Symbol)

    How to compare attributes: :strict or :ignore (default: :ignore)



# File 'lib/canon/tree_diff/tree_diff_integrator.rb', line 33

def initialize(format:, options: {})
  @format = format
  @options = options
  @match_options = options # Store full match options for downstream use

  # Initialize format-specific adapter WITH match options
  @adapter = create_adapter(format, options)

  # Initialize matcher with options
  matcher_options = {
    similarity_threshold: options[:similarity_threshold] || 0.95,
    hash_matching: options.fetch(:hash_matching, true),
    similarity_matching: options.fetch(:similarity_matching, true),
    propagation: options.fetch(:propagation, true),
    attribute_order: options[:attribute_order] || :ignore,
  }
  @matcher = Matchers::UniversalMatcher.new(matcher_options)
end
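
The constructor resolves its defaults in two different ways, and the difference matters for Boolean options. The sketch below copies the same expressions into a standalone helper so the behaviour can be tried without Canon loaded. The helper name `resolve_matcher_options` is hypothetical and is not part of Canon's API.

```ruby
# Hypothetical helper: mirrors the default-resolution expressions used in
# TreeDiffIntegrator#initialize, without depending on Canon itself.
def resolve_matcher_options(options = {})
  {
    # `||` falls back when the key is missing OR the value is nil/false.
    similarity_threshold: options[:similarity_threshold] || 0.95,
    # `fetch` falls back only when the key is missing, so an explicit
    # `false` is preserved and the phase is actually disabled.
    hash_matching: options.fetch(:hash_matching, true),
    similarity_matching: options.fetch(:similarity_matching, true),
    propagation: options.fetch(:propagation, true),
    attribute_order: options[:attribute_order] || :ignore,
  }
end

defaults = resolve_matcher_options
custom   = resolve_matcher_options(hash_matching: false, similarity_threshold: 0.8)
with_nil = resolve_matcher_options(hash_matching: nil)

puts defaults.inspect
puts custom.inspect
puts with_nil.inspect
```

One consequence of `fetch`: a key that is present with a `nil` value is passed through as `nil` rather than replaced by `true`. Also note that `:text_content` is not copied into `matcher_options`. In the listings on this page it travels with the full options hash to the adapter and, via `@match_options`, to the operation detector.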

Instance Attribute Details

#adapter ⇒ Object (readonly)

Returns the value of attribute adapter.



# File 'lib/canon/tree_diff/tree_diff_integrator.rb', line 21

def adapter
  @adapter
end

#format ⇒ Object (readonly)

Returns the value of attribute format.



# File 'lib/canon/tree_diff/tree_diff_integrator.rb', line 21

def format
  @format
end

#match_options ⇒ Object (readonly)

Returns the value of attribute match_options.



# File 'lib/canon/tree_diff/tree_diff_integrator.rb', line 21

def match_options
  @match_options
end

#matcher ⇒ Object (readonly)

Returns the value of attribute matcher.



# File 'lib/canon/tree_diff/tree_diff_integrator.rb', line 21

def matcher
  @matcher
end

Instance Method Details

#diff(doc1, doc2) ⇒ Hash

Perform tree diff on two documents

Parameters:

  • doc1 (Object)

    First document (format-specific)

  • doc2 (Object)

    Second document (format-specific)

Returns:

  • (Hash)

    Diff results with :operations, :matching, :statistics, and :trees (a Hash holding :tree1 and :tree2)



# File 'lib/canon/tree_diff/tree_diff_integrator.rb', line 57

def diff(doc1, doc2)
  # Convert documents to tree nodes
  tree1 = @adapter.to_tree(doc1)
  tree2 = @adapter.to_tree(doc2)

  # Check node count limits
  check_node_count_limit(tree1)
  check_node_count_limit(tree2)

  # Match trees
  matching = @matcher.match(tree1, tree2)

  # Detect operations with match_options for proper normalization
  detector = Operations::OperationDetector.new(tree1, tree2, matching,
                                               @match_options)
  operations = detector.detect

  # Return comprehensive results
  {
    operations: operations,
    matching: matching,
    statistics: @matcher.statistics,
    trees: { tree1: tree1, tree2: tree2 },
  }
end
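
To illustrate the shape of the returned Hash, here is a minimal sketch that builds a stand-in result and summarises it. The operation objects are an assumption: this page does not show the real operation class or its attributes, so `FakeOperation` and its `type` field are hypothetical placeholders, as are all the values.

```ruby
# Stand-in for Canon's operation objects. The real class is not shown on
# this page, so the `type` attribute is an assumption for illustration.
FakeOperation = Struct.new(:type)

# A Hash with the same four keys that #diff returns (values are made up).
result = {
  operations: [
    FakeOperation.new(:insert),
    FakeOperation.new(:delete),
    FakeOperation.new(:insert),
  ],
  matching: :matching_placeholder,
  statistics: {},
  trees: { tree1: :tree1_placeholder, tree2: :tree2_placeholder },
}

# Count operations per (assumed) type.
counts = result[:operations].map(&:type).tally

# The same emptiness check that #equivalent? performs on :operations.
equivalent = result[:operations].empty?

puts counts.inspect
puts equivalent
```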

#equivalent?(doc1, doc2) ⇒ Boolean

Check if two documents are semantically equivalent

Parameters:

  • doc1 (Object)

    First document

  • doc2 (Object)

    Second document

Returns:

  • (Boolean)

    true if no operations are detected



# File 'lib/canon/tree_diff/tree_diff_integrator.rb', line 88

def equivalent?(doc1, doc2)
  result = diff(doc1, doc2)
  result[:operations].empty?
end
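
Because #equivalent? runs a full #diff and discards everything except the emptiness check, calling it and then calling #diff on the same pair repeats the work. The sketch below uses a hypothetical stand-in class, `CountingIntegrator`, that reproduces only this relationship between the two methods and counts how often the diff runs. It is not Canon's implementation.

```ruby
# Hypothetical stand-in: reproduces only the diff / equivalent? relationship
# shown above and counts how many times a diff is performed.
class CountingIntegrator
  attr_reader :diff_calls

  def initialize(operations)
    @operations = operations
    @diff_calls = 0
  end

  def diff(_doc1, _doc2)
    @diff_calls += 1
    { operations: @operations }
  end

  def equivalent?(doc1, doc2)
    diff(doc1, doc2)[:operations].empty?
  end
end

# Pattern A: check equivalence first, then diff again for the details.
twice = CountingIntegrator.new([:some_operation])
twice.diff("a", "b") unless twice.equivalent?("a", "b")
calls_for_pattern_a = twice.diff_calls

# Pattern B: diff once and derive equivalence from the result.
once = CountingIntegrator.new([:some_operation])
result = once.diff("a", "b")
same = result[:operations].empty?
calls_for_pattern_b = once.diff_calls

puts calls_for_pattern_a
puts calls_for_pattern_b
```

When both the verdict and the operations are needed, the second pattern is likely the cheaper one, since tree conversion and matching happen only once.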