Class: Canon::Diff::DiffClassifier
- Inherits: Object
  - Object
  - Canon::Diff::DiffClassifier
- Defined in: lib/canon/diff/diff_classifier.rb
Overview
Classifies DiffNodes as normative (affects equivalence) or informative (does not affect equivalence), based on the match options in effect.
Classification hierarchy (three distinct kinds of differences):
- Serialization formatting: XML syntax differences (always non-normative)
- Content formatting: Whitespace differences in content (non-normative when normalized)
- Normative: Semantic content differences (affect equivalence)
Instance Attribute Summary
- #match_options ⇒ Object (readonly): Returns the value of attribute match_options.
- #profile ⇒ Object (readonly): Returns the value of attribute profile.
Instance Method Summary
- #classify(diff_node) ⇒ DiffNode: Classifies a single DiffNode as normative or informative. Hierarchy: formatting-only < informative < normative. CompareProfile determines the base classification; XmlSerializationFormatter handles serialization formatting.
- #classify_all(diff_nodes) ⇒ Array<DiffNode>: Classifies multiple DiffNodes.
- #initialize(match_options) ⇒ DiffClassifier (constructor): Returns a new instance of DiffClassifier.
Constructor Details
#initialize(match_options) ⇒ DiffClassifier
Returns a new instance of DiffClassifier.
    # File 'lib/canon/diff/diff_classifier.rb', line 21

    def initialize(match_options)
      @match_options = match_options
      # Use the compare_profile from ResolvedMatchOptions if available (e.g., HtmlCompareProfile)
      # Otherwise create a base CompareProfile
      @profile = if match_options.respond_to?(:compare_profile) && match_options.compare_profile
                   match_options.compare_profile
                 else
                   Canon::Comparison::CompareProfile.new(match_options)
                 end
    end
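The profile selection in the constructor can be illustrated in isolation. The following is a minimal sketch, not Canon's code: `BaseProfile`, `ResolvedOptions`, and `select_profile` are hypothetical stand-ins, and the assumption that plain options may be a Hash is made only for the example.

```ruby
# Hypothetical stand-in for Canon::Comparison::CompareProfile.
class BaseProfile
  attr_reader :options

  def initialize(options)
    @options = options
  end
end

# Hypothetical stand-in for an options object that may carry its own profile.
ResolvedOptions = Struct.new(:text_content, :compare_profile)

# Mirrors the constructor's selection logic: prefer a profile carried by the
# options object, otherwise build a base profile from the options.
def select_profile(match_options)
  if match_options.respond_to?(:compare_profile) && match_options.compare_profile
    match_options.compare_profile
  else
    BaseProfile.new(match_options)
  end
end

carried = BaseProfile.new(:carried_by_options)

# The options object carries a profile, so it is reused as-is.
puts select_profile(ResolvedOptions.new(:normalize, carried)).equal?(carried)

# The accessor exists but returns nil, so a base profile is built.
puts select_profile(ResolvedOptions.new(:normalize, nil)).class

# A plain Hash has no compare_profile method, so a base profile is built.
puts select_profile({ text_content: :strict }).class
```

The `respond_to?` guard lets the constructor accept option objects that do not define `compare_profile` at all, while the second condition covers objects that define it but return `nil`.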
Instance Attribute Details
#match_options ⇒ Object (readonly)
Returns the value of attribute match_options.
    # File 'lib/canon/diff/diff_classifier.rb', line 18

    def match_options
      @match_options
    end
#profile ⇒ Object (readonly)
Returns the value of attribute profile.
    # File 'lib/canon/diff/diff_classifier.rb', line 18

    def profile
      @profile
    end
Instance Method Details
#classify(diff_node) ⇒ DiffNode
Classifies a single DiffNode as normative or informative. Hierarchy: formatting-only < informative < normative. CompareProfile determines the base classification; XmlSerializationFormatter handles serialization formatting.
    # File 'lib/canon/diff/diff_classifier.rb', line 37

    def classify(diff_node)
      # FIRST: Check for XML serialization-level formatting differences
      # These are ALWAYS non-normative (formatting-only) regardless of match options
      # Examples: self-closing tags (<tag/>) vs explicit closing tags (<tag></tag>)
      #
      # EXCEPTION: If the text node is inside a whitespace-sensitive element
      # (:preserve or :collapse), don't dismiss as serialization formatting
      # because whitespace presence is meaningful in those elements.
      if !inside_whitespace_sensitive_element?(diff_node) &&
         XmlSerializationFormatter.serialization_formatting?(diff_node)
        diff_node.formatting = true
        diff_node.normative = false
        return diff_node
      end

      # SECOND: Handle content-level formatting for text_content with :normalize behavior
      # When text_content is :normalize and the difference is formatting-only,
      # it should be marked as non-normative (informative)
      # This ensures that verbose and non-verbose modes give consistent results
      #
      # EXCEPTION: If the text node is inside a PRESERVE whitespace element
      # (like <pre>, <code>, <textarea> in HTML), don't apply formatting detection
      # because whitespace should be preserved exactly in these elements.
      # Note: COLLAPSE elements like <p> DO get formatting detection because
      # their whitespace IS normalized (differences are formatting-only).
      #
      # This check must come BEFORE normative_dimension? is called,
      # because normative_dimension? returns true for text_content: :normalize
      # (since the dimension affects equivalence), which would prevent formatting
      # detection from being applied.
      if diff_node.dimension == :text_content &&
         profile.send(:behavior_for, :text_content) == :normalize &&
         !inside_preserve_element?(diff_node) &&
         formatting_only_diff?(diff_node)
        diff_node.formatting = true
        diff_node.normative = false
        return diff_node
      end

      # THIRD: Determine if this dimension is normative based on CompareProfile
      # This respects the policy settings (strict/normalize/ignore)
      is_normative = profile.normative_dimension?(diff_node.dimension)

      # FOURTH: Check if FormattingDetector should be consulted for non-normative dimensions
      # Only check for formatting-only when dimension is NOT normative
      # This ensures strict mode differences remain normative
      should_check_formatting = !is_normative &&
                                profile.supports_formatting_detection?(diff_node.dimension)

      # If we should check formatting, see if it's formatting-only
      if should_check_formatting && formatting_only_diff?(diff_node)
        diff_node.formatting = true
        diff_node.normative = false
        return diff_node
      end

      # FIFTH: Apply the normative determination from CompareProfile
      diff_node.formatting = false
      diff_node.normative = is_normative

      diff_node
    end
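The order of the five checks in `classify` can be followed more easily on a reduced model. The sketch below is illustrative only and makes several assumptions: `SketchDiff`, `SketchProfile`, `mark_formatting_only`, and `sketch_classify` are hypothetical names, the private helpers of the real class are replaced by precomputed flags on the node, and the profile treats every behavior except `:ignore` as normative.

```ruby
# Hypothetical stand-in for a diff node; the boolean fields replace the
# private helper checks that the real class performs on the node.
SketchDiff = Struct.new(:dimension, :serialization_only, :formatting_only,
                        :whitespace, :formatting, :normative,
                        keyword_init: true)

# Hypothetical stand-in for a compare profile.
class SketchProfile
  def initialize(behaviors)
    @behaviors = behaviors
  end

  def behavior_for(dimension)
    @behaviors.fetch(dimension, :strict)
  end

  # Assumption: any behavior other than :ignore affects equivalence.
  def normative_dimension?(dimension)
    behavior_for(dimension) != :ignore
  end

  # Assumption: only text content supports formatting detection.
  def supports_formatting_detection?(dimension)
    dimension == :text_content
  end
end

def mark_formatting_only(node)
  node.formatting = true
  node.normative = false
  node
end

def sketch_classify(node, profile)
  # First: serialization formatting, unless whitespace is meaningful here
  # (node.whitespace is nil, :collapse, or :preserve).
  return mark_formatting_only(node) if node.whitespace.nil? && node.serialization_only

  # Second: normalized text content outside :preserve elements.
  if node.dimension == :text_content &&
     profile.behavior_for(:text_content) == :normalize &&
     node.whitespace != :preserve && node.formatting_only
    return mark_formatting_only(node)
  end

  # Third and fourth: formatting detection only for non-normative dimensions.
  is_normative = profile.normative_dimension?(node.dimension)
  if !is_normative && profile.supports_formatting_detection?(node.dimension) &&
     node.formatting_only
    return mark_formatting_only(node)
  end

  # Fifth: fall back to the profile's determination.
  node.formatting = false
  node.normative = is_normative
  node
end

profile = SketchProfile.new(text_content: :normalize)
in_paragraph = SketchDiff.new(dimension: :text_content, formatting_only: true, whitespace: :collapse)
in_pre = SketchDiff.new(dimension: :text_content, formatting_only: true, whitespace: :preserve)

p sketch_classify(in_paragraph, profile).normative # false
p sketch_classify(in_pre, profile).normative       # true
```

The two sample nodes differ only in their whitespace context: the same whitespace difference is formatting-only inside a collapsing element but stays normative inside a preserving one, which is the exception the second check encodes.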
#classify_all(diff_nodes) ⇒ Array<DiffNode>
Classifies multiple DiffNodes. Each node is classified in place, and the input collection is returned.
    # File 'lib/canon/diff/diff_classifier.rb', line 103

    def classify_all(diff_nodes)
      diff_nodes.each { |node| classify(node) }
    end
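Because `classify_all` is built on `each`, it mutates the nodes and returns the collection it was given, so the result can be split directly by flag. The following sketch demonstrates that pattern with hypothetical stand-ins (`SketchNode`, `sketch_classify_all`, and a toy rule that treats only `:text_content` as normative); it does not use Canon's classes.

```ruby
# Hypothetical stand-in for a diff node.
SketchNode = Struct.new(:dimension, :normative)

# Toy classifier: same shape as classify_all, with a made-up rule.
def sketch_classify_all(nodes)
  nodes.each { |node| node.normative = (node.dimension == :text_content) }
end

nodes = [
  SketchNode.new(:text_content),
  SketchNode.new(:comments),
  SketchNode.new(:text_content)
]

result = sketch_classify_all(nodes)

# Array#each returns its receiver, so the same array comes back.
puts result.equal?(nodes) # true

# Split the classified nodes by their normative flag.
normative, informative = result.partition(&:normative)
puts normative.size   # 2
puts informative.size # 1
```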