Module: Canon::Comparison::WhitespaceSensitivity
Defined in: lib/canon/comparison/whitespace_sensitivity.rb
Overview
Whitespace sensitivity utilities for element-level control.
This module determines whether whitespace should be preserved during comparison, based on:
- Format-specific defaults (HTML has built-in sensitive elements)
- User-configured whitelist (elements that care about whitespace)
- User-configured blacklist (elements that don’t care about whitespace)
- The xml:space attribute in the document itself
- The respect_xml_space flag (whether to honor or override xml:space)
Priority Order
1. respect_xml_space: false → User config only (ignore xml:space)
2. User whitelist → Use whitelist (user explicitly declared)
3. Format defaults → HTML: [:pre, :code, :textarea, :script, :style], XML: []
4. User blacklist → Remove from defaults/whitelist
5. xml:space="preserve" → Element is sensitive
6. xml:space="default" → Use steps 1-4
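The order above can be sketched as a small stand-alone decision function. This is only an illustration under stated assumptions: `SketchElement`, the option keys `:sensitive_elements` and `:insensitive_elements`, and the way the whitelist is merged with the format defaults are all hypothetical and not taken from Canon. The sketch also follows the list literally, so `xml:space="default"` falls back to the configured behavior.

```ruby
# Hypothetical stand-in for a parsed element; not a Canon class.
SketchElement = Struct.new(:name, :xml_space)

# HTML defaults as given in the source listing of
# format_default_sensitive_elements.
SKETCH_HTML_DEFAULTS = %i[pre code textarea script style].freeze

# Assumed merge rule: (format defaults + whitelist) - blacklist.
# The option keys are illustrative names, not Canon's real keys.
def sketch_configured_sensitive?(element, opts)
  defaults  = opts[:format] == :html ? SKETCH_HTML_DEFAULTS : []
  whitelist = opts.fetch(:sensitive_elements, [])
  blacklist = opts.fetch(:insensitive_elements, [])
  ((defaults + whitelist) - blacklist).include?(element.name.to_sym)
end

def sketch_sensitive?(element, opts)
  # respect_xml_space: false -> ignore xml:space, use configuration only.
  if opts[:respect_xml_space] == false
    return sketch_configured_sensitive?(element, opts)
  end

  # xml:space="preserve" declared in the document wins.
  return true if element.xml_space == "preserve"

  # xml:space="default" or absent -> fall back to configuration.
  sketch_configured_sensitive?(element, opts)
end

para = SketchElement.new("p", "preserve")
sketch_sensitive?(para, { format: :xml })                            # => true
sketch_sensitive?(para, { format: :xml, respect_xml_space: false })  # => false
```

In this sketch a missing `:respect_xml_space` key is treated as "honor xml:space"; whether Canon uses the same default is not stated in this page.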
Usage
    WhitespaceSensitivity.element_sensitive?(node, opts)
    # => true if whitespace should be preserved for this element
Class Method Summary
- .default_sensitive_element?(element_name, match_opts) ⇒ Boolean
  Check if an element is in the default sensitive list for its format.
- .element_sensitive?(node, opts) ⇒ Boolean
  Check if an element is whitespace-sensitive based on configuration.
- .format_default_sensitive_elements(match_opts) ⇒ Array<Symbol>
  Get format-specific default sensitive elements.
- .preserve_whitespace_node?(node, opts) ⇒ Boolean
  Check whether a whitespace-only text node should be preserved rather than filtered out.
Class Method Details
.default_sensitive_element?(element_name, match_opts) ⇒ Boolean
Check if an element is in the default sensitive list for its format.
Convenience method for checking element sensitivity without building the full list first.

    # File 'lib/canon/comparison/whitespace_sensitivity.rb', line 99
    def default_sensitive_element?(element_name, match_opts)
      format_default_sensitive_elements(match_opts)
        .include?(element_name.to_sym)
    end
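As a rough illustration of the `to_sym` normalization, the following stand-alone sketch accepts both String and Symbol element names. `DefaultsDemo` and its abbreviated format lookup are hypothetical stand-ins, not part of Canon; only the HTML element list is copied from the source.

```ruby
# Hypothetical stand-in module; not part of Canon.
module DefaultsDemo
  HTML_SENSITIVE = %i[pre code textarea script style].freeze

  # Abbreviated lookup: HTML formats get the list, everything else gets none.
  def self.format_default_sensitive_elements(match_opts)
    %i[html html4 html5].include?(match_opts[:format]) ? HTML_SENSITIVE : [].freeze
  end

  # Same shape as the documented method: the name is normalized with
  # to_sym, so String and Symbol element names behave the same.
  def self.default_sensitive_element?(element_name, match_opts)
    format_default_sensitive_elements(match_opts).include?(element_name.to_sym)
  end
end

DefaultsDemo.default_sensitive_element?("pre", { format: :html })  # => true
DefaultsDemo.default_sensitive_element?(:pre, { format: :html })   # => true
DefaultsDemo.default_sensitive_element?(:pre, { format: :xml })    # => false
```

The comparison is case-sensitive, since `"PRE".to_sym` is `:PRE`; callers that may see upper-case tag names would need to downcase first.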
.element_sensitive?(node, opts) ⇒ Boolean
Check if an element is whitespace-sensitive based on configuration.

    # File 'lib/canon/comparison/whitespace_sensitivity.rb', line 35
    def element_sensitive?(node, opts)
      match_opts = opts[:match_opts]
      return false unless match_opts
      return false unless text_node_parent?(node)

      parent = node.parent

      # 1. Check if we should ignore xml:space (user override)
      if !respect_xml_space?(match_opts)
        return user_config_sensitive?(parent, match_opts)
      end

      # 2. Check xml:space="preserve" (document declaration)
      return true if xml_space_preserve?(parent)

      # 3. Check xml:space="default" (use configured behavior)
      return false if xml_space_default?(parent)

      # 4. Use user configuration + format defaults
      configured_sensitive?(parent, match_opts)
    end
.format_default_sensitive_elements(match_opts) ⇒ Array<Symbol>
Get format-specific default sensitive elements.
This is the SINGLE SOURCE OF TRUTH for default whitespace-sensitive elements. All other code should use this method to get the list.

    # File 'lib/canon/comparison/whitespace_sensitivity.rb', line 76
    def format_default_sensitive_elements(match_opts)
      format = match_opts[:format] || :xml

      case format
      when :html, :html4, :html5
        # HTML specification: these elements preserve whitespace
        %i[pre code textarea script style].freeze
      when :xml
        # XML has no default sensitive elements - purely user-controlled
        [].freeze
      else
        [].freeze
      end
    end
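The fallback behavior of this method can be tried in isolation. The sketch below wraps a condensed copy of the listing in a hypothetical module, `FormatDefaultsDemo`, so it runs without Canon; the `:json` format is a made-up value used only to exercise the `else` branch.

```ruby
# Hypothetical wrapper module so the logic can run on its own.
module FormatDefaultsDemo
  def self.format_default_sensitive_elements(match_opts)
    format = match_opts[:format] || :xml

    case format
    when :html, :html4, :html5
      %i[pre code textarea script style].freeze
    else
      # :xml and any unrecognized format have no defaults.
      [].freeze
    end
  end
end

html = FormatDefaultsDemo.format_default_sensitive_elements({ format: :html5 })
html.include?(:textarea)  # => true
html.frozen?              # => true

# A missing :format falls back to :xml, which has no defaults.
FormatDefaultsDemo.format_default_sensitive_elements({}).empty?                 # => true
# Unrecognized formats also yield an empty list.
FormatDefaultsDemo.format_default_sensitive_elements({ format: :json }).empty?  # => true
```

Because the returned array is frozen, code that wants to extend the defaults has to build a new array (for example `html + [:listing]`) instead of mutating the result.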
.preserve_whitespace_node?(node, opts) ⇒ Boolean
Check whether a whitespace-only text node should be preserved rather than filtered out. Returns false for objects that have no parent, and otherwise delegates to element_sensitive?.

    # File 'lib/canon/comparison/whitespace_sensitivity.rb', line 62
    def preserve_whitespace_node?(node, opts)
      return false unless node.respond_to?(:parent)
      return false unless node.parent

      element_sensitive?(node, opts)
    end
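The two guard clauses can be exercised with a minimal sketch. `DemoNode` and `PreserveDemo` are hypothetical stand-ins, and `element_sensitive?` is stubbed to always return true so that only the guards decide the outcome; the real method applies the full configuration logic.

```ruby
# Hypothetical stand-ins; not Canon or parser classes.
DemoNode = Struct.new(:parent)

module PreserveDemo
  # Stub: pretend every reachable element is sensitive, so only the
  # guard clauses decide the result.
  def self.element_sensitive?(_node, _opts)
    true
  end

  def self.preserve_whitespace_node?(node, opts)
    return false unless node.respond_to?(:parent)
    return false unless node.parent

    element_sensitive?(node, opts)
  end
end

PreserveDemo.preserve_whitespace_node?("plain string", {})         # => false (no #parent method)
PreserveDemo.preserve_whitespace_node?(DemoNode.new(nil), {})      # => false (detached node)
PreserveDemo.preserve_whitespace_node?(DemoNode.new(:element), {}) # => true  (delegates to the stub)
```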