Module: Canon::Comparison::NodeInspector
Defined in: lib/canon/comparison/node_inspector.rb
Overview
Single source of truth for cross-backend node type operations.
The comparison pipeline handles nodes from two backends:
- Canon::Xml::Node (+ RootNode, ElementNode, TextNode, etc.): custom DOM built by the SAX builder and DataModel.
- Nokogiri::XML::Node (+ subclasses): native Nokogiri nodes used by the HTML comparator and some legacy paths.
Every method here dispatches on type via case/when (is_a?). No respond_to? — the types are known at every call site.
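The dispatch style described above can be sketched in isolation. This is a minimal illustration, not the module's code: `CanonNode`, `NativeNode`, and `NATIVE_TEXT` are hypothetical stand-ins for `Canon::Xml::Node`, `Nokogiri::XML::Node`, and `NOKOGIRI_TEXT_TYPE`, so the example runs without either library loaded.

```ruby
# Hypothetical stand-ins for the two backends (assumptions for this sketch).
CanonNode  = Struct.new(:node_type, :value)
NativeNode = Struct.new(:node_type, :content)

# Numeric text-node code; 3 is the fallback value the module itself uses.
NATIVE_TEXT = 3

# case/when compares with Class#===, which is an is_a? check on the node.
# No respond_to? probing is involved: each branch knows its backend.
def text_node?(node)
  case node
  when CanonNode  then node.node_type == :text
  when NativeNode then node.node_type == NATIVE_TEXT
  else false
  end
end

puts text_node?(CanonNode.new(:text, "hi"))   # => true
puts text_node?(NativeNode.new(3, "hi"))      # => true
puts text_node?("plain string")               # => false
```

Anything that is neither backend type falls through to the `else` branch and yields `false` rather than raising.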
Constant Summary
- CANON_TEXT_TYPE = :text
- NOKOGIRI_TEXT_TYPE = defined?(Nokogiri::XML::Node::TEXT_NODE) ? Nokogiri::XML::Node::TEXT_NODE : 3
Class Method Summary
- .comment_node?(node) ⇒ Boolean
  True when node is a comment node.
- .element_node?(node) ⇒ Boolean
  True when node is an element node.
- .parse_errors(node) ⇒ Object
  Extract parse-time errors carried on a node or its owning document.
- .text_content(node) ⇒ Object
  Extract the text content of node as a String.
- .text_node?(node) ⇒ Boolean
  True when node is a text node (whitespace, content, etc.).
- .whitespace_only_text?(node) ⇒ Boolean
  True when node is a text node whose content is whitespace-only.
Class Method Details
.comment_node?(node) ⇒ Boolean
True when node is a comment node. For HTML, also detects comments that Nokogiri parses as TEXT nodes (content like "<!-- comment -->" or escaped "<\!-- comment -->").
    # File 'lib/canon/comparison/node_inspector.rb', line 56

    def self.comment_node?(node)
      case node
      when Canon::Xml::Node
        node.node_type == :comment
      when Nokogiri::XML::Node
        return true if node.comment?

        # HTML comments are parsed as TEXT nodes by Nokogiri
        if node.text?
          text_stripped = text_content(node).to_s.strip.gsub("\\", "")
          return true if text_stripped.start_with?("<!--") && text_stripped.end_with?("-->")
        end
        false
      else
        false
      end
    end
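The string test used in the TEXT-node fallback can be tried on its own. The sketch below isolates only that check; `looks_like_comment?` is a hypothetical helper name introduced here, not part of the module.

```ruby
# Hypothetical helper (an assumption for illustration): the same string test
# the fallback branch applies to a TEXT node's content.
def looks_like_comment?(text)
  # Trim surrounding whitespace, then drop every backslash so that the
  # escaped form "<\!-- ... -->" is treated like "<!-- ... -->".
  stripped = text.to_s.strip.gsub("\\", "")
  stripped.start_with?("<!--") && stripped.end_with?("-->")
end

puts looks_like_comment?("  <!-- note -->  ")  # => true
puts looks_like_comment?("<\\!-- note -->")    # => true (escaped opener)
puts looks_like_comment?("<!-- unterminated")  # => false
puts looks_like_comment?("plain text")         # => false
```

Both delimiters must be present, so text that merely begins with a comment opener is not classified as a comment.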
.element_node?(node) ⇒ Boolean
True when node is an element node.
    # File 'lib/canon/comparison/node_inspector.rb', line 75

    def self.element_node?(node)
      case node
      when Canon::Xml::Node
        node.node_type == :element
      when Nokogiri::XML::Node
        node.element?
      else
        false
      end
    end
.parse_errors(node) ⇒ Object
Extract parse-time errors carried on a node or its owning document. Returns an Array of Strings.
    # File 'lib/canon/comparison/node_inspector.rb', line 88

    def self.parse_errors(node)
      case node
      when nil
        []
      when Canon::Xml::Node
        errors = node.parse_errors
        Array(errors).map(&:to_s)
      when Nokogiri::XML::Document, Nokogiri::HTML5::Document
        Array(node.errors).map(&:to_s)
      else
        []
      end
    end
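The "Array of Strings" guarantee rests on the `Array(...).map(&:to_s)` idiom. A small sketch of that normalization follows; `normalize_errors` is a hypothetical name used only for this illustration.

```ruby
# Hypothetical helper (an assumption for illustration): the normalization
# step applied to whatever the backend reports as its errors.
def normalize_errors(errors)
  # Kernel#Array turns nil into [], leaves an Array untouched, and wraps
  # any other single object in a one-element Array.
  Array(errors).map(&:to_s)
end

p normalize_errors(nil)                        # => []
p normalize_errors(["bad tag", :eof])          # => ["bad tag", "eof"]
p normalize_errors(StandardError.new("boom"))  # => ["boom"]
```

Callers can therefore iterate over the result without a nil check, whichever backend produced the node.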
.text_content(node) ⇒ Object
Extract the text content of node as a String.
    # File 'lib/canon/comparison/node_inspector.rb', line 32

    def self.text_content(node)
      case node
      when Canon::Xml::Node
        node.value.to_s
      when Nokogiri::XML::Node
        node.content.to_s
      else
        node.to_s
      end
    end
.text_node?(node) ⇒ Boolean
True when node is a text node (whitespace, content, etc.).
    # File 'lib/canon/comparison/node_inspector.rb', line 20

    def self.text_node?(node)
      case node
      when Canon::Xml::Node
        node.node_type == CANON_TEXT_TYPE
      when Nokogiri::XML::Node
        node.node_type == NOKOGIRI_TEXT_TYPE
      else
        false
      end
    end
.whitespace_only_text?(node) ⇒ Boolean
True when node is a text node whose content is whitespace-only. Empty-string text nodes return false — those represent genuine empty-vs-content asymmetry, not pretty-print indentation.
    # File 'lib/canon/comparison/node_inspector.rb', line 46

    def self.whitespace_only_text?(node)
      return false unless text_node?(node)

      text = text_content(node)
      !text.empty? && text.strip.empty?
    end
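The distinction between an empty string and whitespace-only content is easy to check on plain strings. This sketch applies the same predicate without any node object; `whitespace_only?` is a hypothetical name for illustration.

```ruby
# Hypothetical helper (an assumption for illustration): the string half of
# the check, with the text_node? guard left out.
def whitespace_only?(text)
  # Non-empty, yet nothing remains once surrounding whitespace is stripped.
  !text.empty? && text.strip.empty?
end

puts whitespace_only?("\n    ")  # => true  (typical pretty-print indentation)
puts whitespace_only?("")        # => false (empty is not whitespace-only)
puts whitespace_only?("  a  ")   # => false (contains real content)
```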