Class: Canon::TreeDiff::Adapters::HTMLAdapter
- Inherits: Object
  - Object
  - Canon::TreeDiff::Adapters::HTMLAdapter
- Defined in: lib/canon/tree_diff/adapters/html_adapter.rb
Overview
HTMLAdapter converts Nokogiri HTML documents to TreeNode structures and back, enabling semantic tree diffing on HTML documents.
This adapter:
- Converts Nokogiri::HTML::Document to a TreeNode tree
- Preserves element names, text content, and attributes
- Handles HTML-specific elements (script, style, etc.)
- Maintains document structure for round-trip conversion
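The round-trip idea above can be sketched without Nokogiri or Canon installed. This is only an illustration under stated assumptions: `SketchNode`, `sketch_to_tree`, and `sketch_from_tree` are hypothetical names, nested hashes stand in for parsed HTML elements, and the real Core::TreeNode API may look different.

```ruby
# Hypothetical stand-in for a tree node; the real Core::TreeNode may differ.
SketchNode = Struct.new(:label, :value, :attributes, :children, keyword_init: true)

# Convert a nested hash (standing in for a parsed HTML element) to a SketchNode,
# keeping the element name, text content, and attributes.
def sketch_to_tree(element)
  SketchNode.new(
    label: element[:name],
    value: element[:text],
    attributes: element.fetch(:attrs, {}).dup,
    children: element.fetch(:children, []).map { |child| sketch_to_tree(child) }
  )
end

# Convert a SketchNode back to the nested-hash form, omitting empty fields.
def sketch_from_tree(node)
  hash = { name: node.label }
  hash[:text] = node.value unless node.value.nil?
  hash[:attrs] = node.attributes.dup unless node.attributes.empty?
  unless node.children.empty?
    hash[:children] = node.children.map { |child| sketch_from_tree(child) }
  end
  hash
end

html = {
  name: "html",
  children: [
    { name: "body", attrs: { "class" => "main" },
      children: [{ name: "p", text: "Hello" }] }
  ]
}

tree = sketch_to_tree(html)
round_tripped = sketch_from_tree(tree)
puts round_tripped == html
```

Because the intermediate node keeps names, text, and attributes, converting back reproduces the input structure, which is the property a diff over the neutral tree relies on.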
Instance Attribute Summary
- #match_options ⇒ Object (readonly)
  Returns the value of attribute match_options.
Instance Method Summary
- #from_tree(tree_node, doc = nil) ⇒ Nokogiri::HTML::Document, Nokogiri::XML::Element
  Convert TreeNode back to Nokogiri HTML.
- #initialize(match_options: {}) ⇒ HTMLAdapter (constructor)
  Initialize adapter with match options.
- #to_tree(node) ⇒ Core::TreeNode
  Convert Nokogiri HTML document/element or Canon::Xml::Node to TreeNode.
Constructor Details
#initialize(match_options: {}) ⇒ HTMLAdapter
Initialize adapter with match options
    # File 'lib/canon/tree_diff/adapters/html_adapter.rb', line 28
    def initialize(match_options: {})
      @match_options = match_options
    end
Instance Attribute Details
#match_options ⇒ Object (readonly)
Returns the value of attribute match_options.
    # File 'lib/canon/tree_diff/adapters/html_adapter.rb', line 23
    def match_options
      @match_options
    end
Instance Method Details
#from_tree(tree_node, doc = nil) ⇒ Nokogiri::HTML::Document, Nokogiri::XML::Element
Convert a TreeNode back to Nokogiri HTML. If the document has no root yet, the built element becomes its root and the document is returned; otherwise the built element itself is returned.

    # File 'lib/canon/tree_diff/adapters/html_adapter.rb', line 70
    def from_tree(tree_node, doc = nil)
      doc ||= Nokogiri::HTML::Document.new
      element = build_element(tree_node, doc)

      if doc.root.nil?
        doc.root = element
        doc
      else
        element
      end
    end
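The two possible return types of from_tree follow from whether the target document already has a root. A minimal sketch of that branching, assuming hypothetical stand-ins (`DemoDoc`, `DemoElement`, `attach_or_return`) in place of the Nokogiri classes and the private build_element helper:

```ruby
# Hypothetical stand-ins; these are not the Nokogiri classes.
DemoDoc = Struct.new(:root)
DemoElement = Struct.new(:name)

# Mirrors the branching in from_tree: build an element, then either install
# it as the root of a rootless document (returning the document) or return
# the element itself when the document already has a root.
def attach_or_return(name, doc = nil)
  doc ||= DemoDoc.new
  element = DemoElement.new(name)

  if doc.root.nil?
    doc.root = element
    doc
  else
    element
  end
end

first = attach_or_return("html")       # no document given: a document comes back
second = attach_or_return("p", first)  # document already has a root: an element comes back

puts first.class
puts second.class
```

Callers that pass an existing document therefore need to attach the returned element themselves; in this sketch the existing root is left untouched.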
#to_tree(node) ⇒ Core::TreeNode
Convert Nokogiri HTML document/element or Canon::Xml::Node to TreeNode
    # File 'lib/canon/tree_diff/adapters/html_adapter.rb', line 36
    def to_tree(node)
      # Handle Canon::Xml::Node types first (same as XML adapter)
      case node
      when Canon::Xml::Nodes::RootNode
        return to_tree_from_canon_root(node)
      when Canon::Xml::Nodes::ElementNode
        return to_tree_from_canon_element(node)
      when Canon::Xml::Nodes::TextNode
        return to_tree_from_canon_text(node)
      when Canon::Xml::Nodes::CommentNode
        return to_tree_from_canon_comment(node)
      end

      # Fallback to Nokogiri (legacy support)
      case node
      when Nokogiri::HTML::Document, Nokogiri::HTML4::Document, Nokogiri::HTML5::Document
        # Start from html element or root element
        root = node.at_css("html") || node.root
        root ? to_tree(root) : nil
      when Nokogiri::HTML4::DocumentFragment, Nokogiri::HTML5::DocumentFragment, Nokogiri::XML::DocumentFragment
        # For DocumentFragment, create a wrapper root node and add all fragment children
        convert_fragment(node)
      when Nokogiri::XML::Element
        convert_element(node)
      else
        raise ArgumentError, "Unsupported node type: #{node.class}"
      end
    end
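
The class-based dispatch in to_tree can be illustrated in isolation. The sketch below assumes hypothetical stand-in types (`FakeDocument`, `FakeFragment`, `FakeElement`) and a hypothetical `dispatch` method that returns plain arrays instead of TreeNode objects; it only shows how `case`/`when` routes by class, how a rootless document yields nil, and how unknown types are rejected.

```ruby
# Hypothetical node types standing in for the Nokogiri classes.
FakeDocument = Struct.new(:root)
FakeFragment = Struct.new(:children)
FakeElement = Struct.new(:name)

# Route a node by its class, in the same spirit as to_tree's Nokogiri branch.
def dispatch(node)
  case node
  when FakeDocument
    # Documents delegate to their root; a rootless document yields nil.
    node.root ? dispatch(node.root) : nil
  when FakeFragment
    # Fragments are wrapped so that all of their children are kept.
    [:fragment, node.children.map { |child| dispatch(child) }]
  when FakeElement
    [:element, node.name]
  else
    raise ArgumentError, "Unsupported node type: #{node.class}"
  end
end

p dispatch(FakeDocument.new(FakeElement.new("html")))
p dispatch(FakeFragment.new([FakeElement.new("p"), FakeElement.new("div")]))

begin
  dispatch(42)
rescue ArgumentError => e
  puts e.message
end
```

The order of the `when` clauses matters when classes are related by inheritance, which is one reason the real method checks the Canon::Xml::Node types before falling back to the Nokogiri ones.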