Class: Scrapetor::TextNode
- Inherits: String
  - Object
  - String
  - Scrapetor::TextNode
- Defined in: lib/scrapetor/text_node.rb
Overview
Result type for `::text` and `::attr(name)` pseudo-element queries.
Scrapy / Parsel-style code expects strings directly from these selectors (`doc.css("h3::text").get`), but Nokogiri-style scrapers routinely chain a `.text` / `.content` accessor onto each result (`doc.css("h3::text").first.text` or `node.at("a::attr(href)").text`). Returning a bare String breaks the Nokogiri-style call path with NoMethodError, even though the String already is the text we would have returned.
TextNode is a thin String subclass that closes the gap: it equals, compares, splits, and concatenates exactly like a String, and adds the Node-shaped accessors (`text`, `content`, `inner_text`, `name`, `element?`, `text?`) plus the Parsel-shaped `get` / `getall`. The underlying byte string is the actual text content; the extra methods all return self (or trivial derivatives), so chaining stays cheap.
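The two call paths described above can be sketched as follows. `TextNodeSketch` is a hypothetical stand-in that copies a few method bodies from the listings on this page so the example runs without the gem; it is not the library class itself.

```ruby
# Hypothetical stand-in for Scrapetor::TextNode; method bodies are copied
# from the source listings on this page.
class TextNodeSketch < String
  def text; String.new(self); end
  alias inner_text text
  alias content text

  def get; String.new(self); end
  def getall; [String.new(self)]; end
end

node = TextNodeSketch.new("Quarterly report")

# Parsel-style: the result is already usable as a String.
puts node.upcase
puts node.get

# Nokogiri-style: chaining .text / .content does not raise NoMethodError.
puts node.text
puts node.content

# String behaviour is preserved: equality and concatenation work as usual.
puts(node == "Quarterly report")
puts((node + "!").class)
```

Either style reads the same characters, so a scraper can mix both conventions on one result.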
Instance Attribute Summary
- #parent_node ⇒ Object
  Containing element (the node whose text/attribute this TextNode represents).
Instance Method Summary
- #[](*args) ⇒ Object
- #[]=(*_args) ⇒ Object
- #add_class(_k) ⇒ Object
- #at(_selector) ⇒ Object
- #at_css(_selector) ⇒ Object
- #at_xpath(*_args) ⇒ Object
- #attribute(_name) ⇒ Object
- #attribute_nodes ⇒ Object
- #attributes ⇒ Object
- #cdata? ⇒ Boolean
- #children ⇒ Object
- #classes ⇒ Object
- #comment? ⇒ Boolean
- #content=(_v) ⇒ Object
- #css(_selector) ⇒ Object
- #document? ⇒ Boolean
- #element? ⇒ Boolean
- #element_children ⇒ Object
- #get ⇒ Object
  Parsel-style accessors.
- #getall ⇒ Object
- #has_class?(_klass) ⇒ Boolean
- #inner_html=(_v) ⇒ Object
  No-op mutation API.
- #inspect ⇒ Object
- #keys ⇒ Object
- #name ⇒ Object
  Node-shape predicates so duck-typing checks (`n.element?`, `n.text?`, `n.name == "#text"`) don't blow up.
- #next_element_sibling ⇒ Object
- #next_sibling ⇒ Object
- #parent ⇒ Object
- #previous_element_sibling ⇒ Object
- #previous_sibling ⇒ Object
- #remove ⇒ Object
- #remove_class(*_) ⇒ Object
- #search(_selector) ⇒ Object
- #text ⇒ Object (also: #inner_text, #content)
- #text? ⇒ Boolean
- #to_html ⇒ Object (also: #outer_html, #inner_html)
- #unlink ⇒ Object
- #values ⇒ Object
- #xpath(*_args) ⇒ Object
Instance Attribute Details
#parent_node ⇒ Object
Containing element (the node whose text/attribute this TextNode represents). Set by the css() boundary when we know the parent; left nil otherwise. Production code chains `result.at(::text).parent.css(…)` to navigate to siblings of the text node, mirroring the Nokogiri shape where text nodes carry a `.parent` back-reference.
# File 'lib/scrapetor/text_node.rb', line 63
def parent_node
  @parent_node
end
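A minimal sketch of the parent back-reference, under two assumptions: the attribute is writable (the listing above shows only the reader), and `FakeParent` is a hypothetical element type with canned `css` results, standing in for whatever node class the library really uses.

```ruby
# Hypothetical stand-in; the writer on parent_node is an assumption.
class TextNodeSketch < String
  attr_accessor :parent_node
  def parent; @parent_node; end
end

# Hypothetical element type whose css method looks up canned results.
FakeParent = Struct.new(:name, :matches) do
  def css(selector)
    matches.fetch(selector, [])
  end
end

heading = FakeParent.new("h3", { "a" => ["link-1", "link-2"] })

node = TextNodeSketch.new("Latest posts")
node.parent_node = heading # what the css() boundary is described as doing

# Navigate from the text back to its containing element, then query it.
links = node.parent.css("a")
p links

# When the parent is unknown the reference stays nil, so guard the chain.
orphan = TextNodeSketch.new("detached")
p orphan.parent&.css("a")
```

Because the reference may be nil, callers that navigate through `parent` are safer with `&.` or an explicit nil check.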
Instance Method Details
#[](*args) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 79
def [](*args)
  # Delegate index/range subscripts to String#[]; return nil for
  # attribute-style access with a single String or Symbol key.
  if args.size == 1 && args.first.is_a?(String)
    nil
  elsif args.size == 1 && args.first.is_a?(Symbol)
    nil
  else
    super
  end
end
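The subscript behaviour can be checked with a small stand-in. `TextNodeSketch` is hypothetical; its `#[]` condenses the two nil branches of the listing above into one condition, which should be equivalent.

```ruby
# Hypothetical stand-in; #[] is a condensed form of the listing above.
class TextNodeSketch < String
  def [](*args)
    key = args.first
    return nil if args.size == 1 && (key.is_a?(String) || key.is_a?(Symbol))
    super
  end
end

node = TextNodeSketch.new("https://example.com/a")

# Integer and range subscripts fall through to String#[].
p node[0]
p node[0, 5]
p node[0..4]

# Attribute-style lookups return nil instead of a substring match.
p node["href"]
p node[:href]
p node["https"] # nil here, although a plain String would return "https"
```

One consequence of this design is that substring lookup via `str["needle"]` is unavailable on a TextNode; `include?` or `get` followed by `[]` covers that case.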
#[]=(*_args) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 51
def []=(*_args); nil; end
#add_class(_k) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 52
def add_class(_k); self; end
#at(_selector) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 92
def at(_selector); nil; end
#at_css(_selector) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 91
def at_css(_selector); nil; end
#at_xpath(*_args) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 95
def at_xpath(*_args); nil; end
#attribute(_name) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 74
def attribute(_name); nil; end
#attribute_nodes ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 73
def attribute_nodes; []; end
#attributes ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 72
def attributes; {}; end
#cdata? ⇒ Boolean
# File 'lib/scrapetor/text_node.rb', line 36
def cdata?; false; end
#children ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 70
def children; []; end
#classes ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 77
def classes; []; end
#comment? ⇒ Boolean
# File 'lib/scrapetor/text_node.rb', line 34
def comment?; false; end
#content=(_v) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 50
def content=(_v); _v; end
#css(_selector) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 90
def css(_selector); []; end
#document? ⇒ Boolean
# File 'lib/scrapetor/text_node.rb', line 35
def document?; false; end
#element? ⇒ Boolean
# File 'lib/scrapetor/text_node.rb', line 32
def element?; false; end
#element_children ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 71
def element_children; []; end
#get ⇒ Object
Parsel-style accessors.
# File 'lib/scrapetor/text_node.rb', line 26
def get; String.new(self); end
#getall ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 27
def getall; [String.new(self)]; end
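Both accessors build their result with `String.new(self)`, so they hand back plain String copies and not the subclass. A short sketch, using a hypothetical `TextNodeSketch` stand-in with the two method bodies copied from the listings above:

```ruby
# Hypothetical stand-in with the get / getall bodies from the listings.
class TextNodeSketch < String
  def get; String.new(self); end
  def getall; [String.new(self)]; end
end

node = TextNodeSketch.new("42 items")

value = node.get
p value.class # plain String, not the subclass
p node.getall

# The copy is independent: mutating it leaves the node untouched.
value << " in stock"
p value
p node
```

Returning copies means downstream code can mutate or freeze the extracted value without affecting the node it came from.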
#has_class?(_klass) ⇒ Boolean
# File 'lib/scrapetor/text_node.rb', line 78
def has_class?(_klass); false; end
#inner_html=(_v) ⇒ Object
No-op mutation API. Heterogeneous selectors like `.foo > ::text, .bar` can hand a TextNode to a caller that assumes an Element interface (e.g. `node.inner_html = node.inner_html.gsub(…)`). The reassignment would crash on bare String; we accept the write silently so the subsequent `.text` read still works. The mutation is intentionally dropped: TextNode wraps frozen content of the original element.
# File 'lib/scrapetor/text_node.rb', line 49
def inner_html=(_v); _v; end
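The read-modify-write pattern mentioned above can be sketched like this. `TextNodeSketch` is a hypothetical stand-in that copies the `inner_html`, `inner_html=` and `text` bodies from the listings on this page.

```ruby
# Hypothetical stand-in with method bodies copied from the listings.
class TextNodeSketch < String
  def text; String.new(self); end
  def to_html; to_s; end
  alias inner_html to_html
  def inner_html=(_v); _v; end
end

node = TextNodeSketch.new("Price: 10 EUR")

# Caller written against an Element interface.
node.inner_html = node.inner_html.gsub("EUR", "USD")

# The write was accepted but dropped, so the text is unchanged.
puts node.text # prints "Price: 10 EUR"
```

Code that relies on the rewritten markup would need to keep the `gsub` result in its own variable, since the node does not store it.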
#inspect ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 97
def inspect
  "#<Scrapetor::TextNode #{super}>"
end
#keys ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 75
def keys; []; end
#name ⇒ Object
Node-shape predicates so duck-typing checks (`n.element?`, `n.text?`, `n.name == "#text"`) don't blow up.
# File 'lib/scrapetor/text_node.rb', line 31
def name; "#text"; end
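A sketch of the kind of duck-typing dispatch these predicates are meant to survive. `TextNodeSketch` copies the predicate bodies from the listings on this page; `FakeSpan` is a hypothetical element type invented for the example.

```ruby
# Hypothetical stand-in with predicate bodies copied from the listings.
class TextNodeSketch < String
  def name; "#text"; end
  def element?; false; end
  def text?; true; end
  def text; String.new(self); end
end

# Hypothetical element type for the mixed result list.
FakeSpan = Struct.new(:name, :text) do
  def element?; true; end
  def text?; false; end
end

results = [FakeSpan.new("span", "inside span"), TextNodeSketch.new("loose text")]

# Branch on node shape without checking classes.
labels = results.map do |n|
  if n.element?
    "<#{n.name}> #{n.text}"
  elsif n.text? && n.name == "#text"
    "text: #{n.text}"
  end
end

puts labels
```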
#next_element_sibling ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 68
def next_element_sibling; nil; end
#next_sibling ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 66
def next_sibling; nil; end
#parent ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 65
def parent; @parent_node; end
#previous_element_sibling ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 69
def previous_element_sibling; nil; end
#previous_sibling ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 67
def previous_sibling; nil; end
#remove ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 54
def remove; self; end
#remove_class(*_) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 53
def remove_class(*_); self; end
#search(_selector) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 93
def search(_selector); []; end
#text ⇒ Object Also known as: inner_text, content
# File 'lib/scrapetor/text_node.rb', line 21
def text; String.new(self); end
#text? ⇒ Boolean
# File 'lib/scrapetor/text_node.rb', line 33
def text?; true; end
#to_html ⇒ Object Also known as: outer_html, inner_html
# File 'lib/scrapetor/text_node.rb', line 38
def to_html; self.to_s; end
#unlink ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 55
def unlink; self; end
#values ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 76
def values; []; end
#xpath(*_args) ⇒ Object
# File 'lib/scrapetor/text_node.rb', line 94
def xpath(*_args); []; end
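Taken together, the traversal listings above treat a TextNode as a leaf: collection lookups return an empty Array and single-result lookups return nil. A minimal sketch, using a hypothetical `TextNodeSketch` stand-in with those bodies copied in:

```ruby
# Hypothetical stand-in with the traversal bodies from the listings.
class TextNodeSketch < String
  def children; []; end
  def css(_selector); []; end
  def search(_selector); []; end
  def xpath(*_args); []; end
  def at(_selector); nil; end
  def at_css(_selector); nil; end
  def at_xpath(*_args); nil; end
end

node = TextNodeSketch.new("leaf text")

# Collection-returning lookups yield empty arrays, so iteration is a no-op.
node.css("a").each { |a| puts a }
p node.xpath("//a", "//b")
p node.children

# Single-result lookups return nil, so callers still need a nil guard.
link = node.at_css("a")
p link
puts(link ? "found" : "no link under a text node")
```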