Class: Browsable::HtmlExtractor
- Inherits:
- Object (hierarchy: Object > Browsable::HtmlExtractor)
- Defined in:
- lib/browsable/html_extractor.rb
Overview
Pure-Ruby parser for a rendered HTML response. Walks the document for asset references (`<link rel="stylesheet">`, `<script src>`) and inline CSS/JS blocks, then asks the configured AssetResolver to translate each external URL into an on-disk path.
This is the only HTML work the runtime middleware performs per request. No analysis happens here; that is the TestReport's job, which runs at the end of the suite.
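The URL-to-path translation described above could look roughly like the following. This is a minimal sketch, not the library's AssetResolver: the class name `SketchAssetResolver`, its `#resolve` method, and the `/srv/app/public` root are all assumptions made for illustration, and it assumes POSIX-style paths.

```ruby
require "uri"

# Hypothetical resolver. The real AssetResolver interface is not shown on this
# page, so the class name, #resolve, and the root directory are assumptions.
class SketchAssetResolver
  def initialize(root)
    @root = File.expand_path(root)
  end

  # Maps an asset URL to a path under the root directory. Returns nil when the
  # URL has no usable path, cannot be parsed, or would escape the root.
  def resolve(url)
    path = URI.parse(url).path
    return nil if path.nil? || path.empty? || path == "/"

    candidate = File.expand_path(File.join(@root, path))
    candidate.start_with?(@root + File::SEPARATOR) ? candidate : nil
  rescue URI::InvalidURIError
    nil
  end
end

resolver = SketchAssetResolver.new("/srv/app/public")
resolved_css = resolver.resolve("/assets/application.css?v=3")
resolved_cdn = resolver.resolve("https://cdn.example.com/js/app.js")
escaped = resolver.resolve("/../etc/passwd")
```

In this sketch the query string is dropped and only the URL path is used, so fingerprinted or cache-busted URLs map to the same file. A real resolver may well consult an asset manifest instead.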
Defined Under Namespace
Classes: AssetRef, Extraction, InlineBlock
Constant Summary
- EMPTY = Extraction.new(asset_paths: [], inline_blocks: []).freeze
Instance Attribute Summary
-
#asset_resolver ⇒ Object
readonly
Returns the value of attribute asset_resolver.
-
#html ⇒ Object
readonly
Returns the value of attribute html.
Class Method Summary
-
.extract(html, asset_resolver: nil) ⇒ Object
Convenience entry point used by the middleware, so that a single call at the call site replaces the `new(…)` and `#run` pair.
Instance Method Summary
-
#initialize(html, asset_resolver: nil) ⇒ HtmlExtractor
constructor
A new instance of HtmlExtractor.
- #run ⇒ Object
Constructor Details
#initialize(html, asset_resolver: nil) ⇒ HtmlExtractor
Returns a new instance of HtmlExtractor.
    # File 'lib/browsable/html_extractor.rb', line 35
    def initialize(html, asset_resolver: nil)
      @html = html.to_s
      @asset_resolver = asset_resolver
    end
Instance Attribute Details
#asset_resolver ⇒ Object (readonly)
Returns the value of attribute asset_resolver.
    # File 'lib/browsable/html_extractor.rb', line 33
    def asset_resolver
      @asset_resolver
    end
#html ⇒ Object (readonly)
Returns the value of attribute html.
    # File 'lib/browsable/html_extractor.rb', line 33
    def html
      @html
    end
Class Method Details
.extract(html, asset_resolver: nil) ⇒ Object
Convenience entry point used by the middleware, so that a single call at the call site replaces the `new(…)` and `#run` pair.
    # File 'lib/browsable/html_extractor.rb', line 42
    def self.extract(html, asset_resolver: nil)
      new(html, asset_resolver: asset_resolver).run
    end
Instance Method Details
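#run ⇒ Object
Parses the HTML with Nokogiri::HTML5 and returns an Extraction of asset paths and inline blocks. Returns EMPTY when the input is blank or when any StandardError is raised during parsing or extraction.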
    # File 'lib/browsable/html_extractor.rb', line 46
    def run
      return EMPTY if html.strip.empty?

      doc = Nokogiri::HTML5.parse(html)
      Extraction.new(
        asset_paths: extract_assets(doc),
        inline_blocks: extract_inline_blocks(doc)
      )
    rescue StandardError
      EMPTY
    end
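The fallback behaviour of `#run` (blank input or any StandardError yields the shared EMPTY value) can be tried in isolation with a stdlib-only sketch. Everything here is a stand-in: the `ExtractionSketch` module, the Struct-based `Extraction`, and the `safe_extract` helper are hypothetical names, and the block passed to the helper takes the place of the Nokogiri parsing step.

```ruby
module ExtractionSketch
  # Stand-in for the real Extraction class, whose definition is not shown here.
  Extraction = Struct.new(:asset_paths, :inline_blocks, keyword_init: true)
  EMPTY = Extraction.new(asset_paths: [], inline_blocks: []).freeze

  # Hypothetical helper: runs the given parsing block, falling back to EMPTY
  # for blank input or when the block raises a StandardError.
  def self.safe_extract(html)
    text = html.to_s
    return EMPTY if text.strip.empty?

    yield text
  rescue StandardError
    EMPTY
  end
end

# Blank input short-circuits before the block is ever called.
blank = ExtractionSketch.safe_extract("   ") { |_| raise "never reached" }

# A failing parser is swallowed and replaced by the shared sentinel.
failed = ExtractionSketch.safe_extract("<p>") { |_| raise ArgumentError, "parser blew up" }

# A successful block returns its own Extraction.
parsed = ExtractionSketch.safe_extract("<p>hi</p>") do |text|
  ExtractionSketch::Extraction.new(asset_paths: [text.length.to_s], inline_blocks: [])
end
```

One detail worth keeping in mind with this pattern: `freeze` on a Struct is shallow, so in the sketch the arrays inside EMPTY remain mutable unless they are frozen separately. Callers that receive a shared sentinel should treat it as read-only.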