Class: SourceMonitor::Scrapers::Base
- Inherits: Object
  - Object
  - SourceMonitor::Scrapers::Base
- Defined in: lib/source_monitor/scrapers/base.rb
Overview
Base class for content scrapers used by the engine.
Adapter Contract
Subclasses must implement #call and return a Result object describing the outcome of a scrape attempt. Implementations receive an item, the owning source, and a normalized settings hash that merges default adapter settings, source-level overrides, and per-invocation overrides. All adapters should remain stateless and thread-safe, relying on injected collaborators (e.g. HTTP clients) instead of global configuration.
Adapters should:
- Perform any outbound HTTP work using the provided http client.
- Populate the Result with :html and :content payloads when successful.
- Use :status to communicate :success, :partial, or :failed outcomes.
- Capture additional diagnostics (headers, timings, etc.) in :metadata.
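The contract above can be sketched as follows. This is a minimal, self-contained illustration rather than the engine's own code: ContractBase, the ScrapeResult struct and its fields, PlainTextScraper, the FakeHTTP client with its get method, and the url attribute on the item are all assumptions introduced so the example runs on its own.

```ruby
# Stand-ins for the real Base and Result classes (shapes are assumed).
ScrapeResult = Struct.new(:status, :html, :content, :metadata, keyword_init: true)

class ContractBase
  def self.call(item:, source:, settings: nil, http:)
    new(item: item, source: source, settings: settings, http: http).call
  end

  def initialize(item:, source:, settings: nil, http:)
    @item = item
    @source = source
    @http = http
    @settings = settings || {}
  end

  def call
    raise NotImplementedError, "#{self.class.name} must implement #call"
  end
end

# Hypothetical adapter: fetches the item's URL and strips tags to build :content.
class PlainTextScraper < ContractBase
  def call
    response = @http.get(@item.url) # all outbound HTTP goes through the injected client
    metadata = { http_status: response.status }
    return ScrapeResult.new(status: :failed, metadata: metadata) unless response.status == 200

    html = response.body.to_s
    text = html.gsub(/<[^>]+>/, " ").squeeze(" ").strip
    status = text.empty? ? :partial : :success
    ScrapeResult.new(status: status, html: html, content: text, metadata: metadata)
  end
end

# Hypothetical in-memory HTTP client, so the adapter needs no network access.
FakeResponse = Struct.new(:status, :body)

class FakeHTTP
  def initialize(pages)
    @pages = pages
  end

  def get(url)
    @pages.fetch(url) { FakeResponse.new(404, "") }
  end
end

SketchItem = Struct.new(:url)
http = FakeHTTP.new("https://example.test/a" => FakeResponse.new(200, "<p>Hello <b>world</b></p>"))

ok = PlainTextScraper.call(item: SketchItem.new("https://example.test/a"), source: nil, http: http)
missing = PlainTextScraper.call(item: SketchItem.new("https://example.test/missing"), source: nil, http: http)
```

Because the adapter keeps no state beyond what the constructor receives and talks only to the injected client, the same class can be exercised with a fake client in tests and a real one in production.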
Direct Known Subclasses
Defined Under Namespace
Classes: Result
Class Method Summary
- .adapter_name ⇒ Object
- .call(item:, source:, settings: nil, http: SourceMonitor::HTTP) ⇒ Object
- .default_settings ⇒ Object
Instance Method Summary
- #call ⇒ Object
- #initialize(item:, source:, settings: nil, http: SourceMonitor::HTTP) ⇒ Base (constructor): A new instance of Base.
Constructor Details
#initialize(item:, source:, settings: nil, http: SourceMonitor::HTTP) ⇒ Base
Returns a new instance of Base.
# File 'lib/source_monitor/scrapers/base.rb', line 41
def initialize(item:, source:, settings: nil, http: SourceMonitor::HTTP)
  @item = item
  @source = source
  @http = http
  @settings = build_settings(settings)
end
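The constructor delegates to build_settings, whose body is not shown on this page. The overview says the resulting hash merges default adapter settings, source-level overrides, and per-invocation overrides. One plausible precedence is sketched below; the merge_settings helper, the string-key normalization, and the shallow merge are assumptions for illustration, not the engine's implementation.

```ruby
# Hypothetical three-layer merge; later layers take precedence over earlier ones.
def merge_settings(defaults, source_overrides, invocation_overrides)
  [defaults, source_overrides, invocation_overrides]
    .map { |layer| (layer || {}).transform_keys(&:to_s) } # tolerate nil, normalize keys
    .reduce({}) { |merged, layer| merged.merge(layer) }   # shallow merge, last one wins
end

settings = merge_settings(
  { timeout: 10, selector: "article" }, # adapter defaults
  { "timeout" => 30 },                  # source-level override
  nil                                   # no per-invocation override
)
# settings == { "timeout" => 30, "selector" => "article" }
```

A shallow merge keeps the sketch short. Nested settings would need a recursive merge if inner keys should be overridden individually.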
Class Method Details
.adapter_name ⇒ Object
# File 'lib/source_monitor/scrapers/base.rb', line 32
def adapter_name
  name.demodulize.sub(/Scraper\z/, "").underscore
end
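To show what adapter_name yields, the sketch below approximates the chain in plain Ruby. demodulize and underscore come from ActiveSupport; the stand-in here covers only simple class names, and the class names passed in (other than Base) are hypothetical.

```ruby
# Plain-Ruby approximation of name.demodulize.sub(/Scraper\z/, "").underscore.
def sketch_adapter_name(class_name)
  short = class_name.split("::").last.to_s # drop the namespace (rough demodulize)
  short = short.sub(/Scraper\z/, "")       # drop a trailing "Scraper"
  # Rough underscore: insert "_" at case boundaries, then downcase.
  short
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')
    .downcase
end

sketch_adapter_name("SourceMonitor::Scrapers::ReadabilityScraper") # => "readability"
sketch_adapter_name("SourceMonitor::Scrapers::HtmlMetaScraper")    # => "html_meta"
sketch_adapter_name("SourceMonitor::Scrapers::Base")               # => "base"
```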
.call(item:, source:, settings: nil, http: SourceMonitor::HTTP) ⇒ Object
# File 'lib/source_monitor/scrapers/base.rb', line 28
def call(item:, source:, settings: nil, http: SourceMonitor::HTTP)
  new(item: item, source: source, settings: settings, http: http).call
end
.default_settings ⇒ Object
# File 'lib/source_monitor/scrapers/base.rb', line 36
def default_settings
  {}
end
Instance Method Details
#call ⇒ Object
# File 'lib/source_monitor/scrapers/base.rb', line 48
def call
  raise NotImplementedError, "#{self.class.name} must implement #call"
end
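A caller that wants to handle an adapter that forgot to override #call has to rescue NotImplementedError by name: in Ruby it descends from ScriptError rather than StandardError, so a bare rescue does not catch it. The sketch below uses stand-in classes (AbstractScraper and IncompleteScraper are hypothetical) that mirror the method above.

```ruby
# Stand-in base class mirroring the abstract #call shown above.
class AbstractScraper
  def call
    raise NotImplementedError, "#{self.class.name} must implement #call"
  end
end

# Hypothetical subclass that does not override #call.
class IncompleteScraper < AbstractScraper; end

message =
  begin
    IncompleteScraper.new.call
  rescue NotImplementedError => e # must be named; a bare `rescue` would miss it
    e.message
  end
# message == "IncompleteScraper must implement #call"
```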