Class: SourceMonitor::Scrapers::Base

Inherits:
Object
Defined in:
lib/source_monitor/scrapers/base.rb

Overview

Base class for content scrapers used by the engine.

Adapter Contract

Subclasses must implement #call and return a Result object describing the outcome of a scrape attempt. Implementations receive an item, the owning source, and a normalized settings hash that merges default adapter settings, source-level overrides, and per-invocation overrides. All adapters should remain stateless and thread-safe, relying on injected collaborators (e.g. HTTP clients) instead of global configuration.

Adapters should:

  • Perform any outbound HTTP work using the provided http client.

  • Populate the Result with :html and :content payloads when successful.

  • Use :status to communicate :success, :partial, or :failed outcomes.

  • Capture additional diagnostics (headers, timings, etc.) in :metadata.
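The contract above can be sketched as a small adapter. This is a minimal illustration under stated assumptions, not the engine's own code: SketchBase and SketchResult are stand-ins for Base and Result (whose real attributes are not shown on this page), PlainHtmlScraper and FakeHttp are hypothetical, and the client interface (a get method returning a hash with :body and :headers) is assumed purely so the example runs without network access.

```ruby
# Stand-ins for illustration only; the real Base and Result live in
# lib/source_monitor/scrapers/base.rb and may differ in shape.
SketchResult = Struct.new(:status, :html, :content, :metadata, keyword_init: true)

class SketchBase
  def self.call(item:, source:, settings: nil, http:)
    new(item: item, source: source, settings: settings, http: http).call
  end

  def initialize(item:, source:, settings: nil, http:)
    @item = item
    @source = source
    @http = http
    @settings = settings || {}
  end
end

# Hypothetical adapter: fetches the item's URL and returns the raw body.
class PlainHtmlScraper < SketchBase
  def call
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    response = @http.get(@item.fetch(:url)) # assumed client interface
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

    body = response[:body].to_s
    SketchResult.new(
      status: body.empty? ? :partial : :success,
      html: body,
      # Crude tag stripping, just enough to produce a :content payload.
      content: body.gsub(/<[^>]+>/, " ").squeeze(" ").strip,
      metadata: { headers: response[:headers], duration: elapsed }
    )
  rescue StandardError => e
    # Report failures through :status instead of letting them propagate.
    SketchResult.new(status: :failed, html: nil, content: nil,
                     metadata: { error: e.message })
  end
end

# Fake HTTP client so the sketch runs offline.
class FakeHttp
  def get(_url)
    { body: "<p>Hello <b>world</b></p>",
      headers: { "content-type" => "text/html" } }
  end
end

result = PlainHtmlScraper.call(
  item: { url: "https://example.com/post" },
  source: nil,
  http: FakeHttp.new
)
puts result.status   # prints "success"
puts result.content  # prints "Hello world"
```

Because the client is injected, the adapter holds no global state, and a test can swap in a fake client as shown.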

Direct Known Subclasses

Readability

Defined Under Namespace

Classes: Result

Constructor Details

#initialize(item:, source:, settings: nil, http: SourceMonitor::HTTP) ⇒ Base

Returns a new instance of Base.



# File 'lib/source_monitor/scrapers/base.rb', line 41

def initialize(item:, source:, settings: nil, http: SourceMonitor::HTTP)
  @item = item
  @source = source
  @http = http
  @settings = build_settings(settings)
end
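The private build_settings helper is not shown on this page. Based only on the Overview's description (adapter defaults, then source-level overrides, then per-invocation overrides), the layering might look roughly like the sketch below. The name merge_settings, its arguments, and the choice of string keys as the "normalized" form are all assumptions made for illustration.

```ruby
# Hypothetical helper; the real build_settings is private and may differ.
# Later layers win over earlier ones; nil layers are skipped.
def merge_settings(defaults, source_overrides, invocation_overrides)
  [defaults, source_overrides, invocation_overrides]
    .compact
    .map { |layer| layer.transform_keys(&:to_s) } # assumed normalization
    .reduce({}) { |merged, layer| merged.merge(layer) }
end

settings = merge_settings(
  { timeout: 10, selector: "article" }, # adapter defaults
  { "timeout" => 30 },                  # source-level override
  nil                                   # no per-invocation override
)
p settings
```

In this sketch the source-level timeout of 30 replaces the default of 10, while selector keeps its default because no later layer overrides it.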

Class Method Details

.adapter_name ⇒ Object



# File 'lib/source_monitor/scrapers/base.rb', line 32

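# Derives a short adapter name from the class name: strips the namespace
# and any trailing "Scraper", then converts the remainder to snake_case.
def adapter_name
  name.demodulize.sub(/Scraper\z/, "").underscore
end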
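To make the name derivation concrete, here is a plain-Ruby approximation of the same chain. It is a sketch only: the real method relies on ActiveSupport's demodulize and underscore, whose acronym handling is omitted here, and FullTextScraper is a hypothetical class name used to show the suffix being stripped.

```ruby
# Plain-Ruby approximation of ActiveSupport's demodulize and underscore,
# sufficient for simple CamelCase names (acronym handling is omitted).
def sketch_adapter_name(class_name)
  class_name
    .split("::").last                  # demodulize: drop the namespace
    .sub(/Scraper\z/, "")              # drop a trailing "Scraper"
    .gsub(/([a-z\d])([A-Z])/, '\1_\2') # insert "_" at word boundaries
    .downcase
end

puts sketch_adapter_name("SourceMonitor::Scrapers::Readability")     # prints "readability"
puts sketch_adapter_name("SourceMonitor::Scrapers::FullTextScraper") # prints "full_text"
```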

.call(item:, source:, settings: nil, http: SourceMonitor::HTTP) ⇒ Object



# File 'lib/source_monitor/scrapers/base.rb', line 28

# Convenience entry point: builds an adapter instance and invokes #call on it.
def call(item:, source:, settings: nil, http: SourceMonitor::HTTP)
  new(item: item, source: source, settings: settings, http: http).call
end

.default_settings ⇒ Object



# File 'lib/source_monitor/scrapers/base.rb', line 36

# Adapter-level default settings; empty in Base.
def default_settings
  {}
end

Instance Method Details

#call ⇒ Object

Raises:

  • (NotImplementedError)


# File 'lib/source_monitor/scrapers/base.rb', line 48

def call
  raise NotImplementedError, "#{self.class.name} must implement #call"
end
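One practical consequence worth illustrating: in Ruby, NotImplementedError descends from ScriptError rather than StandardError, so a bare rescue will not catch it. The sketch below uses stand-in classes (AbstractSketch and IncompleteScraper are hypothetical) to show what a subclass that forgets to define #call would raise.

```ruby
# Stand-in for Base, reproducing only the abstract #call shown above.
class AbstractSketch
  def call
    raise NotImplementedError, "#{self.class.name} must implement #call"
  end
end

# Hypothetical subclass that forgot to override #call.
class IncompleteScraper < AbstractSketch; end

begin
  IncompleteScraper.new.call
rescue NotImplementedError => e # must be named explicitly; bare rescue misses it
  message = e.message
end

puts message # prints "IncompleteScraper must implement #call"
puts NotImplementedError.ancestors.include?(StandardError) # prints "false"
```

Callers that wrap adapters in a generic rescue StandardError block would therefore still see this error propagate, which helps surface an incomplete adapter early.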