Class: BrandLogo::Strategies::ScrapingStrategy

Inherits:
BaseStrategy
  • Object
Extended by:
T::Sig
Defined in:
lib/brand_logo/strategies/scraping_strategy.rb

Overview

Fetches brand logos by scraping the target website’s HTML. Tries HTTPS (with and without www), then falls back to HTTP. Delegates HTML fetching, parsing, and image analysis to injected dependencies.
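The fallback order described above can be sketched as a list of candidate base URLs. This is only an illustration: `candidate_base_urls` is a hypothetical helper, not part of the gem, and the actual logic lives in `fetch_html_with_base_url`, whose body is not shown on this page. The relative order of the bare and `www` HTTPS variants, and the choice to try only the bare host over HTTP, are assumptions.

```ruby
# Hypothetical helper, not part of the gem: lists candidate base URLs in
# the order the overview describes (HTTPS first, HTTP as a fallback).
# The order of the bare and "www." HTTPS variants is an assumption.
def candidate_base_urls(domain)
  host = domain.sub(/\Awww\./, "") # normalize so "www." is never doubled
  [
    "https://#{host}",
    "https://www.#{host}",
    "http://#{host}"
  ]
end

candidate_base_urls("example.com").each { |url| puts url }
```

A caller would presumably stop at the first candidate that returns HTML and keep that URL as the base for resolving relative icon paths.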

Constant Summary

Constants inherited from BaseStrategy

BaseStrategy::UNKNOWN_DIMENSION_SCORE

Instance Method Summary

Methods inherited from BaseStrategy

#fetch

Constructor Details

#initialize(config:, http_client:, html_parser:, image_analyzer:) ⇒ ScrapingStrategy

Returns a new instance of ScrapingStrategy.



# File 'lib/brand_logo/strategies/scraping_strategy.rb', line 22

def initialize(config:, http_client:, html_parser:, image_analyzer:)
  super(config: config)
  @http_client    = T.let(http_client, HttpClient)
  @html_parser    = T.let(html_parser, HtmlParser)
  @image_analyzer = T.let(image_analyzer, ImageAnalyzer)
end

Instance Method Details

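#fetch_all(domain) ⇒ Object

Fetches the HTML for the given domain and returns the result of Scraping::IconFinder#find for it. Returns an empty array when no HTML could be fetched or when any StandardError is raised; errors are logged rather than propagated.
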


# File 'lib/brand_logo/strategies/scraping_strategy.rb', line 30

def fetch_all(domain)
  html, base_url = fetch_html_with_base_url(domain)
  return [] unless html

  dimensions_extractor    = Scraping::DimensionsExtractor.new(image_analyzer: @image_analyzer)
  default_favicon_checker = Scraping::DefaultFaviconChecker.new(http_client: @http_client)

  finder = Scraping::IconFinder.new(
    doc: @html_parser.parse(html),
    base_url: base_url,
    dimensions_extractor: dimensions_extractor,
    default_favicon_checker: default_favicon_checker
  )
  finder.find
rescue StandardError => e
  BrandLogo::Logging.logger.error("ScrapingStrategy error for #{domain}: #{e.message}")
  []
end
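
As the source above shows, `fetch_all` returns an empty array both when no HTML is available and when any `StandardError` is raised, so callers never need their own `rescue`. The following self-contained sketch mirrors only that control flow. `SketchStrategy`, its `fetcher:`, `finder:`, and `logger:` parameters, and the lambdas passed to it are hypothetical stand-ins for illustration, not the gem's classes or API.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in that mirrors only the control flow of fetch_all;
# it is not the gem's ScrapingStrategy.
class SketchStrategy
  def initialize(fetcher:, finder:, logger:)
    @fetcher = fetcher # callable: domain -> [html, base_url], or nil
    @finder  = finder  # callable: (html, base_url) -> Array of icons
    @logger  = logger
  end

  def fetch_all(domain)
    html, base_url = @fetcher.call(domain)
    return [] unless html # nothing fetched: empty result, no error

    @finder.call(html, base_url)
  rescue StandardError => e
    # Any failure is logged and converted into an empty result.
    @logger.error("SketchStrategy error for #{domain}: #{e.message}")
    []
  end
end

LOG_IO = StringIO.new
logger = Logger.new(LOG_IO)
finder = ->(_html, base_url) { ["#{base_url}/favicon.ico"] }

OK_RESULT = SketchStrategy.new(
  fetcher: ->(domain) { ["<html></html>", "https://#{domain}"] },
  finder: finder,
  logger: logger
).fetch_all("example.com")

EMPTY_RESULT = SketchStrategy.new(
  fetcher: ->(_domain) { nil },
  finder: finder,
  logger: logger
).fetch_all("example.com")

FAILED_RESULT = SketchStrategy.new(
  fetcher: ->(_domain) { raise IOError, "connection reset" },
  finder: finder,
  logger: logger
).fetch_all("example.com")
```

One consequence of this design is that an empty array does not tell the caller whether the site has no icons, could not be reached, or triggered an error; only the log distinguishes the last case.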