Class: Html2rss::HtmlExtractor::Extractors::Media

Inherits:
Object
  • Object
Defined in:
lib/html2rss/html_extractor/enclosure_extractor.rb

Overview

Extracts media enclosures from the src attributes of source elements nested in video or audio tags, and of audio tags themselves.

Class Method Summary

Class Method Details

.call(article_tag, base_url:) ⇒ Array<Hash{Symbol => Object}>

Returns media enclosure hashes.

Parameters:

  • article_tag (Nokogiri::XML::Element)

    article container node

  • base_url (String, Html2rss::Url)

    base URL for relative media sources

Returns:

  • (Array<Hash{Symbol => Object}>)

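    media enclosure hashes, each with a :url key (the source resolved against base_url) and a :type key (the element's type attribute, or nil when absent)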
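
To illustrate what "base URL for relative media sources" means in practice, here is a minimal sketch using Ruby's standard `URI.join`. This is an assumption for illustration only: the page does not show how `Url.from_relative` is implemented, and the example URLs are hypothetical.

```ruby
require 'uri'

# Hypothetical base URL of the page the article was scraped from.
base_url = 'https://example.com/blog/post/'

# A path-relative source resolves beneath the base URL's directory.
relative = URI.join(base_url, 'media/episode.mp3').to_s

# A root-relative source keeps only the scheme and host of the base URL.
root_path = URI.join(base_url, '/media/episode.mp3').to_s

# An absolute source is returned unchanged.
absolute = URI.join(base_url, 'https://cdn.example.org/episode.mp3').to_s
```

Note that a trailing slash on the base URL matters for path-relative sources: without it, the last path segment is replaced rather than extended.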



# File 'lib/html2rss/html_extractor/enclosure_extractor.rb', line 49

def self.call(article_tag, base_url:)
  article_tag.css('video source[src], audio source[src], audio[src]').filter_map do |element|
    src = element['src'].to_s
    next if src.empty?

    {
      url: Url.from_relative(src, base_url),
      type: element['type']
    }
  end
end
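
The skip-and-collect pattern in the method above can be tried without Nokogiri. The following is a rough sketch under stated assumptions, not the library's code: plain hashes stand in for the matched elements, `extract_enclosures` is a hypothetical name, and `URI.join` is used in place of `Url.from_relative`, whose behavior is not shown on this page.

```ruby
require 'uri'

# Hypothetical stand-ins for matched elements: hashes keyed by attribute name.
ELEMENTS = [
  { 'src' => '/media/episode.mp3', 'type' => 'audio/mpeg' },
  { 'src' => '', 'type' => 'video/mp4' }, # empty src, so it is skipped
  { 'src' => 'clip.webm' }                # no type attribute, so type is nil
].freeze

# Assumption: URI.join approximates Url.from_relative for this illustration.
def extract_enclosures(elements, base_url:)
  elements.filter_map do |element|
    src = element['src'].to_s # to_s turns a missing attribute (nil) into ''
    next if src.empty?        # filter_map drops the nil returned by next

    { url: URI.join(base_url, src).to_s, type: element['type'] }
  end
end

enclosures = extract_enclosures(ELEMENTS, base_url: 'https://example.com/shows/')
```

Because `filter_map` discards falsy block results, the bare `next` is enough to omit an element; no separate `compact` or `reject` pass is needed.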