Class: Coelacanth::Extractor::Normalizer

Inherits:
Object
Defined in:
lib/coelacanth/extractor/normalizer.rb

Overview

Sanitizes HTML and prepares an Oga document.

Constant Summary

REMOVABLE_SELECTORS = %w[style noscript iframe form nav].freeze

Instance Method Summary

Instance Method Details

#call(html:, base_url: nil) ⇒ Object



# File 'lib/coelacanth/extractor/normalizer.rb', line 13

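def call(html:, base_url: nil)
  # Parse the raw HTML string into an Oga document.
  document = Oga.parse_html(html)
  # Strip noisy elements (private helper, not shown on this page).
  remove_noise(document)
  # Normalize image elements, using base_url when one is given
  # (private helper, not shown on this page).
  normalize_images(document, base_url)
  # Return the cleaned document to the caller.
  document
end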
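The body of `normalize_images` is not shown here, so its exact behavior is unknown. One plausible reading of the `base_url` argument is that relative image sources are resolved against it. As a sketch under that assumption only, using Ruby's standard `URI.join`, the per-attribute step might look like this. The helper name `absolutize_src` is hypothetical.

```ruby
require "uri"

# Hypothetical helper (an assumption, not Coelacanth's actual code):
# resolve an image src against base_url, leaving it untouched when no
# base URL is given or when the value cannot be parsed as a URI.
def absolutize_src(src, base_url)
  return src if base_url.nil? || src.nil? || src.empty?

  URI.join(base_url, src).to_s
rescue URI::Error
  src
end

puts absolutize_src("/img/a.png", "https://example.com/posts/1")
# prints "https://example.com/img/a.png"
puts absolutize_src("b.png", "https://example.com/posts/1")
# prints "https://example.com/posts/b.png"
```

With `base_url: nil`, the default in `#call`, this sketch returns every `src` unchanged.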