Class: Rubino::Documents::Converters::Html

Inherits:
Object
Defined in:
lib/rubino/documents/converters/html.rb

Overview

Converts HTML / XHTML to Markdown via the shared HTML core (Documents::Html). Thin by design: it reads the file and hands the bytes to the core, which is the engine the other HTML-shaped converters reuse.

Instance Method Details

#accepts?(mime, path) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/rubino/documents/converters/html.rb', line 14

def accepts?(mime, path)
  m = mime.to_s
  return true if ["text/html", "application/xhtml+xml"].include?(m)

  %w[.html .htm .xhtml].include?(File.extname(path.to_s).downcase)
end
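
The two-step check above (MIME type first, file extension as fallback) can be exercised in isolation. The following is a minimal sketch that copies the predicate into a free-standing method; the name `accepts_html?` is hypothetical and exists only so the example runs without the Rubino library loaded.

```ruby
# Standalone copy of the predicate's logic, for illustration only.
# `accepts_html?` is a hypothetical name, not part of the library.
def accepts_html?(mime, path)
  m = mime.to_s
  # An exact MIME match accepts the file regardless of its name.
  return true if ["text/html", "application/xhtml+xml"].include?(m)

  # Otherwise fall back to a case-insensitive extension check.
  %w[.html .htm .xhtml].include?(File.extname(path.to_s).downcase)
end

by_mime      = accepts_html?("text/html", "download.bin")
by_extension = accepts_html?(nil, "INDEX.HTM")
with_params  = accepts_html?("text/html; charset=utf-8", "page")
unrelated    = accepts_html?("text/plain", "notes.txt")
```

Note that the MIME comparison is an exact string match, so a value that carries parameters (such as `; charset=utf-8`) only passes if the extension check also succeeds. Callers would presumably need to strip parameters beforehand if they want such values accepted.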

#available? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/rubino/documents/converters/html.rb', line 10

def available?
  true
end

#convert(path, budget = Limits.null_budget) ⇒ Object



# File 'lib/rubino/documents/converters/html.rb', line 21

def convert(path, budget = Limits.null_budget)
  raw = File.read(path, encoding: "bom|utf-8")
  # Enforce the decompressed-bytes ceiling BEFORE kramdown parses the
  # whole tree: a deeply-nested / huge HTML is the html-equivalent of an
  # expand bomb. Over the cap -> CapExceeded -> shell-hint.
  budget.add_bytes(raw.bytesize)
  Documents::Html.to_markdown(raw)
end
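
The ordering in the method above (read, charge the budget, only then parse) can be illustrated with a small stand-in budget. This is a sketch under stated assumptions: `SketchBudget`, `SketchCapExceeded`, `read_within_budget`, and `demo` are hypothetical names invented for the example, and the real budget object and its error class may behave differently. Only `add_bytes` and the `File.read` call come from the source.

```ruby
require "tempfile"

# Hypothetical stand-ins: the real budget and error classes are not
# shown on this page, so these only mimic the behaviour the comment describes.
class SketchCapExceeded < StandardError; end

class SketchBudget
  attr_reader :used

  def initialize(max_bytes)
    @max_bytes = max_bytes
    @used = 0
  end

  # Accumulate bytes and raise once the running total passes the ceiling.
  def add_bytes(count)
    @used += count
    return unless @used > @max_bytes

    raise SketchCapExceeded, "#{@used} bytes exceeds cap of #{@max_bytes}"
  end
end

# Same order of operations as #convert: read, charge the budget, then parse.
def read_within_budget(path, budget)
  raw = File.read(path, encoding: "bom|utf-8")
  budget.add_bytes(raw.bytesize)
  raw # the real method hands this to Documents::Html.to_markdown
end

# Write the HTML to a temporary file and run it through the sketch.
def demo(html, max_bytes)
  Tempfile.create(["sketch", ".html"]) do |file|
    file.write(html)
    file.flush
    read_within_budget(file.path, SketchBudget.new(max_bytes))
  end
end

small = demo("<p>hi</p>", 1024)

oversized =
  begin
    demo("<p>#{'x' * 2048}</p>", 1024)
  rescue SketchCapExceeded => e
    e
  end
```

Here `small` holds the file contents because the input fits under the cap, while `oversized` holds the raised error. Because the charge happens before any parsing, an oversized input is rejected after a single file read and never reaches the Markdown conversion step.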