Class: Rubino::Documents::Converters::Pptx

Inherits:
Object
Defined in:
lib/rubino/documents/converters/pptx.rb

Overview

PPTX -> Markdown via the `ruby_powerpoint` gem (MIT, OPTIONAL). Each slide becomes a `## Slide N` heading; the slide’s text frames become paragraphs/bullets, and speaker notes go under a `>` block quote. The gem gives us text per slide (and notes); it does not preserve shape geometry, so we emit text in document order, which is good enough for an LLM to read.
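To make the output shape concrete, here is a minimal sketch of the layout described above. `render_slide` is a hypothetical stand-in, not the converter's actual `slide_markdown`, and it takes plain string arrays rather than gem objects, so the exact spacing and bullet handling are assumptions.

```ruby
# Hypothetical helper illustrating the described layout only.
# Assumption: paragraphs and notes arrive as plain arrays of strings.
def render_slide(number, paragraphs, notes = [])
  lines = ["## Slide #{number}"]

  # Slide text in document order, one Markdown paragraph per text frame.
  body = paragraphs.map(&:strip).reject(&:empty?)
  lines << "" << body.join("\n\n") unless body.empty?

  # Speaker notes go under a block quote.
  quoted = notes.map(&:strip).reject(&:empty?).map { |note| "> #{note}" }
  lines << "" << quoted.join("\n") unless quoted.empty?

  lines.join("\n")
end

puts render_slide(1, ["Quarterly results", "Revenue grew"], ["Pause here"])
```

A slide with no text and no notes reduces to its heading alone in this sketch.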

Constant Summary

MIMES =
%w[
  application/vnd.openxmlformats-officedocument.presentationml.presentation
].freeze

Instance Method Summary

Instance Method Details

#accepts?(mime, path) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/rubino/documents/converters/pptx.rb', line 23

def accepts?(mime, path)
  return true if MIMES.include?(mime.to_s)

  File.extname(path.to_s).downcase == ".pptx"
end
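
The check above accepts a file on either signal: a matching MIME type, or a `.pptx` extension in any letter case. The sketch below reproduces that logic in a standalone module so the behaviour can be tried without the gem; `AcceptsSketch` is a hypothetical name used only for illustration.

```ruby
# Standalone copy of the accepts? logic for experimentation.
# AcceptsSketch is a hypothetical module, not part of Rubino.
module AcceptsSketch
  MIMES = %w[
    application/vnd.openxmlformats-officedocument.presentationml.presentation
  ].freeze

  def self.accepts?(mime, path)
    # A known MIME type is sufficient on its own.
    return true if MIMES.include?(mime.to_s)

    # Otherwise fall back to a case-insensitive extension match.
    File.extname(path.to_s).downcase == ".pptx"
  end
end

puts AcceptsSketch.accepts?(nil, "Deck.PPTX")
puts AcceptsSketch.accepts?("application/octet-stream", "deck.ppt")
```

Because `mime.to_s` and `path.to_s` are used, a `nil` MIME type or a `Pathname` argument does not raise.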

#available? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/rubino/documents/converters/pptx.rb', line 16

def available?
  require "ruby_powerpoint"
  true
rescue LoadError
  false
end
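
The method above probes for an optional dependency by attempting the `require` and treating `LoadError` as "not installed". A generic version of the same pattern might look like the sketch below; `gem_available?` is a hypothetical helper, not part of the documented class.

```ruby
# Generic optional-dependency probe (hypothetical helper).
# Returns true when the library loads, false when it is not installed.
def gem_available?(name)
  require name
  true
rescue LoadError
  false
end

puts gem_available?("json")
puts gem_available?("surely_not_an_installed_library_xyz")
```

Only `LoadError` is rescued, so a library that is installed but fails while loading for another reason still surfaces its own exception.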

#convert(path, budget = Limits.null_budget) ⇒ Object



# File 'lib/rubino/documents/converters/pptx.rb', line 29

def convert(path, budget = Limits.null_budget)
  require "ruby_powerpoint"
  # PRE-OPEN guard against a slide/text zip-expand bomb (see Docx). Sum
  # EVERY entry under ppt/ -- including a bomb hidden at a nested/non-
  # standard path behind a .rels Target. `ppt/**` matches across `/`
  # (guard_zip! globs without FNM_PATHNAME) so a deep bomb is caught (#337).
  Limits.guard_zip!(path, budget, ["ppt/**"])
  ppt = RubyPowerpoint::Presentation.new(path)
  parts = ppt.slides.each_with_index.map do |slide, i|
    md = slide_markdown(slide, i + 1)
    budget.tick(bytes: md.to_s.bytesize)
    md
  end
  parts.compact.join("\n\n")
end
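
The per-slide `budget.tick(bytes: ...)` call is the only part of the budget interface visible on this page. The sketch below shows one way such a budget could behave, assuming it accumulates bytes and raises once a cap is exceeded; `ByteBudget`, its `Exceeded` error, and `join_parts` are hypothetical and may differ from what `Limits` actually provides.

```ruby
# Hypothetical byte budget; the real Limits budget may behave differently.
class ByteBudget
  class Exceeded < StandardError; end

  attr_reader :used

  def initialize(max_bytes)
    @max_bytes = max_bytes
    @used = 0
  end

  # Record emitted bytes and raise once the running total passes the cap.
  def tick(bytes:)
    @used += bytes
    raise Exceeded, "used #{@used} of #{@max_bytes} bytes" if @used > @max_bytes
  end
end

# Mirrors the map/tick/compact/join shape of the convert method.
def join_parts(parts, budget)
  kept = parts.map do |md|
    budget.tick(bytes: md.to_s.bytesize)
    md
  end
  kept.compact.join("\n\n")
end

puts join_parts(["## Slide 1", nil, "## Slide 2"], ByteBudget.new(100))
```

A `nil` part counts as zero bytes because of `to_s` and is then dropped by `compact`, which matches how the method above tolerates slides that produce no Markdown.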