Class: Rubino::Documents::Converters::Xlsx
- Inherits:
- Object
- Defined in:
- lib/rubino/documents/converters/xlsx.rb
Overview
XLSX (and ODS/legacy XLS where roo supports them) -> Markdown. Each sheet becomes a `## SheetName` heading followed by a GFM table emitted by the shared Table emitter. The `roo` gem (MIT) is OPTIONAL: #available? reports false when it can’t be required, so the registry never offers this converter on an install without roo – the caller then falls back to the shell-extraction hint.
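The output shape described above can be sketched in plain Ruby. This is a minimal illustration only: `sheet_markdown_sketch` is a hypothetical helper, not the shared Table emitter (which is not shown on this page), and dropping empty sheets is an assumption based on the `.compact` call in #convert.

```ruby
# Hypothetical helper: renders one sheet's rows as a "## Name" heading plus a
# GFM table. The first row is treated as the header (an assumption).
def sheet_markdown_sketch(name, rows)
  return nil if rows.empty? # assumed: empty sheets render nothing

  # Stringify cells and escape pipes so they cannot break the table.
  header, *body = rows.map { |row| row.map { |cell| cell.to_s.gsub("|") { "\\|" } } }
  lines = ["## #{name}", ""]
  lines << "| #{header.join(' | ')} |"
  lines << "| #{header.map { '---' }.join(' | ')} |"
  body.each { |row| lines << "| #{row.join(' | ')} |" }
  lines.join("\n")
end

sheets = {
  "Prices" => [["item", "cost"], ["tea", 3]],
  "Empty"  => []
}

# One block per sheet, separated by a blank line, as in #convert.
markdown = sheets.map { |name, rows| sheet_markdown_sketch(name, rows) }.compact.join("\n\n")
puts markdown
```

With the sample data only the `Prices` sheet contributes output, because the empty sheet yields `nil` and is compacted away.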
Constant Summary
- MIMES =
%w[ application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/vnd.oasis.opendocument.spreadsheet application/vnd.ms-excel ].freeze
- EXTS =
%w[.xlsx .ods .xls].freeze
- ODS_GLOBS =
OpenDocument (ODS) body globs: roo reads `content.xml` at the archive ROOT (and may touch other root *.xml like styles.xml/meta.xml) – NOT under xl/. Scoping the pre-open guard to xl/** alone let an ODS bomb sum to zero and slip through to inflation (#350), so the root XML read paths are added.
["content.xml", "*.xml"].freeze
- XLSX_GLOBS =
OOXML (xlsx) body parts live under xl/ (the glob matches across `/`; no FNM_PATHNAME).
["xl/**"].freeze
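The glob behaviour these constants rely on can be checked with `File.fnmatch` from Ruby's core library. The sketch below assumes the guard matches entry names with `File.fnmatch` and no flags; `body_entry?` is a hypothetical helper, and the real `guard_zip!` implementation is not shown on this page.

```ruby
ODS_GLOBS  = ["content.xml", "*.xml"].freeze
XLSX_GLOBS = ["xl/**"].freeze

# Hypothetical matcher. Without File::FNM_PATHNAME, `*` also matches `/`,
# so a glob reaches entries at any depth.
def body_entry?(name, globs)
  globs.any? { |glob| File.fnmatch(glob, name) }
end

deep_sheet = "xl/worksheets/sheet1.xml"

xlsx_hit     = body_entry?(deep_sheet, XLSX_GLOBS)              # true
ods_hit      = body_entry?("content.xml", ODS_GLOBS)            # true
ods_missed   = body_entry?("content.xml", XLSX_GLOBS)           # false: the gap behind #350
nested_xml   = body_entry?("META-INF/manifest.xml", ODS_GLOBS)  # true: `*` crosses `/`
non_xml      = body_entry?("mimetype", ODS_GLOBS)               # false
# For contrast: with FNM_PATHNAME a single `*` stops at `/`.
pathname_hit = File.fnmatch("xl/*", deep_sheet, File::FNM_PATHNAME) # false

puts [xlsx_hit, ods_hit, ods_missed, nested_xml, non_xml, pathname_hit].inspect
```

Note that under these assumptions `*.xml` is not limited to the archive root: it also matches nested XML parts, which errs on the side of counting more bytes rather than fewer.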
Instance Method Summary
- #accepts?(mime, path) ⇒ Boolean
- #available? ⇒ Boolean
- #convert(path, budget = Limits.null_budget) ⇒ Object
Instance Method Details
#accepts?(mime, path) ⇒ Boolean
    # File 'lib/rubino/documents/converters/xlsx.rb', line 27
    def accepts?(mime, path)
      return true if MIMES.include?(mime.to_s)
      EXTS.include?(File.extname(path.to_s).downcase)
    end
#available? ⇒ Boolean
    # File 'lib/rubino/documents/converters/xlsx.rb', line 20
    def available?
      require "roo"
      true
    rescue LoadError
      false
    end
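The optional-dependency gating can be sketched as follows. Both `OptionalGemConverter` and `usable_converters` are hypothetical stand-ins for illustration; the real registry is not shown on this page, and the only behaviour borrowed from the source is the `require` / `rescue LoadError` pattern of #available?.

```ruby
# Hypothetical converter whose backing library may or may not be installed.
class OptionalGemConverter
  def initialize(library)
    @library = library
  end

  # Same pattern as #available?: try the require, report false on LoadError.
  def available?
    require @library
    true
  rescue LoadError
    false
  end
end

# Hypothetical registry step: offer only converters whose library loads.
def usable_converters(converters)
  converters.select(&:available?)
end

present = OptionalGemConverter.new("set") # stdlib, always loadable
missing = OptionalGemConverter.new("rubino_no_such_library_for_demo")

usable = usable_converters([present, missing])
puts usable.length # prints 1
```

Because the check happens at selection time, a caller never receives a converter that would fail on its first `require`, which is what lets it fall back to another strategy cleanly.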
#convert(path, budget = Limits.null_budget) ⇒ Object
    # File 'lib/rubino/documents/converters/xlsx.rb', line 41
    def convert(path, budget = Limits.null_budget)
      require "roo"
      # PRE-OPEN guard: a 400k-row spreadsheet expands its sheet/content XML
      # far past the on-disk cap. Sum the uncompressed sizes of the body
      # entries (and any nested/non-standard part a bomb could hide behind a
      # .rels Target) from the central directory and bail before roo inflates
      # them. Globs match across `/` (guard_zip! omits FNM_PATHNAME) so a deep
      # bomb is summed too (#337); the glob set is chosen per format so an ODS
      # bomb rooted at content.xml is also caught (#350).
      Limits.guard_zip!(path, budget, zip_globs(path))
      book = Roo::Spreadsheet.open(path)
      parts = book.sheets.map { |name| sheet_markdown(book, name, budget) }.compact
      parts.join("\n\n")
    ensure
      book&.close if defined?(book) && book.respond_to?(:close)
    end
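The arithmetic behind the pre-open guard can be sketched without opening a real archive. Everything named here is hypothetical: `Entry` stands in for a central-directory record, `BudgetExceeded` and `guard_entries!` are invented for illustration, and the real `Limits.guard_zip!` signature and budget object are only known from the call above.

```ruby
# Hypothetical stand-in for a zip central-directory record.
Entry = Struct.new(:name, :uncompressed_size)

# Hypothetical error type for the sketch.
class BudgetExceeded < StandardError; end

# Sums the declared uncompressed sizes of entries matching any glob and
# raises before anything would be inflated. Matching uses File.fnmatch
# without FNM_PATHNAME (an assumption), so `*` also crosses `/`.
def guard_entries!(entries, max_bytes, globs)
  total = entries
          .select { |entry| globs.any? { |glob| File.fnmatch(glob, entry.name) } }
          .sum(&:uncompressed_size)
  raise BudgetExceeded, "#{total} bytes exceeds #{max_bytes}" if total > max_bytes

  total
end

entries = [
  Entry.new("xl/worksheets/sheet1.xml", 900),
  Entry.new("xl/sharedStrings.xml", 200),
  Entry.new("docProps/app.xml", 50) # outside xl/, not counted
]

within_budget = guard_entries!(entries, 2_000, ["xl/**"])
puts within_budget # prints 1100

begin
  guard_entries!(entries, 1_000, ["xl/**"])
rescue BudgetExceeded => e
  puts "rejected: #{e.message}"
end
```

The key design point, as the source comment describes, is that the sizes come from metadata, so an oversized workbook is rejected before any decompression work is done.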