Class: Rbxl::EditableWorkbook
- Inherits:
-
Object
- Object
- Rbxl::EditableWorkbook
- Defined in:
- lib/rbxl/editable_workbook.rb
Overview
Read-modify-save workbook for surgical edits to an existing .xlsx.
The design promise mirrors rbpptx: what we don’t understand, we don’t touch. The package is opened as a ZIP, each part you mutate is re-serialized, and every other entry — styles, drawings, charts, comments, pivot caches, custom XML, untouched worksheets — round-trips byte-for-byte via Zip::Entry#copy_raw_entry. Inside a worksheet you do edit, only the specific <c> element you target is rewritten; surrounding cells, the row’s other attributes, <mergeCells>, <conditionalFormatting>, <dataValidations>, and any unknown OOXML extensions remain in place. The cell’s existing s (style index) attribute is preserved, so template number formats, fonts, and fills carry through to the new value.
The editable mode is the right tool for template-style fill-ins: open a template with named cells, write a handful of values, save back. It is explicitly not the right tool for rewriting the data area of a large worksheet — the touched sheet is parsed as a Nokogiri DOM, so peak memory scales with that sheet’s on-disk size. Use the write-only mode (Rbxl.new) for that case instead.
Out of scope (1.4.0)
-
inserting / deleting / reordering / duplicating sheets
-
editing styles, formulas, named ranges, drawings, or shared strings
-
Date/Time/DateTimevalues (raise EditableCellTypeError; convert to a numeric serial yourself if you need a date cell) -
recomputing the worksheet <dimension> when a write expands the bounds
Strings on write
Cells written through this mode become inline strings (+t=“inlineStr”+), so xl/sharedStrings.xml is never mutated. Existing t=“s” cells you don’t touch keep resolving through the SST as usual; only cells you actually overwrite drop their SST reference.
Constant Summary collapse
- MAIN_NS =
Namespace for the main SpreadsheetML schema.
"http://schemas.openxmlformats.org/spreadsheetml/2006/main".freeze
- REL_NS =
Namespace used for document-level relationships.
"http://schemas.openxmlformats.org/officeDocument/2006/relationships".freeze
- PACKAGE_REL_NS =
Namespace used by the OPC package relationships layer.
"http://schemas.openxmlformats.org/package/2006/relationships".freeze
- OFFICE_DOC_REL_TYPE =
Relationship type identifying the workbook part inside
_rels/.rels. "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument".freeze
Instance Attribute Summary collapse
-
#path ⇒ String
readonly
Filesystem path the workbook was opened from.
-
#sheet_names ⇒ Array<String>
readonly
Visible sheet names in workbook order.
Class Method Summary collapse
-
.open(path) {|book| ... } ⇒ Rbxl::EditableWorkbook, Object
Convenience constructor equivalent to new(path).
Instance Method Summary collapse
-
#close ⇒ Boolean
Releases the underlying ZIP file.
- #closed? ⇒ Boolean
-
#initialize(path) ⇒ EditableWorkbook
constructor
Opens the package, validates the format, and indexes worksheet parts by visible sheet name.
-
#save(path = nil) ⇒ String
Writes the workbook out, preserving every part that has not been mutated byte-for-byte.
-
#sheet(name_or_index) ⇒ Rbxl::EditableWorksheet
Returns the editable worksheet for
name_or_index. -
#sheets {|worksheet| ... } ⇒ Enumerator<Rbxl::EditableWorksheet>
Iterates worksheets in workbook order.
Constructor Details
#initialize(path) ⇒ EditableWorkbook
Opens the package, validates the format, and indexes worksheet parts by visible sheet name. Worksheet XML is not parsed until the caller touches that sheet via #sheet.
89 90 91 92 93 94 95 96 97 98 99 100 |
# File 'lib/rbxl/editable_workbook.rb', line 89 def initialize(path) @path = path.to_s ensure_xlsx_format!(@path) @zip = Zip::File.open(@path) @closed = false @workbook_part = locate_workbook_part @workbook_dir = File.dirname(@workbook_part) @sheet_entries = load_sheet_entries @sheet_names = @sheet_entries.keys.freeze @shared_strings = nil @sheets_by_name = {} end |
Instance Attribute Details
#path ⇒ String (readonly)
Returns filesystem path the workbook was opened from.
56 57 58 |
# File 'lib/rbxl/editable_workbook.rb', line 56 def path @path end |
#sheet_names ⇒ Array<String> (readonly)
Returns visible sheet names in workbook order.
59 60 61 |
# File 'lib/rbxl/editable_workbook.rb', line 59 def sheet_names @sheet_names end |
Class Method Details
.open(path) {|book| ... } ⇒ Rbxl::EditableWorkbook, Object
Convenience constructor equivalent to new(path). When a block is given, the workbook is yielded and #close is called automatically when the block returns or raises.
69 70 71 72 73 74 75 76 77 78 |
# File 'lib/rbxl/editable_workbook.rb', line 69 def self.open(path) book = new(path) return book unless block_given? begin yield book ensure book.close end end |
Instance Method Details
#close ⇒ Boolean
Releases the underlying ZIP file. Idempotent.
190 191 192 193 194 195 196 197 |
# File 'lib/rbxl/editable_workbook.rb', line 190 def close return false if @closed @zip&.close @zip = nil @closed = true true end |
#closed? ⇒ Boolean
200 201 202 |
# File 'lib/rbxl/editable_workbook.rb', line 200 def closed? @closed end |
#save(path = nil) ⇒ String
Writes the workbook out, preserving every part that has not been mutated byte-for-byte. Worksheets whose cells have been edited are re-serialized from their in-memory Nokogiri document; all other entries (styles, sharedStrings, drawings, charts, pivot caches, custom XML, rels) are streamed straight from the source ZIP without re-parsing.
path defaults to the original load path; passing nil or omitting it saves in place. The new file is written to a temp file in the same directory and atomically renamed into place, so a crash mid-write never leaves a half-written workbook. On success, dirty flags on each touched worksheet are cleared, so the object is reusable for further edits and another #save.
158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 |
# File 'lib/rbxl/editable_workbook.rb', line 158 def save(path = nil) ensure_open! out_path = (path || @path).to_s overrides = collect_overrides tmp_path = "#{out_path}.rbxl-tmp.#{Process.pid}.#{rand(1 << 32).to_s(16)}" begin Zip::OutputStream.open(tmp_path) do |out| @zip.each do |entry| next if entry.directory? if (override_xml = overrides[entry.name]) out.put_next_entry(entry.name) out.write(override_xml) else out.copy_raw_entry(entry) end end end File.rename(tmp_path, out_path) rescue StandardError File.unlink(tmp_path) if File.exist?(tmp_path) raise end @sheets_by_name.each_value(&:clear_dirty!) out_path end |
#sheet(name_or_index) ⇒ Rbxl::EditableWorksheet
Returns the editable worksheet for name_or_index. Repeated calls for the same sheet return the same in-memory object so edits accumulate across calls before #save.
111 112 113 114 115 116 117 118 119 120 121 122 123 124 |
# File 'lib/rbxl/editable_workbook.rb', line 111 def sheet(name_or_index) ensure_open! name = resolve_sheet_name(name_or_index) @sheets_by_name[name] ||= EditableWorksheet.new( zip: @zip, entry_path: @sheet_entries.fetch(name) { raise SheetNotFoundError, "sheet not found: #{name}" }, workbook_path: @path, shared_strings: shared_strings, name: name ) end |
#sheets {|worksheet| ... } ⇒ Enumerator<Rbxl::EditableWorksheet>
Iterates worksheets in workbook order. Worksheets are constructed on demand and memoized, so iterating then editing is consistent with #sheet.
133 134 135 136 137 138 |
# File 'lib/rbxl/editable_workbook.rb', line 133 def sheets ensure_open! return enum_for(:sheets) unless block_given? @sheet_names.each { |name| yield sheet(name) } end |