Class: Rbxl::EditableWorkbook

Inherits:
Object
  • Object
show all
Defined in:
lib/rbxl/editable_workbook.rb

Overview

Read-modify-save workbook for surgical edits to an existing .xlsx.

The design promise mirrors rbpptx: what we don’t understand, we don’t touch. The package is opened as a ZIP, each part you mutate is re-serialized, and every other entry — styles, drawings, charts, comments, pivot caches, custom XML, untouched worksheets — round-trips byte-for-byte via Zip::Entry#copy_raw_entry. Inside a worksheet you do edit, only the specific <c> element you target is rewritten; surrounding cells, the row’s other attributes, <mergeCells>, <conditionalFormatting>, <dataValidations>, and any unknown OOXML extensions remain in place. The cell’s existing s (style index) attribute is preserved, so template number formats, fonts, and fills carry through to the new value.

The editable mode is the right tool for template-style fill-ins: open a template with named cells, write a handful of values, save back. It is explicitly not the right tool for rewriting the data area of a large worksheet — the touched sheet is parsed as a Nokogiri DOM, so peak memory scales with that sheet’s on-disk size. Use the write-only mode (Rbxl.new) for that case instead.

Out of scope (1.4.0)

  • inserting / deleting / reordering / duplicating sheets

  • editing styles, formulas, named ranges, drawings, or shared strings

  • Date / Time / DateTime values (raise EditableCellTypeError; convert to a numeric serial yourself if you need a date cell)

  • recomputing the worksheet <dimension> when a write expands the bounds

Strings on write

Cells written through this mode become inline strings (+t=“inlineStr”+), so xl/sharedStrings.xml is never mutated. Existing t=“s” cells you don’t touch keep resolving through the SST as usual; only cells you actually overwrite drop their SST reference.

Constant Summary collapse

MAIN_NS =

Namespace for the main SpreadsheetML schema.

"http://schemas.openxmlformats.org/spreadsheetml/2006/main".freeze
REL_NS =

Namespace used for document-level relationships.

"http://schemas.openxmlformats.org/officeDocument/2006/relationships".freeze
PACKAGE_REL_NS =

Namespace used by the OPC package relationships layer.

"http://schemas.openxmlformats.org/package/2006/relationships".freeze
OFFICE_DOC_REL_TYPE =

Relationship type identifying the workbook part inside _rels/.rels.

"http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument".freeze

Instance Attribute Summary collapse

Class Method Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(path) ⇒ EditableWorkbook

Opens the package, validates the format, and indexes worksheet parts by visible sheet name. Worksheet XML is not parsed until the caller touches that sheet via #sheet.

Parameters:

  • path (String, #to_path)

    path to the .xlsx file

Raises:



89
90
91
92
93
94
95
96
97
98
99
100
# File 'lib/rbxl/editable_workbook.rb', line 89

def initialize(path)
  @path = path.to_s
  ensure_xlsx_format!(@path)
  @zip = Zip::File.open(@path)
  @closed = false
  @workbook_part = locate_workbook_part
  @workbook_dir = File.dirname(@workbook_part)
  @sheet_entries = load_sheet_entries
  @sheet_names = @sheet_entries.keys.freeze
  @shared_strings = nil
  @sheets_by_name = {}
end

Instance Attribute Details

#pathString (readonly)

Returns filesystem path the workbook was opened from.

Returns:

  • (String)

    filesystem path the workbook was opened from



56
57
58
# File 'lib/rbxl/editable_workbook.rb', line 56

def path
  @path
end

#sheet_namesArray<String> (readonly)

Returns visible sheet names in workbook order.

Returns:

  • (Array<String>)

    visible sheet names in workbook order



59
60
61
# File 'lib/rbxl/editable_workbook.rb', line 59

def sheet_names
  @sheet_names
end

Class Method Details

.open(path) {|book| ... } ⇒ Rbxl::EditableWorkbook, Object

Convenience constructor equivalent to new(path). When a block is given, the workbook is yielded and #close is called automatically when the block returns or raises.

Parameters:

  • path (String, #to_path)

Yield Parameters:

Returns:

  • (Rbxl::EditableWorkbook, Object)

    the workbook when no block is given, otherwise the block’s return value



69
70
71
72
73
74
75
76
77
78
# File 'lib/rbxl/editable_workbook.rb', line 69

def self.open(path)
  book = new(path)
  return book unless block_given?

  begin
    yield book
  ensure
    book.close
  end
end

Instance Method Details

#closeBoolean

Releases the underlying ZIP file. Idempotent.

Returns:

  • (Boolean)

    true on the first call, false on subsequent calls



190
191
192
193
194
195
196
197
# File 'lib/rbxl/editable_workbook.rb', line 190

def close
  return false if @closed

  @zip&.close
  @zip = nil
  @closed = true
  true
end

#closed?Boolean

Returns:

  • (Boolean)


200
201
202
# File 'lib/rbxl/editable_workbook.rb', line 200

def closed?
  @closed
end

#save(path = nil) ⇒ String

Writes the workbook out, preserving every part that has not been mutated byte-for-byte. Worksheets whose cells have been edited are re-serialized from their in-memory Nokogiri document; all other entries (styles, sharedStrings, drawings, charts, pivot caches, custom XML, rels) are streamed straight from the source ZIP without re-parsing.

path defaults to the original load path; passing nil or omitting it saves in place. The new file is written to a temp file in the same directory and atomically renamed into place, so a crash mid-write never leaves a half-written workbook. On success, dirty flags on each touched worksheet are cleared, so the object is reusable for further edits and another #save.

Parameters:

  • path (String, #to_path, nil) (defaults to: nil)

    destination path; defaults to the path the workbook was opened from

Returns:

  • (String)

    the path that was written

Raises:



158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
# File 'lib/rbxl/editable_workbook.rb', line 158

def save(path = nil)
  ensure_open!
  out_path = (path || @path).to_s
  overrides = collect_overrides

  tmp_path = "#{out_path}.rbxl-tmp.#{Process.pid}.#{rand(1 << 32).to_s(16)}"
  begin
    Zip::OutputStream.open(tmp_path) do |out|
      @zip.each do |entry|
        next if entry.directory?

        if (override_xml = overrides[entry.name])
          out.put_next_entry(entry.name)
          out.write(override_xml)
        else
          out.copy_raw_entry(entry)
        end
      end
    end
    File.rename(tmp_path, out_path)
  rescue StandardError
    File.unlink(tmp_path) if File.exist?(tmp_path)
    raise
  end

  @sheets_by_name.each_value(&:clear_dirty!)
  out_path
end

#sheet(name_or_index) ⇒ Rbxl::EditableWorksheet

Returns the editable worksheet for name_or_index. Repeated calls for the same sheet return the same in-memory object so edits accumulate across calls before #save.

Parameters:

  • name_or_index (String, Integer)

    visible sheet name as listed in #sheet_names, or an integer index (negatives count from the end)

Returns:

Raises:



111
112
113
114
115
116
117
118
119
120
121
122
123
124
# File 'lib/rbxl/editable_workbook.rb', line 111

def sheet(name_or_index)
  ensure_open!

  name = resolve_sheet_name(name_or_index)
  @sheets_by_name[name] ||= EditableWorksheet.new(
    zip: @zip,
    entry_path: @sheet_entries.fetch(name) {
      raise SheetNotFoundError, "sheet not found: #{name}"
    },
    workbook_path: @path,
    shared_strings: shared_strings,
    name: name
  )
end

#sheets {|worksheet| ... } ⇒ Enumerator<Rbxl::EditableWorksheet>

Iterates worksheets in workbook order. Worksheets are constructed on demand and memoized, so iterating then editing is consistent with #sheet.

Yield Parameters:

Returns:

Raises:



133
134
135
136
137
138
# File 'lib/rbxl/editable_workbook.rb', line 133

def sheets
  ensure_open!
  return enum_for(:sheets) unless block_given?

  @sheet_names.each { |name| yield sheet(name) }
end