Class: Rbxl::ReadOnlyWorksheet

Inherits:
Object
Defined in:
lib/rbxl/read_only_worksheet.rb

Overview

Row-by-row worksheet reader for a single sheet of a read-only workbook.

Instances are produced by Rbxl::ReadOnlyWorkbook#sheet and must not be constructed directly; their lifecycle is bound to the workbook’s ZIP handle. Rows can be consumed as Row objects or as plain value arrays depending on the iteration options.

Iteration modes

# Default: yield Rbxl::Row with cell wrappers.
sheet.each_row { |row| row.values }

# Fast path: yield plain Array<Object> of decoded values.
sheet.each_row(values_only: true) { |values| ... }

# Pad missing cells in sparse rows up to max_column.
sheet.each_row(pad_cells: true) { |row| ... }

# Replicate anchor values across merged ranges.
sheet.each_row(expand_merged: true) { |row| ... }

Iteration without a block returns an Enumerator.
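The padding and merged-range options above can be pictured on plain arrays. The following is a minimal sketch, not Rbxl code: pad_row and expand_merged are hypothetical helpers, and the 0-based [top, left, bottom, right] range format is an assumption made for illustration.

```ruby
# Hypothetical helpers; not part of Rbxl. Rows are plain arrays of values.

# Pad a sparse row with nil so it has at least max_column entries.
def pad_row(values, max_column)
  values + Array.new([max_column - values.length, 0].max)
end

# Copy the anchor (top-left) value of each merged range into every cell
# of that range. Ranges are 0-based [top, left, bottom, right] tuples.
def expand_merged(grid, ranges)
  expanded = grid.map(&:dup)
  ranges.each do |top, left, bottom, right|
    anchor = grid[top][left]
    (top..bottom).each do |r|
      (left..right).each { |c| expanded[r][c] = anchor }
    end
  end
  expanded
end

sparse = [["a", "b", "c"], ["d"]]
padded = sparse.map { |row| pad_row(row, 3) }
# padded => [["a", "b", "c"], ["d", nil, nil]]

merged = expand_merged([["title", nil], ["x", "y"]], [[0, 0, 0, 1]])
# merged => [["title", "title"], ["x", "y"]]
```

The real reader applies these transformations while streaming, so it would not need the whole grid in memory as this sketch does.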

Dimensions

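The worksheet dimension (the A1:C10-style range) is read from the sheet’s <dimension> element when present. When it is absent, #calculate_dimension with force: true scans the sheet for actual cell coordinates. Because #calculate_dimension returns the cached value whenever one exists, a caller who wants to recompute a stored dimension must call #reset_dimensions first.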
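To make the idea of scanning for cell bounds concrete, here is a rough sketch of deriving a range reference from a list of cell coordinates. It is an assumption about what such a scan could look like; the helper names (column_index, column_letters, bounding_ref) are hypothetical and Rbxl's private scan may differ, for example in how it chooses the top-left corner.

```ruby
# Hypothetical sketch; Rbxl's private scan may work differently.

# Convert column letters ("A", "Z", "AA") to a 1-based index.
def column_index(letters)
  letters.each_char.reduce(0) { |acc, ch| acc * 26 + (ch.ord - "A".ord + 1) }
end

# Convert a 1-based index back to column letters.
def column_letters(index)
  letters = +""
  while index > 0
    index, rem = (index - 1).divmod(26)
    letters.prepend(("A".ord + rem).chr)
  end
  letters
end

# Compute the bounding range of cell coordinates such as "B2".
def bounding_ref(coordinates)
  return nil if coordinates.empty?

  cols = []
  rows = []
  coordinates.each do |coord|
    letters, digits = coord.match(/\A([A-Z]+)(\d+)\z/).captures
    cols << column_index(letters)
    rows << digits.to_i
  end
  "#{column_letters(cols.min)}#{rows.min}:#{column_letters(cols.max)}#{rows.max}"
end

bounding_ref(%w[B2 A5 C10])  # => "A2:C10"
```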

Constant Summary

ELEMENT_NODE =
Nokogiri::XML::Reader::TYPE_ELEMENT
TEXT_NODE =
Nokogiri::XML::Reader::TYPE_TEXT
CDATA_NODE =
Nokogiri::XML::Reader::TYPE_CDATA
END_ELEMENT_NODE =
Nokogiri::XML::Reader::TYPE_END_ELEMENT

Constructor Details

#initialize(zip:, entry_path:, shared_strings:, name:, streaming: false) ⇒ ReadOnlyWorksheet

Returns a new instance of ReadOnlyWorksheet.

Parameters:

  • zip (Zip::File)

    open archive shared with the workbook

  • entry_path (String)

    ZIP entry path for this sheet’s XML

  • shared_strings (Array<String>)

    pre-decoded shared strings table

  • name (String)

    visible sheet name

  • streaming (Boolean) (defaults to: false)

    when the native extension is loaded, feed worksheet XML to the parser in chunks instead of reading the entry into memory first



# File 'lib/rbxl/read_only_worksheet.rb', line 58

def initialize(zip:, entry_path:, shared_strings:, name:, streaming: false)
  @zip = zip
  @entry_path = entry_path
  @shared_strings = shared_strings
  @name = name
  @streaming = streaming
  @dimensions = extract_dimensions
  @merge_ranges_by_row = nil
  @merge_anchor_values = {}
end

Instance Attribute Details

#dimensions ⇒ Hash{Symbol => Object}? (readonly)

Parsed dimension metadata, nil when the sheet has no <dimension> element and no scan has been forced. When present the hash has keys :ref, :max_col, and :max_row.

Returns:

  • (Hash{Symbol => Object}, nil)


# File 'lib/rbxl/read_only_worksheet.rb', line 49

def dimensions
  @dimensions
end
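As an illustration of how a range reference relates to the :ref, :max_col, and :max_row keys, the sketch below parses a reference string into a hash of that shape. parse_dimension is a hypothetical helper written for this example; the library's own extraction is private and may differ.

```ruby
# Hypothetical parser; only the key names come from the documentation above.
def parse_dimension(ref)
  # Take the bottom-right cell: "C10" from "A1:C10". A single-cell ref
  # such as "B2" has no colon and is used as-is.
  last = ref.split(":").last
  letters, digits = last.match(/\A([A-Z]+)(\d+)\z/).captures
  # "A" is column 1, "Z" is 26, "AA" is 27 (bijective base 26).
  max_col = letters.each_char.reduce(0) { |acc, ch| acc * 26 + (ch.ord - 64) }
  { ref: ref, max_col: max_col, max_row: digits.to_i }
end

parse_dimension("A1:C10")  # => { ref: "A1:C10", max_col: 3, max_row: 10 }
```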

#name ⇒ String (readonly)

Returns visible sheet name.

Returns:

  • (String)

    visible sheet name



# File 'lib/rbxl/read_only_worksheet.rb', line 42

def name
  @name
end

Instance Method Details

#calculate_dimension(force: false) ⇒ String

Returns the worksheet dimension reference (e.g. "A1:C10").

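When a dimension is already known (stored in the sheet or found by an earlier scan), its reference is returned as-is. When the sheet lacks a <dimension> element, the method raises UnsizedWorksheetError by default. Passing force: true scans the sheet for the actual cell bounds instead; a sheet with no cells at all falls back to "A1:A1".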

Parameters:

  • force (Boolean) (defaults to: false)

    scan the sheet when no stored dimension exists

Returns:

  • (String)

    Excel-style range reference

Raises:

  • (UnsizedWorksheetError)

    when the sheet has no stored dimension and force is false



# File 'lib/rbxl/read_only_worksheet.rb', line 142

def calculate_dimension(force: false)
  if dimensions
    return dimensions[:ref]
  end

  raise UnsizedWorksheetError, "worksheet is unsized, use force: true" unless force

  @dimensions = scan_dimensions
  dimensions ? dimensions[:ref] : "A1:A1"
end
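The caching behaviour of the method above is easy to miss: force: true only triggers a scan when no dimension is cached. The stand-in class below copies that control flow so it can be run in isolation. SketchSheet, its scans counter, and the faked scan result are all hypothetical; only the branching mirrors the source shown here.

```ruby
# Stand-in class that copies the control flow of #calculate_dimension.
# The scan is faked with a counter so the caching is observable.
class SketchSheet
  class UnsizedWorksheetError < StandardError; end

  attr_reader :dimensions, :scans

  def initialize(stored_ref: nil)
    @dimensions = stored_ref && { ref: stored_ref }
    @scans = 0
  end

  def calculate_dimension(force: false)
    return dimensions[:ref] if dimensions
    raise UnsizedWorksheetError, "worksheet is unsized, use force: true" unless force

    @dimensions = scan_dimensions
    dimensions ? dimensions[:ref] : "A1:A1"
  end

  def reset_dimensions
    @dimensions = nil
  end

  private

  # Placeholder for the real scan: pretends the cells span A1:B2.
  def scan_dimensions
    @scans += 1
    { ref: "A1:B2" }
  end
end

sheet = SketchSheet.new(stored_ref: "A1:C10")
cached = sheet.calculate_dimension(force: true)   # stored ref, no scan happens
sheet.reset_dimensions
rescanned = sheet.calculate_dimension(force: true) # cache cleared, scan runs once
```

In this sketch cached is "A1:C10" and rescanned is "A1:B2", which shows why a reset is needed before a forced recomputation.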

#each_row(pad_cells: false, values_only: false, expand_merged: false) {|row| ... } ⇒ Enumerator, void

Iterates rows in worksheet order.

With values_only and neither pad_cells nor expand_merged set, iteration takes a tighter path that yields frozen Array<Object> rows and skips allocating cell wrappers.

Parameters:

  • pad_cells (Boolean) (defaults to: false)

    pad sparse rows with EmptyCell (or [coordinate, nil] pairs in values_only mode) up to the worksheet’s max_column

  • values_only (Boolean) (defaults to: false)

    yield plain value arrays instead of Rbxl::Row instances

  • expand_merged (Boolean) (defaults to: false)

    propagate the anchor value of every merged range across the range’s cells

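Yield Parameters:

  • row (Rbxl::Row, Array<Object>)

    the current row: an Rbxl::Row by default, or an array of values when values_only is set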

Returns:

  • (Enumerator, void)

    enumerator when called without a block



# File 'lib/rbxl/read_only_worksheet.rb', line 84

def each_row(pad_cells: false, values_only: false, expand_merged: false, &block)
  return enum_for(:each_row, pad_cells: pad_cells, values_only: values_only, expand_merged: expand_merged) unless block

  if values_only && !pad_cells && !expand_merged
    each_row_values_only(&block)
  else
    each_row_full(pad_cells: pad_cells, values_only: values_only, expand_merged: expand_merged, &block)
  end
end
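The enum_for guard in the source above is what lets the method be called without a block and chained. A minimal stand-in shows the pattern in isolation; SketchRows and its Hash-based rows are hypothetical and only the block/enumerator handling mirrors #each_row.

```ruby
# Stand-in class; only the block/enumerator handling mirrors #each_row.
class SketchRows
  def initialize(rows)
    @rows = rows
  end

  def each_row(values_only: false, &block)
    # Without a block, return an Enumerator that replays this call
    # with the same keyword arguments.
    return enum_for(:each_row, values_only: values_only) unless block

    @rows.each do |row|
      # In this sketch a "row" is a Hash of coordinate => value.
      yield(values_only ? row.values.freeze : row)
    end
  end
end

sheet = SketchRows.new([{ "A1" => 1, "B1" => 2 }, { "A2" => 3 }])
first = sheet.each_row(values_only: true).first      # => [1, 2]
sums  = sheet.each_row(values_only: true).map(&:sum) # => [3, 3]
```

Forwarding every keyword into enum_for matters: an enumerator created without them would silently fall back to the defaults when it is later consumed.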

#max_column ⇒ Integer?

Returns rightmost column index (1-based) from the worksheet dimension, or nil when dimensions are unknown.

Returns:

  • (Integer, nil)

    rightmost column index (1-based) from the worksheet dimension, or nil when dimensions are unknown



# File 'lib/rbxl/read_only_worksheet.rb', line 109

def max_column
  return nil unless dimensions

  dimensions[:max_col]
end

#max_row ⇒ Integer?

Returns bottom row index (1-based) from the worksheet dimension, or nil when dimensions are unknown.

Returns:

  • (Integer, nil)

    bottom row index (1-based) from the worksheet dimension, or nil when dimensions are unknown



# File 'lib/rbxl/read_only_worksheet.rb', line 117

def max_row
  return nil unless dimensions

  dimensions[:max_row]
end

#reset_dimensions ⇒ nil

Clears cached dimension metadata so that the next call to #calculate_dimension recomputes it.

Returns:

  • (nil)


# File 'lib/rbxl/read_only_worksheet.rb', line 127

def reset_dimensions
  @dimensions = nil
end

#rows(values_only: false, pad_cells: false, expand_merged: false) ⇒ Enumerator

Enumerator-returning alias for #each_row that reads more naturally when the call site chains further enumerable operations.

sheet.rows(values_only: true).take(10)

Parameters:

  • values_only (Boolean) (defaults to: false)

    see #each_row

  • pad_cells (Boolean) (defaults to: false)

    see #each_row

  • expand_merged (Boolean) (defaults to: false)

    see #each_row

Returns:

  • (Enumerator)


# File 'lib/rbxl/read_only_worksheet.rb', line 103

def rows(values_only: false, pad_cells: false, expand_merged: false)
  each_row(values_only: values_only, pad_cells: pad_cells, expand_merged: expand_merged)
end