Class: Rbxl::ReadOnlyWorksheet

Inherits:
Object
Defined in:
lib/rbxl/read_only_worksheet.rb

Overview

Row-by-row worksheet reader for a single sheet of a read-only workbook.

Instances are produced by Rbxl::ReadOnlyWorkbook#sheet and must not be constructed directly; their lifecycle is bound to the workbook’s ZIP handle. Rows can be consumed as Row objects or as plain value arrays depending on the iteration options.

Iteration modes

# Default: yield Rbxl::Row with cell wrappers.
sheet.each_row { |row| row.values }

# Fast path: yield plain Array<Object> of decoded values.
sheet.each_row(values_only: true) { |values| ... }

# Pad missing cells in sparse rows up to max_column.
sheet.each_row(pad_cells: true) { |row| ... }

# Replicate anchor values across merged ranges.
sheet.each_row(expand_merged: true) { |row| ... }

Iteration without a block returns an Enumerator.
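
The block-less form is what makes chaining possible. A minimal sketch of that pattern, using a hypothetical ToySheet stand-in rather than a real Rbxl worksheet (the class and its data are assumptions for illustration only):

```ruby
# Hypothetical stand-in for a worksheet; not the Rbxl implementation.
class ToySheet
  def initialize(rows)
    @rows = rows
  end

  # Yields each row, or returns an Enumerator when no block is given.
  def each_row
    return enum_for(:each_row) unless block_given?

    @rows.each { |values| yield values }
  end
end

sheet = ToySheet.new([["id", "name"], [1, "Ada"], [2, "Grace"], [3, "Edsger"]])

enum = sheet.each_row                         # no block: an Enumerator
header = enum.first                           # first row only
first_two = sheet.each_row.drop(1).first(2)   # skip the header, take two rows
```

Because the Enumerator includes Enumerable, calls such as first, take, and each_slice can be chained without collecting rows by hand.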

Dimensions

The worksheet dimension (the A1:C10-style range) is read from the sheet’s <dimension> element when present. When it is absent, #calculate_dimension with force: true scans the sheet for actual cell coordinates. A stored dimension is returned as-is, so a caller who wants to recompute it must call #reset_dimensions first.
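
To make the relationship between a range reference and the :ref, :max_col, and :max_row keys of #dimensions concrete, here is a small sketch. The helper parse_dimension_ref is hypothetical and is not part of Rbxl; it only illustrates how an A1-style reference maps to 1-based indices.

```ruby
# Hypothetical helper: turn an "A1:C10"-style reference into the
# :ref / :max_col / :max_row shape described for #dimensions.
def parse_dimension_ref(ref)
  last = ref.split(":").last                 # "C10", or "A1" for a single cell
  match = last.match(/\A([A-Z]+)(\d+)\z/)
  return nil unless match

  # Column letters are bijective base 26: A=1, Z=26, AA=27.
  max_col = match[1].each_char.reduce(0) do |acc, ch|
    acc * 26 + (ch.ord - "A".ord + 1)
  end
  { ref: ref, max_col: max_col, max_row: match[2].to_i }
end

dims = parse_dimension_ref("A1:C10")
```

Here dims[:max_col] is 3 and dims[:max_row] is 10, which matches the 1-based values #max_column and #max_row are documented to return.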

Constant Summary

ELEMENT_NODE = Nokogiri::XML::Reader::TYPE_ELEMENT
TEXT_NODE = Nokogiri::XML::Reader::TYPE_TEXT
CDATA_NODE = Nokogiri::XML::Reader::TYPE_CDATA
END_ELEMENT_NODE = Nokogiri::XML::Reader::TYPE_END_ELEMENT

Constructor Details

#initialize(zip:, entry_path:, shared_strings:, name:, streaming: false, date_styles: nil) ⇒ ReadOnlyWorksheet

Returns a new instance of ReadOnlyWorksheet.

Parameters:

  • zip (Zip::File)

    open archive shared with the workbook

  • entry_path (String)

    ZIP entry path for this sheet’s XML

  • shared_strings (Array<String>)

    pre-decoded shared strings table

  • name (String)

    visible sheet name

  • streaming (Boolean) (defaults to: false)

    when the native extension is loaded, feed worksheet XML to the parser in chunks instead of reading the entry into memory first

  • date_styles (Array<Boolean>, nil) (defaults to: nil)

    true at a style id when the id’s numFmt is a date/time format. When provided, numeric cells with a matching style are returned as Date or Time instead of Float, and the native fast path is bypassed.
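
The date_styles lookup can be pictured with the sketch below. It assumes Excel's 1900 date system, in which serial numbers count days from 1899-12-30. The helper decode_numeric is hypothetical, ignores the time-of-day fraction, and is not taken from the Rbxl source.

```ruby
require "date"

# Hypothetical decoder: date_styles is indexed by style id, as described above.
# Assumption: 1900 date system, so serial 0 corresponds to 1899-12-30.
def decode_numeric(raw, style_id, date_styles)
  number = Float(raw)
  return number unless date_styles && date_styles[style_id]

  Date.new(1899, 12, 30) + number.floor      # whole days only in this sketch
end

date_styles = [false, true]                  # style 1 uses a date numFmt
plain = decode_numeric("45292", 0, date_styles)
dated = decode_numeric("45292", 1, date_styles)
unstyled = decode_numeric("45292", 1, nil)   # no table: always a Float
```

The same raw cell text decodes to a Float under style 0 and to a Date under style 1, which is why the table has to be supplied for date-aware reads.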



# File 'lib/rbxl/read_only_worksheet.rb', line 62

def initialize(zip:, entry_path:, shared_strings:, name:, streaming: false, date_styles: nil)
  @zip = zip
  @entry_path = entry_path
  @shared_strings = shared_strings
  @name = name
  @streaming = streaming
  @date_styles = date_styles
  @disable_native = !date_styles.nil?
  @dimensions = extract_dimensions
  @merge_ranges_by_row = nil
  @merge_anchor_values = {}
end

Instance Attribute Details

#dimensions ⇒ Hash{Symbol => Object}? (readonly)

Parsed dimension metadata, nil when the sheet has no <dimension> element and no scan has been forced. When present the hash has keys :ref, :max_col, and :max_row.

Returns:

  • (Hash{Symbol => Object}, nil)


# File 'lib/rbxl/read_only_worksheet.rb', line 49

def dimensions
  @dimensions
end

#name ⇒ String (readonly)

Returns visible sheet name.

Returns:

  • (String)

    visible sheet name



# File 'lib/rbxl/read_only_worksheet.rb', line 42

def name
  @name
end

Instance Method Details

#calculate_dimension(force: false) ⇒ String

Returns the worksheet dimension reference (e.g. "A1:C10").

When the sheet lacks a <dimension> element the default is to raise UnsizedWorksheetError. Passing force: true scans the sheet for the actual cell bounds instead; a sheet with no cells at all falls back to "A1:A1".

Parameters:

  • force (Boolean) (defaults to: false)

    scan the sheet when no stored dimension exists

Returns:

  • (String)

    Excel-style range reference

Raises:

  • (UnsizedWorksheetError)

    when the sheet has no stored dimension and force is false



# File 'lib/rbxl/read_only_worksheet.rb', line 148

def calculate_dimension(force: false)
  if dimensions
    return dimensions[:ref]
  end

  raise UnsizedWorksheetError, "worksheet is unsized, use force: true" unless force

  @dimensions = scan_dimensions
  dimensions ? dimensions[:ref] : "A1:A1"
end
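
The three outcomes (stored reference, raise, forced scan with the "A1:A1" fallback) can be exercised with a stand-in. In this sketch SketchSheet, its cell list, the locally defined error class, and the bounding-box scan are all assumptions. Only the control flow of calculate_dimension follows the listing.

```ruby
# Local stand-in for the library's error class.
class UnsizedWorksheetError < StandardError; end

# Hypothetical sheet: cells are [col, row] pairs, both 1-based.
class SketchSheet
  def initialize(cells, stored_ref: nil)
    @cells = cells
    @dimensions = stored_ref && { ref: stored_ref }
  end

  def calculate_dimension(force: false)
    return @dimensions[:ref] if @dimensions
    raise UnsizedWorksheetError, "worksheet is unsized, use force: true" unless force

    @dimensions = scan_dimensions
    @dimensions ? @dimensions[:ref] : "A1:A1"
  end

  private

  # Assumed scan: bounding box of the cells seen, nil for an empty sheet.
  def scan_dimensions
    return nil if @cells.empty?

    cols, rows = @cells.transpose
    { ref: "#{letters(cols.min)}#{rows.min}:#{letters(cols.max)}#{rows.max}" }
  end

  # 1-based column index to letters (1 => "A", 27 => "AA").
  def letters(index)
    name = +""
    while index > 0
      index, rem = (index - 1).divmod(26)
      name.prepend((65 + rem).chr)
    end
    name
  end
end

unsized = SketchSheet.new([[2, 3], [4, 7]])
error = begin
  unsized.calculate_dimension
rescue UnsizedWorksheetError => e
  e
end
scanned = unsized.calculate_dimension(force: true)
empty = SketchSheet.new([]).calculate_dimension(force: true)
```

After the forced scan the result is cached, so a later call without force returns the same reference instead of raising.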

#each_row(pad_cells: false, values_only: false, expand_merged: false) {|row| ... } ⇒ Enumerator, void

Iterates rows in worksheet order.

With values_only and neither pad_cells nor expand_merged set, iteration takes a tighter path that yields frozen Array<Object> rows and skips allocating cell wrappers.

Parameters:

  • pad_cells (Boolean) (defaults to: false)

    pad sparse rows with EmptyCell (or [coordinate, nil] pairs in values_only mode) up to the worksheet’s max_column

  • values_only (Boolean) (defaults to: false)

    yield plain value arrays instead of Rbxl::Row instances

  • expand_merged (Boolean) (defaults to: false)

    propagate the anchor value of every merged range across the range’s cells

Yield Parameters:

  • row (Rbxl::Row, Array<Object>)

    the current row, or its plain values in values_only mode

Returns:

  • (Enumerator, void)

    enumerator when called without a block



# File 'lib/rbxl/read_only_worksheet.rb', line 90

def each_row(pad_cells: false, values_only: false, expand_merged: false, &block)
  return enum_for(:each_row, pad_cells: pad_cells, values_only: values_only, expand_merged: expand_merged) unless block

  if values_only && !pad_cells && !expand_merged
    each_row_values_only(&block)
  else
    each_row_full(pad_cells: pad_cells, values_only: values_only, expand_merged: expand_merged, &block)
  end
end
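
What expand_merged means for the yielded values can be sketched on plain arrays. The helper below is hypothetical and uses 0-based inclusive bounds for each merged range; it shows the idea of replicating the anchor (top-left) value and is not the Rbxl implementation.

```ruby
# Hypothetical helper: copy each merged range's anchor value into every
# cell of that range. Rows are arrays of values; bounds are 0-based.
def expand_merged(rows, merged_ranges)
  expanded = rows.map(&:dup)                      # leave the input untouched
  merged_ranges.each do |range|
    anchor = rows[range[:top]][range[:left]]      # top-left cell holds the value
    (range[:top]..range[:bottom]).each do |r|
      (range[:left]..range[:right]).each do |c|
        expanded[r][c] = anchor
      end
    end
  end
  expanded
end

rows = [
  ["Region", nil, "Total"],
  ["North", "South", 10]
]
# A1:B1 is merged, so "Region" spans the first two columns.
merged = [{ top: 0, left: 0, bottom: 0, right: 1 }]
result = expand_merged(rows, merged)
```

Without expansion only the anchor cell carries the value and the remaining cells of the range read as empty, which is what the nil in the first row stands for.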

#max_column ⇒ Integer?

Returns rightmost column index (1-based) from the worksheet dimension, or nil when dimensions are unknown.

Returns:

  • (Integer, nil)

    rightmost column index (1-based) from the worksheet dimension, or nil when dimensions are unknown



# File 'lib/rbxl/read_only_worksheet.rb', line 115

def max_column
  return nil unless dimensions

  dimensions[:max_col]
end

#max_row ⇒ Integer?

Returns bottom row index (1-based) from the worksheet dimension, or nil when dimensions are unknown.

Returns:

  • (Integer, nil)

    bottom row index (1-based) from the worksheet dimension, or nil when dimensions are unknown



# File 'lib/rbxl/read_only_worksheet.rb', line 123

def max_row
  return nil unless dimensions

  dimensions[:max_row]
end

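#reset_dimensions ⇒ nil

Clears cached dimension metadata so that the next call to #calculate_dimension with force: true rescans the sheet. Until then #dimensions, #max_column, and #max_row return nil, and #calculate_dimension without force raises UnsizedWorksheetError.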

Returns:

  • (nil)


# File 'lib/rbxl/read_only_worksheet.rb', line 133

def reset_dimensions
  @dimensions = nil
end

#rows(values_only: false, pad_cells: false, expand_merged: false) ⇒ Enumerator

Enumerator-returning alias for #each_row that reads more naturally when the call site chains further enumerable operations.

sheet.rows(values_only: true).take(10)

Parameters:

  • values_only (Boolean) (defaults to: false)

    see #each_row

  • pad_cells (Boolean) (defaults to: false)

    see #each_row

  • expand_merged (Boolean) (defaults to: false)

    see #each_row

Returns:

  • (Enumerator)


# File 'lib/rbxl/read_only_worksheet.rb', line 109

def rows(values_only: false, pad_cells: false, expand_merged: false)
  each_row(values_only: values_only, pad_cells: pad_cells, expand_merged: expand_merged)
end