Class: SharedTools::Tools::Doc::SpreadsheetReaderTool

Inherits:
RubyLLM::Tool
  • Object
show all
Defined in:
lib/shared_tools/tools/doc/spreadsheet_reader_tool.rb

Overview

Read spreadsheet data from CSV, XLSX, ODS, and other formats supported by the roo gem.

Examples:

Read all rows from a CSV

tool = SharedTools::Tools::Doc::SpreadsheetReaderTool.new
tool.execute(doc_path: "./data.csv")

Read a specific sheet from an Excel workbook

tool.execute(doc_path: "./report.xlsx", sheet: "Q1 Sales")

Read a row range from a worksheet

tool.execute(doc_path: "./report.xlsx", sheet: "Summary", row_range: "2-50")

Constant Summary collapse

SUPPORTED_FORMATS =
%w[.csv .xlsx .ods .xlsm].freeze

Class Method Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(logger: nil) ⇒ SpreadsheetReaderTool

Returns a new instance of SpreadsheetReaderTool.

Parameters:

  • logger (Logger) (defaults to: nil)

    optional logger



66
67
68
# File 'lib/shared_tools/tools/doc/spreadsheet_reader_tool.rb', line 66

def initialize(logger: nil)
  @logger = logger || RubyLLM.logger
end

Class Method Details

.nameObject



24
# File 'lib/shared_tools/tools/doc/spreadsheet_reader_tool.rb', line 24

def self.name = 'doc_spreadsheet_read'

Instance Method Details

#execute(doc_path:, sheet: nil, row_range: nil, headers: true) ⇒ Hash

Returns extraction result.

Parameters:

  • doc_path (String)

    path to spreadsheet file

  • sheet (String, nil) (defaults to: nil)

    sheet name or 1-based index

  • row_range (String, nil) (defaults to: nil)

    row range to extract

  • headers (Boolean) (defaults to: true)

    whether first row is headers

Returns:

  • (Hash)

    extraction result



75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
# File 'lib/shared_tools/tools/doc/spreadsheet_reader_tool.rb', line 75

def execute(doc_path:, sheet: nil, row_range: nil, headers: true)
  raise LoadError, "SpreadsheetReaderTool requires the 'roo' gem. Install it with: gem install roo" unless defined?(Roo)

  @logger.info("SpreadsheetReaderTool#execute doc_path=#{doc_path} sheet=#{sheet} row_range=#{row_range}")

  return { error: "File not found: #{doc_path}" } unless File.exist?(doc_path)

  ext = File.extname(doc_path).downcase
  unless SUPPORTED_FORMATS.include?(ext)
    return { error: "Unsupported format '#{ext}'. Supported: #{SUPPORTED_FORMATS.join(', ')}" }
  end

  ss = Roo::Spreadsheet.open(doc_path)

  # Select sheet
  active_sheet = resolve_sheet(ss, sheet)
  return active_sheet if active_sheet.is_a?(Hash) && active_sheet[:error]

  ss.default_sheet = active_sheet

  total_rows  = ss.last_row.to_i
  first_row   = ss.first_row.to_i
  header_row  = headers ? ss.row(first_row).map { |h| h.to_s.strip } : nil

  # Determine data rows (skip header if using headers)
  data_start  = headers ? first_row + 1 : first_row
  all_indices = (data_start..total_rows).to_a

  selected = row_range ? filter_rows(all_indices, row_range) : all_indices
  invalid  = selected.reject { |n| n >= first_row && n <= total_rows }
  valid    = selected.select { |n| n >= first_row && n <= total_rows }

  rows = valid.map do |n|
    raw = ss.row(n)
    if headers && header_row
      header_row.zip(raw).to_h
    else
      raw
    end
  end

  @logger.info("SpreadsheetReaderTool: read #{rows.size} rows from '#{active_sheet}'")

  {
    doc_path:       doc_path,
    format:         ext,
    available_sheets: ss.sheets,
    active_sheet:   active_sheet,
    total_rows:     total_rows,
    header_row:     header_row,
    requested_range: row_range || "all",
    invalid_rows:   invalid,
    row_count:      rows.size,
    rows:           rows
  }
rescue => e
  @logger.error("SpreadsheetReaderTool failed for '#{doc_path}': #{e.message}")
  { error: e.message }
end