Module: OllamaChat::Utils::TokenEstimator

Defined in:
lib/ollama_chat/utils/token_estimator.rb

Overview

Provides crude estimations of token counts for various models.

Since actual tokenization depends on the specific model’s BPE/SentencePiece vocabulary, these methods provide a “best effort” approximation based on average character/byte ratios.

Class Method Summary

Class Method Details

.estimate(bytes) ⇒ Integer

Estimates the token count from a byte size. Assumes an average of 3.5 bytes per token and rounds the result up to the next whole token.

Parameters:

  • bytes (Integer)

    The size of the content in bytes

Returns:

  • (Integer)

    The estimated number of tokens



# File 'lib/ollama_chat/utils/token_estimator.rb', line 12

def self.estimate(bytes)
  (bytes.to_f / 3.5).ceil
end
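
Usage sketch

The following is a minimal sketch of how the estimator might be called, not an excerpt from the library. The method body is copied from the listing above so the snippet runs without loading the gem. The nested `module` layout is an assumption inferred from the module name, and feeding it `String#bytesize` is one plausible way to obtain the byte count.

```ruby
# Stand-in copy of the method shown above (assumed module nesting),
# so this snippet runs on its own without requiring ollama_chat.
module OllamaChat
  module Utils
    module TokenEstimator
      def self.estimate(bytes)
        (bytes.to_f / 3.5).ceil
      end
    end
  end
end

estimator = OllamaChat::Utils::TokenEstimator

# ASCII text: one byte per character, so 13 bytes here.
ascii_estimate = estimator.estimate("Hello, world!".bytesize)

# Multibyte text: "\u00e9" (é) takes 2 bytes in UTF-8, so 6 bytes for 5 characters.
multibyte_estimate = estimator.estimate("h\u00e9llo".bytesize)

# Empty content yields zero tokens.
empty_estimate = estimator.estimate(0)

puts ascii_estimate      # 13 / 3.5 = 3.71..., rounded up to 4
puts multibyte_estimate  # 6 / 3.5 = 1.71..., rounded up to 2
puts empty_estimate      # 0
```

Because the estimate is driven by bytes rather than characters, text with many multibyte characters produces a higher count than ASCII text of the same length. Any non-empty input yields at least one token, since the result is rounded up.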