Class: LlmDocsBuilder::TokenEstimator
- Inherits: Object
  - Object
  - LlmDocsBuilder::TokenEstimator
- Defined in: lib/llm_docs_builder/token_estimator.rb
Overview
Estimates the token count of text content using a character-based approximation.
Provides token estimation without requiring external tokenizer dependencies. Uses the common heuristic that roughly 4 characters equal 1 token for English text, which works reasonably well for documentation and Markdown content. The result is an approximation, not an exact tokenizer count.
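The heuristic can be sketched without the gem. The `approx_tokens` helper and the `CHARS_PER_TOKEN` constant below are hypothetical stand-ins that mirror the documented calculation (character count divided by the ratio, then rounded); they are not part of the library.

```ruby
# Hypothetical helper mirroring the documented heuristic; not part of the gem.
CHARS_PER_TOKEN = 4.0 # same value as the documented DEFAULT_CHARS_PER_TOKEN

def approx_tokens(text, chars_per_token = CHARS_PER_TOKEN)
  # Nil or empty input counts as zero tokens.
  return 0 if text.nil? || text.empty?

  # Divide the character count by the ratio and round to the nearest integer.
  (text.length / chars_per_token.to_f).round
end

puts approx_tokens("hello world!")  # 12 characters / 4.0 = 3
puts approx_tokens("a" * 10, 2)     # 10 characters / 2.0 = 5
```

A lower ratio yields a higher estimate, which may suit content that tokenizes less efficiently than English prose; the right value for a given tokenizer is an assumption to validate against real counts.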
Constant Summary
- DEFAULT_CHARS_PER_TOKEN = 4.0
Default number of characters per token.
Instance Attribute Summary
- #chars_per_token ⇒ Float (readonly)
Characters per token ratio.
Class Method Summary
- .estimate(content, chars_per_token: DEFAULT_CHARS_PER_TOKEN) ⇒ Integer
Estimate token count (class method for convenience).
Instance Method Summary
- #estimate(content) ⇒ Integer
Estimate token count for given content.
- #initialize(chars_per_token: DEFAULT_CHARS_PER_TOKEN) ⇒ TokenEstimator (constructor)
Initialize a new token estimator.
Constructor Details
#initialize(chars_per_token: DEFAULT_CHARS_PER_TOKEN) ⇒ TokenEstimator
Initialize a new token estimator.
# File 'lib/llm_docs_builder/token_estimator.rb', line 29
def initialize(chars_per_token: DEFAULT_CHARS_PER_TOKEN)
  @chars_per_token = chars_per_token.to_f
end
Instance Attribute Details
#chars_per_token ⇒ Float (readonly)
Returns the characters per token ratio.
# File 'lib/llm_docs_builder/token_estimator.rb', line 24
def chars_per_token
  @chars_per_token
end
Class Method Details
.estimate(content, chars_per_token: DEFAULT_CHARS_PER_TOKEN) ⇒ Integer
Estimate token count (class method for convenience).
# File 'lib/llm_docs_builder/token_estimator.rb', line 48
def self.estimate(content, chars_per_token: DEFAULT_CHARS_PER_TOKEN)
  new(chars_per_token: chars_per_token).estimate(content)
end
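As an illustration of the two entry points, the sketch below calls the class method for a one-off estimate and reuses an instance with a custom ratio. The class body is reproduced from the listings on this page (with `attr_reader` standing in for the explicit reader) so the snippet runs on its own; the ratio of 3 is an arbitrary value chosen for the example.

```ruby
# Class body reproduced from the listings on this page so the snippet is
# self-contained; skip this definition when the gem itself is loaded.
module LlmDocsBuilder
  class TokenEstimator
    DEFAULT_CHARS_PER_TOKEN = 4.0

    # Equivalent to the explicit reader method shown in the listing.
    attr_reader :chars_per_token

    def initialize(chars_per_token: DEFAULT_CHARS_PER_TOKEN)
      @chars_per_token = chars_per_token.to_f
    end

    def estimate(content)
      return 0 if content.nil? || content.empty?

      (content.length / chars_per_token).round
    end

    def self.estimate(content, chars_per_token: DEFAULT_CHARS_PER_TOKEN)
      new(chars_per_token: chars_per_token).estimate(content)
    end
  end
end

text = "x" * 120

# One-off estimate through the class method, using the default ratio.
puts LlmDocsBuilder::TokenEstimator.estimate(text) # 120 / 4.0 = 30

# Reusable instance with a custom ratio (3 is an arbitrary illustrative value).
estimator = LlmDocsBuilder::TokenEstimator.new(chars_per_token: 3)
puts estimator.estimate(text)                      # 120 / 3.0 = 40
```

Because the class method builds a new estimator on every call, holding on to one instance is the more natural choice when many documents share the same ratio.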
Instance Method Details
#estimate(content) ⇒ Integer
Estimate token count for given content.
# File 'lib/llm_docs_builder/token_estimator.rb', line 37
def estimate(content)
  return 0 if content.nil? || content.empty?

  (content.length / chars_per_token).round
end
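A few edge cases follow from the method body above and from standard Ruby semantics for `String#length` and `Float#round`. The sketch below uses a hypothetical stand-in, `estimate_like`, that mirrors the listed logic; it is not part of the library, and the zero-ratio case is an inference from the listing rather than documented behavior.

```ruby
# Hypothetical stand-in that mirrors the #estimate body listed above.
def estimate_like(content, chars_per_token = 4.0)
  return 0 if content.nil? || content.empty?

  (content.length / chars_per_token.to_f).round
end

# Very short input can round down to zero tokens.
puts estimate_like("a")   # 0.25 rounds to 0
puts estimate_like("ab")  # 0.5 rounds to 1 (Float#round rounds half away from zero)

# String#length counts characters, not bytes, so multibyte text is measured
# by character count.
accented = "h\u00e9llo"       # 5 characters, 6 bytes in UTF-8
puts estimate_like(accented)  # 1.25 rounds to 1

# A ratio of zero makes the division yield Infinity, which cannot be rounded.
begin
  estimate_like("abc", 0)
rescue FloatDomainError => e
  puts "zero ratio rejected: #{e.class}"
end
```

Callers that accept a user-supplied ratio may want to validate that it is positive before constructing an estimator.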