Class: RubyLLM::Tokenizer::Backend::Approximate
- Defined in: lib/ruby_llm/tokenizer/backend/approximate.rb
Overview
Approximate tokenizer for models whose tokenizer is not published (notably Anthropic Claude). It wraps a tiktoken encoding as a stand-in, so its token counts are estimates: they are typically within about 5-15% of the model’s true count. Do not rely on them to enforce hard limits.
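Because the counts are estimates, one way to use them near a limit is to pad them with a worst-case error before comparing. The sketch below assumes the upper end of the 5-15% range quoted above as the error bound. `worst_case_tokens` and `fits_with_margin?` are hypothetical helpers, not part of RubyLLM::Tokenizer, and since the range is only described as typical, the result is still not a guarantee.

```ruby
# Hypothetical helpers; not part of RubyLLM::Tokenizer.
# Assumed error bound: the approximate count may undershoot the true
# count by up to 15% of the true count.
MAX_ERROR_PERCENT = 15

# Largest true count consistent with the approximate count, rounded up.
# Integer ceiling division avoids floating-point rounding surprises.
def worst_case_tokens(approx_count, max_error_percent: MAX_ERROR_PERCENT)
  divisor = 100 - max_error_percent
  (approx_count * 100 + divisor - 1) / divisor
end

# True when even the padded count stays within the limit.
def fits_with_margin?(approx_count, limit, max_error_percent: MAX_ERROR_PERCENT)
  worst_case_tokens(approx_count, max_error_percent: max_error_percent) <= limit
end
```

Under this assumption an approximate count of 850 pads to 1000, so it just fits a 1000-token limit, while 851 does not.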
Instance Attribute Summary
Attributes inherited from Tiktoken
Instance Method Summary
- #encode(text) ⇒ Object
- #identifier ⇒ Object
- #initialize(encoding: "o200k_base") ⇒ Approximate (constructor): A new instance of Approximate.
Methods inherited from Tiktoken
Methods inherited from Base
#analyze, #count, #decode, #truncate
Constructor Details
#initialize(encoding: "o200k_base") ⇒ Approximate
Returns a new instance of Approximate.
# File 'lib/ruby_llm/tokenizer/backend/approximate.rb', line 13
def initialize(encoding: "o200k_base")
  super
  @warned = false
  @warn_mutex = Mutex.new
end
Instance Method Details
#encode(text) ⇒ Object
# File 'lib/ruby_llm/tokenizer/backend/approximate.rb', line 19
def encode(text)
  warn_once
  super
end
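The body of `warn_once` is not shown on this page. A minimal sketch of how a thread-safe warn-once guard could be built from the `@warned` flag and `@warn_mutex` set in the constructor follows. The class name `WarnOnceDemo`, the message text, the `messages` array, and the use of `text.bytes` as a stand-in for the inherited `encode` are all assumptions for illustration, not the library's actual implementation.

```ruby
# Hypothetical stand-in class; not the library's implementation.
class WarnOnceDemo
  attr_reader :messages

  def initialize
    @warned = false
    @warn_mutex = Mutex.new
    @messages = [] # collects warnings so the behaviour can be inspected
  end

  def encode(text)
    warn_once
    text.bytes # placeholder for the inherited tiktoken encode
  end

  private

  # Records the warning on the first call only. The mutex makes the
  # check-and-set atomic, so concurrent callers cannot both pass the check.
  def warn_once
    @warn_mutex.synchronize do
      return if @warned

      @warned = true
      @messages << "token counts are approximate" # assumed wording
    end
  end
end
```

With this shape, calling `encode` from several threads at once records the warning a single time, because only the first caller to take the lock sees `@warned` as false.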
#identifier ⇒ Object
# File 'lib/ruby_llm/tokenizer/backend/approximate.rb', line 24
def identifier
  "approximate:#{encoding_name}"
end
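Since the identifier carries an "approximate:" prefix, calling code could use it to tell estimated counts from exact ones. The sketch below is one way to do that. `approximate_encoding` is a hypothetical helper, not part of RubyLLM::Tokenizer, and it assumes `encoding_name` returns the encoding string passed to the constructor (for example "o200k_base"), which this page does not show.

```ruby
# Hypothetical helper; not part of RubyLLM::Tokenizer.
APPROXIMATE_PREFIX = "approximate:"

# Returns the wrapped encoding name if the identifier has the
# "approximate:<encoding>" shape, or nil otherwise.
def approximate_encoding(identifier)
  return nil unless identifier.start_with?(APPROXIMATE_PREFIX)

  identifier.delete_prefix(APPROXIMATE_PREFIX)
end
```

A caller might treat a non-nil result as a signal to apply a safety margin before comparing the count against a limit.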