Class: Rubino::Context::TokenBudget
- Inherits: Object
- Defined in: lib/rubino/context/token_budget.rb
Overview
Manages token budget calculations and determines when compaction is needed.
Constant Summary
- CHARS_PER_TOKEN = 4
  Rough approximation of the number of characters per token.
- DEFAULT_CONTEXT_WINDOW = 128_000
  Fallback when the user hasn't pinned `model.context_length` in config. Generous-but-safe; truncation kicks in via `needs_compaction?` long before the real provider limit would be hit.
Instance Attribute Summary
- #context_window ⇒ Object (readonly)
  Returns the value of attribute context_window.
Instance Method Summary
- #available_tokens ⇒ Object
  Returns the max tokens available for conversation.
- #compaction_target ⇒ Object
  Returns the target token count after compaction.
- #critical?(messages) ⇒ Boolean
  Returns true if critically close to context limit.
- #estimate_tokens(messages) ⇒ Object
  Estimates token count for a set of messages.
- #initialize(model_id:, config:) ⇒ TokenBudget (constructor)
  A new instance of TokenBudget.
- #needs_compaction?(messages) ⇒ Boolean
  Returns true if the messages exceed the compaction threshold.
Constructor Details
#initialize(model_id:, config:) ⇒ TokenBudget
Returns a new instance of TokenBudget.
# File 'lib/rubino/context/token_budget.rb', line 13
def initialize(model_id:, config:)
  @model_id = model_id
  @config = config
  @context_window = determine_context_window
end
Instance Attribute Details
#context_window ⇒ Object (readonly)
Returns the value of attribute context_window.
# File 'lib/rubino/context/token_budget.rb', line 19
def context_window
  @context_window
end
Instance Method Details
#available_tokens ⇒ Object
Returns the maximum number of tokens available for the conversation: the `context.max_tokens` config override when it is set, otherwise the context window.
# File 'lib/rubino/context/token_budget.rb', line 22
def available_tokens
  override = @config.dig("context", "max_tokens")
  override || @context_window
end
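The fallback in the method above can be tried in isolation. This is a minimal sketch, not the library's code path: it assumes the config behaves like a nested Hash with string keys (the real config object only needs to respond to `dig`), and the stand-alone `available_tokens` helper with explicit arguments is hypothetical.

```ruby
DEFAULT_CONTEXT_WINDOW = 128_000 # value of the constant documented above

# Hypothetical stand-alone version: the override wins when present,
# otherwise the context window is used.
def available_tokens(config, context_window)
  config.dig("context", "max_tokens") || context_window
end

# Assumed config shape: a plain Hash with string keys.
with_override    = { "context" => { "max_tokens" => 32_000 } }
without_override = {}

puts available_tokens(with_override, DEFAULT_CONTEXT_WINDOW)    # 32000
puts available_tokens(without_override, DEFAULT_CONTEXT_WINDOW) # 128000
```

Judging only from the source shown here, the override is returned as-is, so a value larger than the context window would not be clamped by this method.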
#compaction_target ⇒ Object
Returns the target token count after compaction: `available_tokens` multiplied by the configured `compression_target_ratio`, truncated to an integer.
# File 'lib/rubino/context/token_budget.rb', line 52
def compaction_target
  (available_tokens * @config.compression_target_ratio).to_i
end
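As a worked example of the arithmetic, the sketch below passes the inputs in explicitly. The 0.5 ratio is an assumed example value, not a documented default, and the stand-alone helper is hypothetical.

```ruby
# Same arithmetic as compaction_target, with explicit inputs.
# The ratio used below (0.5) is an assumed example value.
def compaction_target(available_tokens, ratio)
  (available_tokens * ratio).to_i
end

puts compaction_target(128_000, 0.5) # 64000
puts compaction_target(1_001, 0.5)   # 500 (500.5 is truncated toward zero by to_i)
```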
#critical?(messages) ⇒ Boolean
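Returns true if the estimated token count exceeds the gateway limit (`available_tokens` multiplied by `compression_gateway_threshold`), meaning the conversation is critically close to the context limit. Always returns false when compression is disabled.
# File 'lib/rubino/context/token_budget.rb', line 43
def critical?(messages)
  return false unless @config.compression_enabled?

  estimated = estimate_tokens(messages)
  gateway = (available_tokens * @config.compression_gateway_threshold).to_i
  estimated > gateway
end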
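Two details of this check are easy to miss: the comparison is strict, and a disabled compression setting short-circuits to false. The sketch below illustrates both. `GatewayConfig` is a hypothetical stand-in for the real config object, the 0.75 gateway ratio is an assumed example value, and the token estimate is passed in directly rather than computed from messages.

```ruby
# Hypothetical stand-in for the config object; only the two methods that
# critical? calls are modeled.
GatewayConfig = Struct.new(:enabled, :compression_gateway_threshold) do
  def compression_enabled?
    enabled
  end
end

# Same comparison as critical?, with the token estimate supplied by the caller.
def critical?(estimated, available_tokens, config)
  return false unless config.compression_enabled?

  gateway = (available_tokens * config.compression_gateway_threshold).to_i
  estimated > gateway
end

enabled  = GatewayConfig.new(true, 0.75)  # 0.75 is an assumed example ratio
disabled = GatewayConfig.new(false, 0.75)

puts critical?(750, 1_000, enabled)  # false: 750 is not strictly above the gateway of 750
puts critical?(751, 1_000, enabled)  # true
puts critical?(751, 1_000, disabled) # false: compression is disabled
```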
#estimate_tokens(messages) ⇒ Object
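Estimates the token count for a set of messages by dividing the total length of their `content` strings by CHARS_PER_TOKEN and rounding up. Messages without content count as zero.
# File 'lib/rubino/context/token_budget.rb', line 28
def estimate_tokens(messages)
  total_chars = messages.sum { |m| (m[:content] || "").length }
  (total_chars.to_f / CHARS_PER_TOKEN).ceil
end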
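A worked example makes the heuristic concrete. The sketch below is a stand-alone copy of the method, assuming messages are plain Hashes with symbol keys; the sample messages and their `role` values are made up for illustration.

```ruby
CHARS_PER_TOKEN = 4 # same rough ratio as the constant documented above

# Stand-alone copy of the heuristic: total content characters / 4, rounded up.
def estimate_tokens(messages)
  total_chars = messages.sum { |m| (m[:content] || "").length }
  (total_chars.to_f / CHARS_PER_TOKEN).ceil
end

messages = [
  { role: "user", content: "Hello there" },         # 11 characters
  { role: "assistant", content: nil },              # nil content counts as 0
  { role: "user", content: "Summarize this file" }  # 19 characters
]

puts estimate_tokens(messages) # 30 chars / 4 = 7.5, rounded up to 8
```

Because the lookup uses `m[:content]`, a plain Hash with string keys (`"content"`) would contribute zero characters under this sketch's assumptions. `String#length` counts characters, not bytes, so multi-byte text is not weighted more heavily.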
#needs_compaction?(messages) ⇒ Boolean
Returns true if the estimated token count for the messages exceeds the compaction threshold (`available_tokens` multiplied by `compression_threshold`). Always returns false when compression is disabled.
# File 'lib/rubino/context/token_budget.rb', line 34
def needs_compaction?(messages)
  return false unless @config.compression_enabled?

  estimated = estimate_tokens(messages)
  threshold = (available_tokens * @config.compression_threshold).to_i
  estimated > threshold
end
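To see how the compaction threshold relates to the gateway limit used by `critical?`, the sketch below classifies an estimated token count against both. It is illustrative only: `SketchConfig`, `budget_status`, and the `:ok` / `:compact` / `:critical` labels are hypothetical, the 0.5 and 0.75 ratios are assumed example values, the gateway is assumed to sit above the compaction threshold, and the `compression_enabled?` check is omitted for brevity.

```ruby
# Hypothetical config stand-in. Both ratios are assumed example values.
SketchConfig = Struct.new(:compression_threshold, :compression_gateway_threshold)

# Classifies an estimated token count against both limits.
# The returned labels are illustrative and not part of the documented API.
def budget_status(estimated, available_tokens, config)
  threshold = (available_tokens * config.compression_threshold).to_i
  gateway   = (available_tokens * config.compression_gateway_threshold).to_i

  return :critical if estimated > gateway   # what critical? would report
  return :compact  if estimated > threshold # what needs_compaction? would report

  :ok
end

config = SketchConfig.new(0.5, 0.75)

[400, 600, 800].each do |estimated|
  puts "#{estimated} tokens -> #{budget_status(estimated, 1_000, config)}"
end
# 400 tokens -> ok
# 600 tokens -> compact
# 800 tokens -> critical
```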