Class: Phronomy::LlmContextWindow::TokenBudget
- Inherits: Object
  - Object
  - Phronomy::LlmContextWindow::TokenBudget
- Defined in: lib/phronomy/llm_context_window/token_budget.rb
Overview
Calculates the effective token budget available for conversation history and injected knowledge within a single LLM request.
The window is divided as follows:
context_window (total)
├─ max_output_tokens     (reserved for model output)
├─ overhead              (reserved for system prompt + tool definitions)
└─ effective_input_limit (available for memory + knowledge)
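The division above amounts to a single subtraction. The sketch below works it through with made-up figures; the constant names and the numbers are assumptions for illustration and do not describe any real model.

```ruby
# Hypothetical figures chosen for illustration only.
EXAMPLE_CONTEXT_WINDOW    = 128_000 # total tokens the model accepts
EXAMPLE_MAX_OUTPUT_TOKENS = 4_096   # reserved for the model's reply
EXAMPLE_OVERHEAD          = 1_500   # reserved for system prompt + tool definitions

# Whatever remains is the budget for conversation history and knowledge.
EXAMPLE_EFFECTIVE_INPUT_LIMIT =
  EXAMPLE_CONTEXT_WINDOW - EXAMPLE_MAX_OUTPUT_TOKENS - EXAMPLE_OVERHEAD

puts EXAMPLE_EFFECTIVE_INPUT_LIMIT # prints 122404
```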
Instance Attribute Summary
-
#context_window ⇒ Integer
readonly
Total token limit of the model.
-
#max_output_tokens ⇒ Integer
readonly
Tokens reserved for model output.
-
#overhead ⇒ Integer
readonly
Tokens reserved for instructions and tool definitions.
Instance Method Summary
-
#available(used: 0) ⇒ Integer
private
Tokens still available after used tokens have been allocated.
-
#effective_input_limit ⇒ Integer
private
Tokens available for conversation history and knowledge after reservations.
-
#initialize(model: nil, context_window: nil, max_output_tokens: nil, overhead: 0) ⇒ TokenBudget
constructor
private
mutant:disable - multiple genuine equivalent mutations: overhead/context_window/max_output_tokens .to_i vs .to_int vs Integer() vs omitted are equivalent for Integer inputs; (max_output_tokens||0).to_i vs (max_output_tokens).to_i and (||nil).to_i are genuine because nil.to_i==0; overhead:nil default is genuine because nil.to_i==0.
Constructor Details
#initialize(model: nil, context_window: nil, max_output_tokens: nil, overhead: 0) ⇒ TokenBudget
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
mutant:disable - multiple genuine equivalent mutations: overhead/context_window/max_output_tokens .to_i vs .to_int vs Integer() vs omitted are equivalent for Integer inputs; (max_output_tokens||0).to_i vs (max_output_tokens).to_i and (||nil).to_i are genuine because nil.to_i==0; overhead:nil default is genuine because nil.to_i==0
# File 'lib/phronomy/llm_context_window/token_budget.rb', line 50
def initialize(model: nil, context_window: nil, max_output_tokens: nil, overhead: 0)
  @overhead = overhead.to_i
  if context_window
    # Explicit values; no registry lookup needed.
    @context_window = context_window.to_i
    @max_output_tokens = (max_output_tokens || 0).to_i
  elsif model
    ruby_llm_model = lookup_model!(model)
    @context_window = ruby_llm_model.context_window.to_i
    @max_output_tokens = (max_output_tokens || ruby_llm_model.max_output_tokens).to_i
  else
    raise ArgumentError, "Provide either model: or context_window:"
  end
end
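To make the constructor's branching and `to_i` coercion concrete, here is a minimal stand-in. `SketchTokenBudget` is a hypothetical class written for this example, not Phronomy's implementation. It mirrors only the explicit-values branch and the error branch from the listing above, and deliberately leaves out the registry lookup because `lookup_model!` is not shown in this page.

```ruby
# Stand-in class for illustration only; it is NOT Phronomy's implementation.
class SketchTokenBudget
  attr_reader :context_window, :max_output_tokens, :overhead

  def initialize(model: nil, context_window: nil, max_output_tokens: nil, overhead: 0)
    @overhead = overhead.to_i # nil.to_i == 0, so overhead: nil behaves like 0
    if context_window
      @context_window = context_window.to_i
      @max_output_tokens = (max_output_tokens || 0).to_i
    elsif model
      # The documented class consults a model registry here; omitted in this sketch.
      raise NotImplementedError, "registry lookup is not part of this sketch"
    else
      raise ArgumentError, "Provide either model: or context_window:"
    end
  end
end

# Numeric strings and nil are coerced through to_i.
budget = SketchTokenBudget.new(context_window: "8192", overhead: nil)
p [budget.context_window, budget.max_output_tokens, budget.overhead] # prints [8192, 0, 0]

# Supplying neither model: nor context_window: is rejected.
begin
  SketchTokenBudget.new
rescue ArgumentError => e
  puts e.message # prints Provide either model: or context_window:
end
```

Note that when `context_window:` is given without `max_output_tokens:`, nothing is reserved for output; callers who want a reservation on that path have to pass it explicitly.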
Instance Attribute Details
#context_window ⇒ Integer (readonly)
Returns total token limit of the model.
# File 'lib/phronomy/llm_context_window/token_budget.rb', line 35
def context_window
  @context_window
end
#max_output_tokens ⇒ Integer (readonly)
Returns tokens reserved for model output.
# File 'lib/phronomy/llm_context_window/token_budget.rb', line 38
def max_output_tokens
  @max_output_tokens
end
#overhead ⇒ Integer (readonly)
Returns tokens reserved for instructions and tool definitions.
# File 'lib/phronomy/llm_context_window/token_budget.rb', line 41
def overhead
  @overhead
end
Instance Method Details
#available(used: 0) ⇒ Integer
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Tokens still available after used tokens have been allocated.
mutant:disable - used.to_i vs used vs used.to_int vs Integer(used) are genuine equivalents when used is an Integer; used:nil default is genuine because nil.to_i==0==default 0
# File 'lib/phronomy/llm_context_window/token_budget.rb', line 81
def available(used: 0)
  [effective_input_limit - used.to_i, 0].max
end
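One way a caller might use this value is to check whether another chunk of knowledge still fits before injecting it. The helpers below are hypothetical (`sketch_available` and `sketch_fits?` are names invented here), and the limit of 10,000 tokens is an assumed figure. Only the clamped subtraction is taken from the listing above.

```ruby
# Hypothetical helper mirroring the clamped subtraction in #available.
def sketch_available(effective_input_limit, used: 0)
  [effective_input_limit - used.to_i, 0].max
end

# Hypothetical helper: does a chunk of the given size still fit?
def sketch_fits?(effective_input_limit, used:, chunk_tokens:)
  chunk_tokens <= sketch_available(effective_input_limit, used: used)
end

limit = 10_000 # assumed effective input limit

puts sketch_available(limit, used: 7_500)  # prints 2500
puts sketch_available(limit, used: 12_000) # prints 0 (clamped, never negative)
puts sketch_fits?(limit, used: 7_500, chunk_tokens: 3_000) # prints false
```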
#effective_input_limit ⇒ Integer
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Tokens available for conversation history and knowledge after reservations. Always >= 0.
# File 'lib/phronomy/llm_context_window/token_budget.rb', line 71
def effective_input_limit
  [@context_window - @max_output_tokens - @overhead, 0].max
end
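The "always >= 0" guarantee matters when the reservations exceed the window itself. The sketch below restates the same expression as a standalone function. The name `sketch_effective_input_limit` and the figures are assumptions for illustration.

```ruby
# Hypothetical standalone version of the clamped expression shown above.
def sketch_effective_input_limit(context_window:, max_output_tokens:, overhead:)
  [context_window - max_output_tokens - overhead, 0].max
end

# Normal case: the reservations leave room for input.
puts sketch_effective_input_limit(context_window: 8_192, max_output_tokens: 1_024, overhead: 500) # prints 6668

# Over-reserved case: 4096 - 4096 - 500 would be -500, so the result is clamped to 0.
puts sketch_effective_input_limit(context_window: 4_096, max_output_tokens: 4_096, overhead: 500) # prints 0
```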