Class: Phronomy::Context::TokenBudget

Inherits:
Object
Defined in:
lib/phronomy/context/token_budget.rb

Overview

Calculates the effective token budget available for conversation history and injected knowledge within a single LLM request.

The window is divided as follows:

context_window (total)
├─ max_output_tokens       (reserved for model output)
├─ overhead                (reserved for system prompt + tool definitions)
└─ effective_input_limit   (available for memory + knowledge)

Examples:

Auto-derive from RubyLLM model registry

budget = Phronomy::Context::TokenBudget.new(model: "claude-3-5-sonnet-20241022")

Explicit values (useful for local / unknown models)

budget = Phronomy::Context::TokenBudget.new(
  context_window:    32_768,
  max_output_tokens: 4_096
)

With overhead for instructions + tool definitions

budget = Phronomy::Context::TokenBudget.new(
  model:    "gpt-4o",
  overhead: 800
)

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(model: nil, context_window: nil, max_output_tokens: nil, overhead: 0) ⇒ TokenBudget

Returns a new instance of TokenBudget.

Parameters:

  • model (String, nil) (defaults to: nil)

    model identifier looked up in the RubyLLM model registry

  • context_window (Integer, nil) (defaults to: nil)

    explicit total token limit

  • max_output_tokens (Integer, nil) (defaults to: nil)

    explicit output reservation; when nil and model: is given, falls back to the model's max_output_tokens from the RubyLLM registry

  • overhead (Integer) (defaults to: 0)

    tokens reserved for instructions/tools



# File 'lib/phronomy/context/token_budget.rb', line 48

def initialize(model: nil, context_window: nil, max_output_tokens: nil, overhead: 0)
  @overhead = overhead.to_i

  if context_window
    # Explicit values — no registry lookup needed.
    @context_window = context_window.to_i
    @max_output_tokens = (max_output_tokens || 0).to_i
  elsif model
    ruby_llm_model = lookup_model!(model)
    @context_window = ruby_llm_model.context_window.to_i
    @max_output_tokens = (max_output_tokens || ruby_llm_model.max_output_tokens).to_i
  else
    raise ArgumentError, "Provide either model: or context_window:"
  end
end
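As a standalone sketch of the explicit-value path above (plain arithmetic with assumed numbers, not the real class): when context_window: is supplied without max_output_tokens:, the output reservation collapses to 0 via `(max_output_tokens || 0).to_i`, so only overhead is subtracted from the window.

```ruby
# Explicit-value path: no registry lookup; a nil max_output_tokens
# becomes 0, exactly as in the constructor body.
context_window    = 32_768
max_output_tokens = (nil || 0).to_i   # => 0
overhead          = 800

# Mirrors effective_input_limit: window minus both reservations,
# clamped at zero.
effective_input_limit = [context_window - max_output_tokens - overhead, 0].max
# => 31_968
```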

Instance Attribute Details

#context_windowInteger (readonly)

Returns total token limit of the model.

Returns:

  • (Integer)

    total token limit of the model



# File 'lib/phronomy/context/token_budget.rb', line 35

def context_window
  @context_window
end

#max_output_tokensInteger (readonly)

Returns tokens reserved for model output.

Returns:

  • (Integer)

    tokens reserved for model output



# File 'lib/phronomy/context/token_budget.rb', line 38

def max_output_tokens
  @max_output_tokens
end

#overheadInteger (readonly)

Returns tokens reserved for instructions and tool definitions.

Returns:

  • (Integer)

    tokens reserved for instructions and tool definitions



# File 'lib/phronomy/context/token_budget.rb', line 41

def overhead
  @overhead
end

Instance Method Details

#available(used: 0) ⇒ Integer

Tokens still available after used tokens have been allocated.

Parameters:

  • used (Integer) (defaults to: 0)

    tokens already committed (e.g. from knowledge injection)

Returns:

  • (Integer)

    remaining tokens (always >= 0)



# File 'lib/phronomy/context/token_budget.rb', line 76

def available(used: 0)
  [effective_input_limit - used.to_i, 0].max
end
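A quick illustration of the clamping (plain arithmetic mirroring the method body, with made-up numbers): the result never goes negative, even when used exceeds the effective input limit.

```ruby
effective_input_limit = 1_000

# Within budget: a simple subtraction.
within = [effective_input_limit - 400, 0].max
# => 600

# Over budget: clamped to zero rather than going negative.
over = [effective_input_limit - 1_500, 0].max
# => 0
```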

#effective_input_limitInteger

Tokens available for conversation history and knowledge after reservations. Always >= 0.

Returns:

  • (Integer)


# File 'lib/phronomy/context/token_budget.rb', line 68

def effective_input_limit
  [@context_window - @max_output_tokens - @overhead, 0].max
end
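Putting the pieces together, a worked example with assumed values (a 32,768-token window, 4,096 tokens reserved for output, 800 of overhead), mirroring the method body above:

```ruby
context_window    = 32_768
max_output_tokens = 4_096
overhead          = 800

# Tokens left for conversation history + injected knowledge.
effective_input_limit = [context_window - max_output_tokens - overhead, 0].max
# => 27_872
```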