Class: Pikuri::Agent::Tokens

Inherits: Data < Object

Defined in: lib/pikuri/agent/tokens.rb

Overview

Provider-reported token usage for a single assistant turn, copied off a RubyLLM::Message’s tokens block. Delivered to listeners through Listener::MessageListener#on_tokens rather than the Message stream — it’s metadata about an exchange, not an event in it.

Emitted by Listener::MessageListener#dispatch_chat_message on every assistant after_message event, including pure tool-call turns where Message::Assistant would have been filtered out for empty content. Those are exactly the turns where context-window growth matters most.

All counts are Integer or nil. nil means the provider did not report that field — common with local llama.cpp / Ollama servers that leave parts of the OpenAI usage block empty. Listeners treat nil as zero.
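
A minimal sketch of the receiving side, assuming only the on_tokens hook named above and the readers documented below; how the listener gets wired up is out of scope here, and UsageLogger is a hypothetical name:

class UsageLogger
  # Receives one Tokens value per assistant turn; nil fields count as zero.
  def on_tokens(tokens)
    input  = tokens.input  || 0
    cached = tokens.cached || 0
    output = tokens.output || 0
    puts format('%s: %d in (%d cached), %d out', tokens.model_id, input, cached, output)
  end
end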

The fields input, cached, and cache_creation are exclusive portions of this turn’s full prompt under the shape ruby_llm exposes for llama.cpp and Anthropic: they sum to the total prompt size processed on this request (worked through in the sketch after this field list). OpenAI proper nests cached_tokens inside its prompt_tokens instead; if pikuri ever talks to OpenAI directly, the sum formula needs revisiting.

  • input — newly-processed (uncached) prompt tokens this turn.

  • output — tokens in this single assistant reply.

  • cached — portion of this turn’s prompt served from the provider’s prompt cache. Still counts against the context window (caching is a speed/cost optimization, not a context-savings mechanism).

  • cache_creation — portion of this turn’s prompt written into the prompt cache. Anthropic-specific; usually nil on OpenAI-compatible local servers.

  • thinking — extended-thinking (Anthropic) or reasoning (OpenAI o-series) tokens produced on this turn. nil on providers without a reasoning channel.

  • model_id — provider-side model name as reported on the response; useful when a process targets multiple models.
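
A worked example of that sum with made-up numbers, assuming the members can be passed as keywords the way a plain Data subclass allows; the model name is illustrative only:

tokens = Pikuri::Agent::Tokens.new(
  input: 1_200,          # newly-processed prompt tokens this turn
  output: 350,           # the reply
  cached: 8_000,         # prompt served from the provider's cache
  cache_creation: 500,   # prompt written into the cache (Anthropic)
  thinking: nil,         # no reasoning channel reported
  model_id: 'claude-sonnet-4-20250514'
)

# Exclusive portions, so the full prompt processed this turn is their sum:
tokens.input + tokens.cached + tokens.cache_creation  # => 9700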

Computing “current context window size”

input + cached + cache_creation is the size of the prompt processed on this turn. Add output to get tokens consumed by the conversation through this turn — this turn’s prompt plus its reply, both of which the model will re-process on the next turn. That’s what climbs toward RubyLLM::ContextLengthExceededError and is the snapshot Listener::TokenLog#context_window_size tracks (without the output term, a long reply stays invisible in the headline until the next turn pulls it in as cached prompt).
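
A sketch of that computation, continuing the made-up numbers above; context_window_size here is a free-standing helper for illustration, not the Listener::TokenLog method it mirrors:

# Prompt processed this turn plus this turn's reply, with nil treated as zero.
def context_window_size(tokens)
  [tokens.input, tokens.cached, tokens.cache_creation, tokens.output]
    .sum { |count| count || 0 }
end

context_window_size(tokens)  # => 10050 for the example above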

Instance Attribute Summary

Instance Attribute Details

#cache_creation ⇒ Object (readonly)

Returns the value of attribute cache_creation

Returns:

  • (Object)

    the current value of cache_creation



# File 'lib/pikuri/agent/tokens.rb', line 54

def cache_creation
  @cache_creation
end

#cached ⇒ Object (readonly)

Returns the value of attribute cached

Returns:

  • (Object)

    the current value of cached



# File 'lib/pikuri/agent/tokens.rb', line 54

def cached
  @cached
end

#input ⇒ Object (readonly)

Returns the value of attribute input

Returns:

  • (Object)

    the current value of input



# File 'lib/pikuri/agent/tokens.rb', line 54

def input
  @input
end

#model_id ⇒ Object (readonly)

Returns the value of attribute model_id

Returns:

  • (Object)

    the current value of model_id



# File 'lib/pikuri/agent/tokens.rb', line 54

def model_id
  @model_id
end

#output ⇒ Object (readonly)

Returns the value of attribute output

Returns:

  • (Object)

    the current value of output



# File 'lib/pikuri/agent/tokens.rb', line 54

def output
  @output
end

#thinking ⇒ Object (readonly)

Returns the value of attribute thinking

Returns:

  • (Object)

    the current value of thinking



# File 'lib/pikuri/agent/tokens.rb', line 54

def thinking
  @thinking
end