Class: RubynCode::Context::Manager
- Inherits: Object
- Defined in: lib/rubyn_code/context/manager.rb
Overview
Orchestrates context management for a session. Tracks cumulative token usage from LLM responses and triggers compaction strategies when the estimated context size exceeds the configured threshold.
Constant Summary collapse
- CHARS_PER_TOKEN = 4
- MICRO_COMPACT_RATIO_CACHED = 0.7
Fraction of the compaction threshold at which micro-compact kicks in. Running it too early busts the prompt cache prefix (mutated messages change the hash, invalidating server-side cached tokens). Anthropic has prompt caching, so compaction is delayed (0.7); OpenAI has no cache prefix to protect, so it compacts earlier (0.5).
- MICRO_COMPACT_RATIO_UNCACHED = 0.5
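As a quick sanity check on the two ratios, this sketch computes the estimated-token counts at which micro-compact would kick in. The threshold value here is hypothetical; the real default comes from Config::Defaults::CONTEXT_THRESHOLD_TOKENS.

```ruby
# Hypothetical threshold of 100k tokens (real value: Config::Defaults::CONTEXT_THRESHOLD_TOKENS).
threshold = 100_000

micro_compact_ratio_cached   = 0.7  # Anthropic: protect the prompt-cache prefix
micro_compact_ratio_uncached = 0.5  # OpenAI: no cache prefix, compact earlier

trigger_cached   = (threshold * micro_compact_ratio_cached).to_i
trigger_uncached = (threshold * micro_compact_ratio_uncached).to_i

puts "cached provider:   micro-compact at ~#{trigger_cached} est. tokens"
puts "uncached provider: micro-compact at ~#{trigger_uncached} est. tokens"
```

So a cache-aware provider tolerates roughly 20k more estimated tokens before old messages get mutated.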
Instance Attribute Summary collapse
-
#current_turn ⇒ Object
readonly
Returns the value of attribute current_turn.
-
#llm_client ⇒ Object
writeonly
Sets the attribute llm_client.
-
#total_input_tokens ⇒ Object
readonly
Returns the value of attribute total_input_tokens.
-
#total_output_tokens ⇒ Object
readonly
Returns the value of attribute total_output_tokens.
Instance Method Summary collapse
-
#advance_turn! ⇒ Object
Advances the turn counter.
- #check_compaction!(conversation) ⇒ Object
Runs micro-compaction every turn and auto-compaction when the context exceeds the threshold.
-
#estimated_tokens(messages) ⇒ Integer
Rough estimate of token count for a set of messages based on their JSON-serialized character length (~4 chars per token).
-
#initialize(threshold: Config::Defaults::CONTEXT_THRESHOLD_TOKENS, llm_client: nil) ⇒ Manager
constructor
A new instance of Manager.
-
#needs_compaction?(messages) ⇒ Boolean
Returns true if the estimated token count exceeds the threshold.
-
#reset! ⇒ void
Resets cumulative token counters to zero.
-
#track_usage(usage) ⇒ Object
Accumulates token counts from an LLM response usage object.
Constructor Details
#initialize(threshold: Config::Defaults::CONTEXT_THRESHOLD_TOKENS, llm_client: nil) ⇒ Manager
Returns a new instance of Manager.
# File 'lib/rubyn_code/context/manager.rb', line 17

def initialize(threshold: Config::Defaults::CONTEXT_THRESHOLD_TOKENS, llm_client: nil)
  @threshold = threshold
  @llm_client = llm_client
  @total_input_tokens = 0
  @total_output_tokens = 0
  @last_compaction_turn = -1
  @current_turn = 0
end
Instance Attribute Details
#current_turn ⇒ Object (readonly)
Returns the value of attribute current_turn.
# File 'lib/rubyn_code/context/manager.rb', line 13

def current_turn
  @current_turn
end
#llm_client=(value) ⇒ Object (writeonly)
Sets the attribute llm_client.
# File 'lib/rubyn_code/context/manager.rb', line 26

def llm_client=(value)
  @llm_client = value
end
#total_input_tokens ⇒ Object (readonly)
Returns the value of attribute total_input_tokens.
# File 'lib/rubyn_code/context/manager.rb', line 13

def total_input_tokens
  @total_input_tokens
end
#total_output_tokens ⇒ Object (readonly)
Returns the value of attribute total_output_tokens.
# File 'lib/rubyn_code/context/manager.rb', line 13

def total_output_tokens
  @total_output_tokens
end
Instance Method Details
#advance_turn! ⇒ Object
Advances the turn counter. Call once per iteration so that duplicate compaction calls within the same turn are skipped.
# File 'lib/rubyn_code/context/manager.rb', line 30

def advance_turn!
  @current_turn += 1
end
#check_compaction!(conversation) ⇒ Object
Runs micro-compaction every turn and auto-compaction when the context exceeds the threshold. Expects a conversation object that responds to #messages and #messages= (or #replace_messages).
# File 'lib/rubyn_code/context/manager.rb', line 76

def check_compaction!(conversation)
  # Guard: skip if compaction already ran this turn
  return if @last_compaction_turn == @current_turn
  @last_compaction_turn = @current_turn

  messages = conversation.messages

  # Step 1: Zero-cost micro-compact — but only when we're approaching
  # the compaction threshold. Running it every turn mutates old messages,
  # which invalidates the prompt cache prefix and wastes tokens.
  est = estimated_tokens(messages)
  MicroCompact.call(messages) if est > (@threshold * micro_compact_ratio)

  return unless needs_compaction?(messages)

  # Step 2: Try context collapse (snip old messages, no LLM call)
  collapsed = ContextCollapse.call(messages, threshold: @threshold)
  if collapsed
    apply_messages(conversation, collapsed)
    return
  end

  # Step 3: Full LLM-driven auto-compact (expensive, last resort)
  return unless @llm_client

  compactor = Compactor.new(llm_client: @llm_client, threshold: @threshold)
  compacted = compactor.auto_compact!(messages)
  apply_messages(conversation, compacted)
end
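The interplay of #advance_turn! and the per-turn guard can be illustrated with a minimal stand-in (plain locals and a lambda, not the real Manager): a compaction check runs at most once per turn, however many times it is called.

```ruby
# Stand-in for the turn-dedup guard in #check_compaction!.
current_turn = 0
last_compaction_turn = -1
runs = []

check = lambda do
  next if last_compaction_turn == current_turn  # already ran this turn
  last_compaction_turn = current_turn
  runs << current_turn
end

check.call
check.call            # same turn: skipped
current_turn += 1     # advance_turn!
check.call            # new turn: runs again

p runs  # => [0, 1]
```

This is why the agent loop must call #advance_turn! exactly once per iteration: without it, the guard would suppress every compaction after the first.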
#estimated_tokens(messages) ⇒ Integer
Rough estimate of token count for a set of messages based on their JSON-serialized character length (~4 chars per token).
# File 'lib/rubyn_code/context/manager.rb', line 47

def estimated_tokens(messages)
  json = JSON.generate(messages)
  (json.length.to_f / CHARS_PER_TOKEN).ceil
rescue JSON::GeneratorError
  0
end
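The heuristic is easy to reproduce standalone: serialize the messages to JSON and divide the character count by 4. The threshold below is hypothetical, purely to show how #needs_compaction? would use the estimate.

```ruby
require "json"

CHARS_PER_TOKEN = 4  # same heuristic as the Manager constant

messages = [
  { role: "user",      content: "Summarize the README." },
  { role: "assistant", content: "The README describes installation and usage." }
]

json = JSON.generate(messages)
estimate = (json.length.to_f / CHARS_PER_TOKEN).ceil

threshold = 50  # hypothetical; real default is Config::Defaults::CONTEXT_THRESHOLD_TOKENS
needs_compaction = estimate > threshold

puts "#{json.length} JSON chars ~ #{estimate} tokens (compact? #{needs_compaction})"
```

Counting the serialized form (role keys, quotes, braces included) deliberately overestimates a little, which is the safe direction for a compaction trigger.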
#needs_compaction?(messages) ⇒ Boolean
Returns true if the estimated token count exceeds the threshold.
# File 'lib/rubyn_code/context/manager.rb', line 58

def needs_compaction?(messages)
  estimated_tokens(messages) > @threshold
end
#reset! ⇒ void
This method returns an undefined value.
Resets cumulative token counters to zero.
# File 'lib/rubyn_code/context/manager.rb', line 110

def reset!
  @total_input_tokens = 0
  @total_output_tokens = 0
  @last_compaction_turn = -1
  @current_turn = 0
end
#track_usage(usage) ⇒ Object
Accumulates token counts from an LLM response usage object.
# File 'lib/rubyn_code/context/manager.rb', line 37

def track_usage(usage)
  @total_input_tokens += usage.input_tokens.to_i
  @total_output_tokens += usage.output_tokens.to_i
end
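A minimal self-contained sketch of the same accumulation, using a Struct as a stand-in for the LLM client's usage object (real usage objects come from the provider response). The #to_i calls mean a usage object with nil fields simply contributes zero.

```ruby
# Stand-in usage objects mirroring the accumulation in #track_usage.
Usage = Struct.new(:input_tokens, :output_tokens)

total_input  = 0
total_output = 0

[Usage.new(1_200, 85), Usage.new(2_340, 410), Usage.new(nil, nil)].each do |usage|
  total_input  += usage.input_tokens.to_i   # nil.to_i == 0, so missing
  total_output += usage.output_tokens.to_i  # fields are tolerated
end

p total_input   # => 3540
p total_output  # => 495
```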