Class: RubyPi::Context::Compaction

Inherits: Object
Defined in:
lib/ruby_pi/context/compaction.rb

Overview

Manages context window size by summarizing older messages when the estimated token count exceeds a configurable threshold. The most recent N messages are always preserved to maintain conversational coherence.

Examples:

Configuring compaction

compaction = RubyPi::Context::Compaction.new(
  max_tokens: 8000,
  summary_model: model,
  preserve_last_n: 4
)
compacted = compaction.compact(messages, system_prompt)

Constant Summary

CHARS_PER_TOKEN = 4

Average characters per token — a rough heuristic that avoids the need for provider-specific tokenizers. Errs on the conservative side.
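As a standalone sketch of the heuristic (not part of the class API): the estimate is simply character count divided by CHARS_PER_TOKEN, rounded up.

```ruby
CHARS_PER_TOKEN = 4

# Roughly one token per 4 characters, rounded up so short strings
# still count as at least one token.
def rough_token_estimate(text)
  (text.length.to_f / CHARS_PER_TOKEN).ceil
end

rough_token_estimate("Hello, world!") # 13 chars => 4 tokens
```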

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(max_tokens: 8000, summary_model:, preserve_last_n: 4) ⇒ Compaction

Creates a new Compaction instance.

Parameters:

  • max_tokens (Integer) (defaults to: 8000)

    trigger compaction above this token estimate (default: 8000)

  • summary_model (RubyPi::LLM::BaseProvider)

    model for summarization

  • preserve_last_n (Integer) (defaults to: 4)

    always keep the last N messages (default: 4)

# File 'lib/ruby_pi/context/compaction.rb', line 51

def initialize(max_tokens: 8000, summary_model:, preserve_last_n: 4)
  @max_tokens = max_tokens
  @summary_model = summary_model
  @preserve_last_n = preserve_last_n
  @emitter = nil
end

Instance Attribute Details

#emitter ⇒ #emit?

Returns optional event emitter for :compaction events.

Returns:

  • (#emit, nil)

    optional event emitter for :compaction events

# File 'lib/ruby_pi/context/compaction.rb', line 42

def emitter
  @emitter
end
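Per the return type above, any object that responds to #emit(event, **payload) can serve as the emitter. A minimal hypothetical emitter that records events (the LoggingEmitter class is an illustration, not part of RubyPi):

```ruby
# Minimal duck-typed emitter: anything responding to #emit works.
class LoggingEmitter
  attr_reader :events

  def initialize
    @events = []
  end

  # Record each event name with its keyword payload.
  def emit(event, **payload)
    @events << [event, payload]
  end
end

emitter = LoggingEmitter.new
emitter.emit(:compaction, dropped_count: 3, summary: "earlier discussion")
emitter.events.first # => [:compaction, {dropped_count: 3, summary: "earlier discussion"}]
```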

#max_tokens ⇒ Integer (readonly)

Returns the token threshold above which compaction triggers.

Returns:

  • (Integer)

    the token threshold above which compaction triggers

# File 'lib/ruby_pi/context/compaction.rb', line 33

def max_tokens
  @max_tokens
end

#preserve_last_n ⇒ Integer (readonly)

Returns number of recent messages always preserved.

Returns:

  • (Integer)

    number of recent messages always preserved

# File 'lib/ruby_pi/context/compaction.rb', line 39

def preserve_last_n
  @preserve_last_n
end

#summary_model ⇒ RubyPi::LLM::BaseProvider (readonly)

Returns the model used to generate summaries.

Returns:

  • (RubyPi::LLM::BaseProvider)

    the model used to generate summaries
# File 'lib/ruby_pi/context/compaction.rb', line 36

def summary_model
  @summary_model
end

Instance Method Details

#compact(messages, system_prompt) ⇒ Array<Hash>?

Compacts the message history if the estimated token count exceeds the threshold. Returns the compacted messages array, or nil if no compaction was needed.

The compaction process:

  1. Estimate total tokens for system_prompt + all messages.

  2. If under threshold, return nil (no compaction needed).

  3. Split messages into “droppable” (older) and “preserved” (recent).

  4. Summarize the droppable messages via the summary model.

  5. Return a new array: [summary_message] + preserved_messages.

Parameters:

  • messages (Array<Hash>)

    the current conversation history

  • system_prompt (String)

    the system prompt (included in estimate)

Returns:

  • (Array<Hash>, nil)

    compacted messages, or nil if not needed

# File 'lib/ruby_pi/context/compaction.rb', line 72

def compact(messages, system_prompt)
  total_tokens = estimate_tokens(system_prompt, messages)
  return nil if total_tokens <= @max_tokens

  # Split into messages to summarize and messages to keep
  preserved_count = [@preserve_last_n, messages.size].min
  droppable = messages[0...(messages.size - preserved_count)]
  preserved = messages[(messages.size - preserved_count)..]

  # If there's nothing to drop, we can't compact further
  return nil if droppable.empty?

  # Generate a summary of the dropped messages
  summary = summarize(droppable)

  # Emit compaction event if an emitter is available
  @emitter&.emit(:compaction, dropped_count: droppable.size, summary: summary)

  # Build the compacted history: summary as a system-context message + preserved
  summary_message = {
    role: :system,
    content: "[Conversation Summary]\n#{summary}"
  }

  [summary_message] + preserved
end
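The droppable/preserved split in steps 3–5 can be sketched standalone, without a summary model (the split_messages helper and the sample messages are illustrative, not part of the class):

```ruby
# Split a history into [older messages to summarize, recent messages to keep].
def split_messages(messages, preserve_last_n)
  preserved_count = [preserve_last_n, messages.size].min
  cut = messages.size - preserved_count
  [messages[0...cut], messages[cut..]]
end

messages = (1..6).map { |i| { role: :user, content: "message #{i}" } }
droppable, preserved = split_messages(messages, 4)
droppable.size # => 2 (messages 1-2 would be summarized)
preserved.size # => 4 (messages 3-6 kept verbatim)
```

When the history is shorter than preserve_last_n, the droppable half is empty, which is exactly the case where #compact returns nil.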

#estimate_tokens(system_prompt, messages) ⇒ Integer

Estimates the total token count for a system prompt and message array using the character-based heuristic.

Parameters:

  • system_prompt (String)

    the system prompt text

  • messages (Array<Hash>)

    conversation messages

Returns:

  • (Integer)

    estimated token count

# File 'lib/ruby_pi/context/compaction.rb', line 105

def estimate_tokens(system_prompt, messages)
  total_chars = system_prompt.to_s.length

  messages.each do |msg|
    total_chars += msg[:content].to_s.length
    # Account for role and structural overhead (~10 tokens per message)
    total_chars += 40
  end

  (total_chars.to_f / CHARS_PER_TOKEN).ceil
end
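To make the arithmetic concrete, here is the same estimate worked through with fixed-size inputs (the sample prompt and messages are invented for illustration):

```ruby
CHARS_PER_TOKEN = 4

# Same heuristic as above: content characters plus a flat 40-char
# (~10-token) overhead per message, divided by CHARS_PER_TOKEN.
def estimate_tokens(system_prompt, messages)
  total_chars = system_prompt.to_s.length
  messages.each do |msg|
    total_chars += msg[:content].to_s.length
    total_chars += 40
  end
  (total_chars.to_f / CHARS_PER_TOKEN).ceil
end

# 100-char prompt + two 200-char messages:
# 100 + (200 + 40) * 2 = 580 chars => 145 tokens
estimate_tokens("a" * 100, [{ content: "b" * 200 }, { content: "c" * 200 }]) # => 145
```

With the default max_tokens of 8000, compaction would not trigger until the history grows to roughly 32,000 characters.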