Class: Phronomy::Memory::Compression::Summary

Inherits:
Base
  • Object
Defined in:
lib/phronomy/memory/compression/summary.rb

Overview

Compaction strategy that summarizes old messages with an LLM.

When the total estimated token count of the uncompacted message history exceeds +max_tokens+, all messages except the most recent +keep+ are summarized by an LLM. The original messages are preserved in Storage (via ConversationManager); this class only decides whether compaction is needed and produces the summary text.

The #compress method now returns a Hash instead of a plain Array:

  {
    messages: Array,          # context-ready message list
    compaction: Hash | nil    # { start_seq:, end_seq:, summary_text: }
                              # nil when no compaction was performed
  }

ConversationManager uses the :compaction entry to persist the compaction record in Storage, ensuring originals are never discarded.
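The following is a minimal sketch of how a caller might consume that Hash. `FakeStorage`, `save_compaction`, and `handle_result` are hypothetical names introduced only for illustration; the actual persistence API used by ConversationManager is not shown on this page.

```ruby
# Hypothetical stand-in for Storage; not part of Phronomy.
class FakeStorage
  attr_reader :compactions

  def initialize
    @compactions = []
  end

  # Records one compaction entry (hypothetical method name).
  def save_compaction(thread_id:, start_seq:, end_seq:, summary_text:)
    @compactions << {
      thread_id: thread_id,
      start_seq: start_seq,
      end_seq: end_seq,
      summary_text: summary_text
    }
  end
end

# Persists the compaction record only when one was produced, then
# returns the context-ready message list unchanged.
def handle_result(storage, thread_id, result)
  compaction = result[:compaction]
  storage.save_compaction(thread_id: thread_id, **compaction) if compaction
  result[:messages]
end

storage = FakeStorage.new

# Shape returned when no compaction was performed.
no_op = {messages: ["hi"], compaction: nil}

# Shape returned when older messages were summarized.
compacted = {
  messages: ["summary of earlier turns", "latest message"],
  compaction: {start_seq: 0, end_seq: 7, summary_text: "summary of earlier turns"}
}

handle_result(storage, "thread-1", no_op)
handle_result(storage, "thread-1", compacted)
puts storage.compactions.length
```

Only the second call stores anything, since a nil :compaction entry signals that the history was returned as-is.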

Examples:

compressor = Phronomy::Memory::Compression::Summary.new(
  max_tokens: 4000,
  summarizer_model: "gpt-4o-mini"
)
manager = Phronomy::Memory::ConversationManager.new(
  storage: storage,
  retrieval: retrieval,
  compression: compressor
)

Instance Method Summary

Constructor Details

#initialize(max_tokens: 4000, keep: 5, summarizer_model: nil, summarizer_provider: nil) ⇒ Summary

Returns a new instance of Summary.

Parameters:

  • max_tokens (Integer) (defaults to: 4000)

    token threshold above which old messages are compacted

  • keep (Integer) (defaults to: 5)

    number of recent messages to preserve verbatim

  • summarizer_model (String, nil) (defaults to: nil)

    LLM model for summarization; nil uses global default

  • summarizer_provider (Symbol, nil) (defaults to: nil)

    LLM provider; required for unregistered models



# File 'lib/phronomy/memory/compression/summary.rb', line 41

def initialize(max_tokens: 4000, keep: 5, summarizer_model: nil, summarizer_provider: nil)
  @max_tokens = max_tokens
  @keep = keep
  @summarizer_model = summarizer_model
  @summarizer_provider = summarizer_provider
end

Instance Method Details

#compress(thread_id:, messages:, seq_offset: 0) ⇒ Hash

Evaluate whether compaction is needed and produce a summary if so.

+seq_offset+ is the seq number of messages[0] in the raw history. ConversationManager passes this so the compaction record can reference the correct seq range in Storage.

Parameters:

  • thread_id (String)
  • messages (Array)

    uncompacted messages to consider

  • seq_offset (Integer) (defaults to: 0)

    seq number assigned to messages[0]

Returns:

  • (Hash)

    { messages: Array, compaction: Hash|nil }, where compaction is { start_seq:, end_seq:, summary_text: } or nil when no compaction was performed
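To illustrate how +seq_offset+ could translate into a seq range, here is a sketch of the arithmetic. It assumes contiguous seq numbers and that every message except the last +keep+ is covered by the record; `compacted_seq_range` is a hypothetical helper, and the library's private compaction logic is not shown on this page and may differ.

```ruby
# Hypothetical helper: computes the seq range a compaction record would
# cover, assuming contiguous seq numbers and that the last `keep`
# messages are left verbatim.
def compacted_seq_range(message_count:, keep:, seq_offset: 0)
  compacted = message_count - keep
  return nil if compacted <= 0 # nothing old enough to summarize

  {start_seq: seq_offset, end_seq: seq_offset + compacted - 1}
end

# 12 messages starting at seq 30, keeping the last 5: seqs 30..36 are summarized.
p compacted_seq_range(message_count: 12, keep: 5, seq_offset: 30)

# Fewer messages than `keep`: no range.
p compacted_seq_range(message_count: 4, keep: 5)
```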



# File 'lib/phronomy/memory/compression/summary.rb', line 59

def compress(thread_id:, messages:, seq_offset: 0)
  estimated = messages.sum { |m| Phronomy::Context::TokenEstimator.estimate(m.content.to_s) }

  if estimated > @max_tokens && messages.length > @keep
    compact(messages, seq_offset: seq_offset)
  else
    {messages: messages, compaction: nil}
  end
end
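The two-part condition above can be exercised in isolation with the sketch below. `Message`, `estimate_tokens`, and `needs_compaction?` are hypothetical stand-ins, and the estimator assumes roughly four characters per token; Phronomy::Context::TokenEstimator may use a different heuristic.

```ruby
# Hypothetical message type; real messages only need to respond to #content.
Message = Struct.new(:role, :content)

# Assumption: a crude estimate of about four characters per token.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Mirrors the check in #compress: compaction is needed only when the
# history is over budget AND there are more messages than `keep`.
def needs_compaction?(messages, max_tokens:, keep:)
  estimated = messages.sum { |m| estimate_tokens(m.content.to_s) }
  estimated > max_tokens && messages.length > keep
end

short = [Message.new(:user, "hello")]
long = Array.new(10) { Message.new(:user, "x" * 400) }

puts needs_compaction?(short, max_tokens: 100, keep: 5)         # under budget
puts needs_compaction?(long, max_tokens: 100, keep: 5)          # over budget, enough messages
puts needs_compaction?(long.first(3), max_tokens: 100, keep: 5) # over budget, too few messages
```

The third case shows why the length guard matters: a few very long messages can exceed +max_tokens+ while leaving nothing older than the +keep+ window to summarize.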