Class: LLM::Compactor

Inherits:
Object
Defined in:
lib/llm/compactor.rb

Overview

LLM::Compactor summarizes older context messages into a smaller replacement message when a context grows too large.

This work is directly inspired by the compaction approach developed by General Intelligence Systems in [Brute](github.com/general-intelligence-systems/brute).

The compactor can also use a different model from the main context by setting `model:` in the compactor config. Compaction thresholds are opt-in: provide `message_threshold:` and/or `token_threshold:` to enable policy-driven compaction. `token_threshold:` accepts either an integer token count or a percentage string like `"90%"`, which resolves against the current model's context window.
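A hedged configuration sketch: the keys below are the documented options, while the threshold values and model name are illustrative assumptions, not recommendations.

```ruby
# Illustrative LLM::Compactor configuration. The values here are
# assumptions chosen for the example, not defaults of the library.
compactor_config = {
  retention_window: 8,        # keep the 8 most recent messages verbatim
  message_threshold: 30,      # compact once more than 30 non-system messages exist
  token_threshold: "90%",     # or once usage exceeds 90% of the context window
  model: "small-summarizer"   # hypothetical summarization model name
}.freeze

# With a live context this would be wired up as:
#   compactor = LLM::Compactor.new(ctx, compactor_config)
#   compactor.compact! if compactor.compactable?
```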

Constant Summary collapse

DEFAULTS =
{
  retention_window: 8,
  model: nil
}.freeze

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(ctx, config = {}) ⇒ Compactor

Returns a new instance of Compactor.

Parameters:

  • ctx (LLM::Context)
  • config (Hash) (defaults to: {})

Options Hash (config):

  • :token_threshold (Integer, String, nil)

    Enables token-based compaction. Integer values are treated as a fixed token count. Percentage strings like `"90%"` are resolved against LLM::Context#context_window; if the context window is unknown, the percentage threshold is treated as disabled.

  • :message_threshold (Integer, nil)

    Enables message-count-based compaction.

  • :retention_window (Integer)
  • :model (String, nil)

    The model to use for the summarization request. Defaults to the current context model.
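The resolution rules for `token_threshold:` can be sketched in plain Ruby; `resolve_token_threshold` is a hypothetical helper written for illustration, not part of the public API.

```ruby
# Hypothetical helper mirroring the documented resolution rules:
# integers pass through unchanged, "N%" strings resolve against the
# context window, and a percentage with an unknown window yields nil
# (i.e. the threshold is treated as disabled).
def resolve_token_threshold(threshold, context_window)
  case threshold
  when Integer
    threshold
  when /\A(\d+)%\z/
    context_window && (context_window * Regexp.last_match(1).to_i / 100)
  end
end

resolve_token_threshold(4096, 128_000)   # => 4096
resolve_token_threshold("90%", 128_000)  # => 115200
resolve_token_threshold("90%", nil)      # => nil
```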



# File 'lib/llm/compactor.rb', line 41

def initialize(ctx, config = {})
  @ctx = ctx
  @config = DEFAULTS.merge(config)
end
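
The constructor merges the user-supplied options over `DEFAULTS`, so unspecified keys keep their default values. A small standalone model of that merge (the model name is hypothetical):

```ruby
# Standalone model of the constructor's config merge.
DEFAULTS = { retention_window: 8, model: nil }.freeze

config = DEFAULTS.merge(model: "summarizer", token_threshold: "90%")
config[:retention_window]  # => 8 (default preserved)
config[:model]             # => "summarizer" (user value wins)
```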

Instance Attribute Details

#config ⇒ Hash (readonly)

Returns:

  • (Hash)


# File 'lib/llm/compactor.rb', line 25

def config
  @config
end

Instance Method Details

#compact!(prompt = nil) ⇒ LLM::Message?

Summarize older messages and replace them with a compact summary.

Parameters:

  • prompt (Object) (defaults to: nil)

    The next prompt or turn input

Returns:

  • (LLM::Message, nil)



# File 'lib/llm/compactor.rb', line 69

def compact!(prompt = nil)
  return nil if ctx.functions.any? || [*prompt].grep(LLM::Function::Return).any?
  messages = ctx.messages.reject(&:system?)
  retention_window = [config[:retention_window], messages.size].min
  return nil unless messages.size > retention_window
  stream = ctx.params[:stream]
  stream.on_compaction(ctx, self) if LLM::Stream === stream
  recent = retained_messages
  older = messages[0...(messages.size - recent.size)]
  summary = LLM::Message.new(ctx.llm.user_role, "[Previous conversation summary]\n\n#{summarize(older)}", {compaction: true})
  ctx.messages.replace([*ctx.messages.take_while(&:system?), summary, *recent])
  ctx.compacted = true
  stream.on_compaction_finish(ctx, self) if LLM::Stream === stream
  summary
end
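
The retention-window arithmetic above can be modeled on plain arrays; the message strings below are placeholders for real conversation messages:

```ruby
# Standalone model of compact!'s message partitioning.
messages = (1..12).map { |i| "msg-#{i}" }   # non-system messages, oldest first
retention_window = [8, messages.size].min   # never retain more than exist

recent = messages.last(retention_window)              # kept verbatim
older  = messages[0...(messages.size - recent.size)]  # summarized away
summary = "[Previous conversation summary]\n\n..."    # placeholder summary text

compacted = [summary, *recent]
compacted.size  # => 9 (one summary message plus eight retained messages)
```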

#compactable?(prompt = nil) ⇒ Boolean Also known as: compact?

Returns true when the context should be compacted.

When `token_threshold:` is a percentage string such as `"90%"`, the threshold is resolved against the current context window and compared to the current total token usage.

Parameters:

  • prompt (Object) (defaults to: nil)

    The next prompt or turn input

Returns:

  • (Boolean)


# File 'lib/llm/compactor.rb', line 55

def compactable?(prompt = nil)
  return false if ctx.functions.any? || [*prompt].grep(LLM::Function::Return).any?
  messages = ctx.messages.reject(&:system?)
  return true if config[:message_threshold] && messages.size > config[:message_threshold]
  return true if token_threshold and ctx.usage.total_tokens > token_threshold
  false
end
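
The threshold checks can be sketched as a pure function; the counts and thresholds below are illustrative numbers, not defaults:

```ruby
# Standalone model of compactable?'s threshold logic: either enabled
# threshold being exceeded makes the context compactable.
def compactable?(message_count:, total_tokens:, message_threshold: nil, token_threshold: nil)
  return true if message_threshold && message_count > message_threshold
  return true if token_threshold && total_tokens > token_threshold
  false
end

compactable?(message_count: 12, total_tokens: 120_000,
             message_threshold: 30, token_threshold: 115_200)  # => true
compactable?(message_count: 12, total_tokens: 50_000,
             message_threshold: 30, token_threshold: 115_200)  # => false
```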