Class: LLM::Compactor

Inherits:
Object
Defined in:
lib/llm/compactor.rb

Overview

LLM::Compactor summarizes older context messages into a smaller replacement message when a context grows too large.

This work is directly inspired by the compaction approach developed by General Intelligence Systems in [Brute](github.com/general-intelligence-systems/brute).

The compactor can also use a different model from the main context by setting `model:` in the compactor config. By default, `token_threshold` is 10% less than the current context window, or `100_000` when the context window is unknown. Set `message_threshold:` or `token_threshold:` to `nil` to disable the corresponding constraint.
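The default token threshold described above can be sketched in plain Ruby (the `context_window` parameter and `default_token_threshold` helper are hypothetical stand-ins for the library's internals):

```ruby
DEFAULT_TOKEN_THRESHOLD = 100_000

# Hypothetical sketch: 10% less than the context window when it is
# known, otherwise fall back to 100_000.
def default_token_threshold(context_window)
  if context_window
    (context_window * 0.9).to_i # 10% less than the context window
  else
    DEFAULT_TOKEN_THRESHOLD     # window unknown
  end
end

default_token_threshold(200_000) # => 180000
default_token_threshold(nil)     # => 100000
```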

Constant Summary collapse

DEFAULT_TOKEN_THRESHOLD =
100_000
DEFAULTS =
{
  message_threshold: 200,
  retention_window: 8,
  model: nil
}.freeze

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(ctx, **config) ⇒ Compactor

Returns a new instance of Compactor.

Parameters:

Options Hash (**config):

  • :token_threshold (Integer)

Defaults to 10% less than the current context window, or `100_000` when the context window is unknown. Set to `nil` to disable token-based compaction.

  • :message_threshold (Integer)

Set to `nil` to disable message-count-based compaction.

  • :retention_window (Integer)
  • :model (String, nil)

    The model to use for the summarization request. Defaults to the current context model.



# File 'lib/llm/compactor.rb', line 41

def initialize(ctx, **config)
  @ctx = ctx
  @config = DEFAULTS.merge(token_threshold: default_token_threshold).merge(config)
end
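The merge chain in the constructor means explicit keyword options win over both `DEFAULTS` and the computed token threshold. A minimal, self-contained illustration of that precedence (the threshold value and model name are stand-ins, not the library's computation):

```ruby
DEFAULTS = {
  message_threshold: 200,
  retention_window: 8,
  model: nil
}.freeze

# Stand-in for the computed default; see the overview above.
default_token_threshold = 100_000

# A caller disables message-count compaction and picks a model
# (hypothetical model name).
config = { message_threshold: nil, model: "gpt-4o-mini" }
merged = DEFAULTS.merge(token_threshold: default_token_threshold)
                 .merge(config)

merged[:message_threshold] # => nil (message-count compaction disabled)
merged[:token_threshold]   # => 100000 (computed default kept)
merged[:retention_window]  # => 8
```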

Instance Attribute Details

#configHash (readonly)

Returns:

  • (Hash)


# File 'lib/llm/compactor.rb', line 26

def config
  @config
end

Instance Method Details

#compact!(prompt = nil) ⇒ LLM::Message?

Summarize older messages and replace them with a compact summary.

Parameters:

  • prompt (Object) (defaults to: nil)

    The next prompt or turn input

Returns:

  • (LLM::Message, nil)

# File 'lib/llm/compactor.rb', line 65

def compact!(prompt = nil)
  return nil if ctx.functions.any? || [*prompt].grep(LLM::Function::Return).any?
  messages = ctx.messages.reject(&:system?)
  retention_window = [config[:retention_window], messages.size].min
  return nil unless messages.size > retention_window
  stream = ctx.params[:stream]
  stream.on_compaction(ctx, self) if LLM::Stream === stream
  recent = retained_messages
  older = messages[0...(messages.size - recent.size)]
  summary = LLM::Message.new(ctx.llm.user_role, "[Previous conversation summary]\n\n#{summarize(older)}")
  ctx.messages.replace([*ctx.messages.take_while(&:system?), summary, *recent])
  stream.on_compaction_finish(ctx, self) if LLM::Stream === stream
  summary
end
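The replacement step in `#compact!` can be sketched with plain hashes standing in for `LLM::Message` objects (everything here is a simplified stand-in, not the library's API): keep the leading system messages, collapse older messages into one summary entry, and retain the most recent ones.

```ruby
# Simplified sketch of the message-replacement logic.
def compact(messages, retention_window: 2)
  nonsystem = messages.reject { |m| m[:role] == :system }
  keep = [retention_window, nonsystem.size].min
  return nil unless nonsystem.size > keep
  recent = nonsystem.last(keep)
  older  = nonsystem[0...(nonsystem.size - keep)]
  summary = { role: :user,
              content: "[Previous conversation summary]\n\n(#{older.size} messages)" }
  # Preserve the leading system messages, then the summary, then
  # the retained recent messages.
  [*messages.take_while { |m| m[:role] == :system }, summary, *recent]
end

msgs = [{ role: :system, content: "sys" }] +
       (1..5).map { |i| { role: :user, content: "m#{i}" } }
compact(msgs, retention_window: 2).map { |m| m[:content] }
# => ["sys", "[Previous conversation summary]\n\n(3 messages)", "m4", "m5"]
```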

#compact?(prompt = nil) ⇒ Boolean

Returns true when the context should be compacted.

Parameters:

  • prompt (Object) (defaults to: nil)

    The next prompt or turn input

Returns:

  • (Boolean)


# File 'lib/llm/compactor.rb', line 51

def compact?(prompt = nil)
  return false if ctx.functions.any? || [*prompt].grep(LLM::Function::Return).any?
  messages = ctx.messages.reject(&:system?)
  return true if config[:message_threshold] && messages.size > config[:message_threshold]
  usage = ctx.usage
  return true if config[:token_threshold] && usage && usage.total_tokens > config[:token_threshold]
  false
end
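The two threshold checks in `#compact?` reduce to a small predicate. A self-contained sketch (message counts, token totals, and thresholds are plain integers here, not the library's context and usage objects):

```ruby
def compact?(message_count, total_tokens,
             message_threshold: 200, token_threshold: 100_000)
  # Either constraint can be disabled by passing nil for its threshold;
  # the token check is also skipped when usage is unknown (nil).
  return true if message_threshold && message_count > message_threshold
  return true if token_threshold && total_tokens &&
                 total_tokens > token_threshold
  false
end

compact?(201, 0)                            # => true  (message threshold hit)
compact?(10, 150_000)                       # => true  (token threshold hit)
compact?(10, 150_000, token_threshold: nil) # => false (token check disabled)
```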