Class: LLM::Compactor
- Inherits: Object
- Defined in: lib/llm/compactor.rb
Overview
LLM::Compactor summarizes older context messages into a smaller replacement message when a context grows too large.
This work is directly inspired by the compaction approach developed by General Intelligence Systems in [Brute](github.com/general-intelligence-systems/brute).
The compactor can also use a different model from the main context by setting `model:` in the compactor config. By default, `token_threshold` is 10% less than the current context window, or `100_000` when the context window is unknown. Set `message_threshold:` or `token_threshold:` to `nil` to disable that constraint.
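The default-threshold rule described above can be sketched in plain Ruby. This is a standalone illustration of the rule, not library code; `default_token_threshold` is a hypothetical helper name:

```ruby
# Sketch of the default token_threshold rule: 10% less than the
# context window, or 100_000 when the window is unknown.
def default_token_threshold(context_window)
  return 100_000 if context_window.nil?
  (context_window * 0.9).to_i
end

default_token_threshold(nil)      # => 100_000
default_token_threshold(200_000)  # => 180_000
```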
Constant Summary collapse
- DEFAULT_TOKEN_THRESHOLD = 100_000
- DEFAULTS = { message_threshold: 200, retention_window: 8, model: nil }.freeze
Instance Attribute Summary collapse
- #config ⇒ Hash readonly
Instance Method Summary collapse
-
#compact!(prompt = nil) ⇒ LLM::Message?
Summarize older messages and replace them with a compact summary.
-
#compact?(prompt = nil) ⇒ Boolean
Returns true when the context should be compacted.
-
#initialize(ctx, **config) ⇒ Compactor
constructor
A new instance of Compactor.
Instance Attribute Details
#config ⇒ Hash (readonly)
```ruby
# File 'lib/llm/compactor.rb', line 26

def config
  @config
end
```
Instance Method Details
#compact!(prompt = nil) ⇒ LLM::Message?
Summarize older messages and replace them with a compact summary.
```ruby
# File 'lib/llm/compactor.rb', line 65

def compact!(prompt = nil)
  return nil if ctx.functions.any? || [*prompt].grep(LLM::Function::Return).any?
  messages = ctx.messages.reject(&:system?)
  retention_window = [config[:retention_window], messages.size].min
  return nil unless messages.size > retention_window
  stream = ctx.params[:stream]
  stream.on_compaction(ctx, self) if LLM::Stream === stream
  recent = messages.last(retention_window)
  older = messages[0...(messages.size - recent.size)]
  summary = LLM::Message.new(ctx.llm.user_role, "[Previous conversation summary]\n\n#{summarize(older)}")
  ctx.messages.replace([*ctx.messages.take_while(&:system?), summary, *recent])
  stream.on_compaction_finish(ctx, self) if LLM::Stream === stream
  summary
end
```
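The replacement step can be illustrated with a self-contained sketch: keep leading system messages, fold everything older than the retention window into one summary message, and keep the most recent messages verbatim. The `Message` struct and `compact` helper below are illustrative stand-ins, and the LLM-generated summary is stubbed out as a simple join:

```ruby
# Minimal stand-in for a context message.
Message = Struct.new(:role, :content) do
  def system?
    role == :system
  end
end

def compact(messages, retention_window:)
  system = messages.take_while(&:system?)
  rest = messages.reject(&:system?)
  return messages unless rest.size > retention_window
  recent = rest.last(retention_window)
  older = rest[0...(rest.size - recent.size)]
  # Stand-in for the LLM-generated summary: just join the older contents.
  summary = Message.new(:user, "[Previous conversation summary]\n\n#{older.map(&:content).join("\n")}")
  [*system, summary, *recent]
end

msgs = [Message.new(:system, "sys")] + (1..10).map { |i| Message.new(:user, "m#{i}") }
compacted = compact(msgs, retention_window: 3)
# 11 messages shrink to 5: system message, summary, then m8..m10 verbatim
```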
#compact?(prompt = nil) ⇒ Boolean
Returns true when the context should be compacted.
```ruby
# File 'lib/llm/compactor.rb', line 51

def compact?(prompt = nil)
  return false if ctx.functions.any? || [*prompt].grep(LLM::Function::Return).any?
  messages = ctx.messages.reject(&:system?)
  return true if config[:message_threshold] && messages.size > config[:message_threshold]
  usage = ctx.usage
  return true if config[:token_threshold] && usage && usage.total_tokens > config[:token_threshold]
  false
end
```
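The two checks reduce to a message-count threshold and a token-usage threshold, either of which triggers compaction, and a `nil` threshold disables its check. A plain-Ruby sketch of that decision (`compact_needed?` is an illustrative name, not part of the library's API):

```ruby
# Either threshold being exceeded triggers compaction; a nil threshold
# disables that check, and unknown token usage skips the token check.
def compact_needed?(message_count, total_tokens,
                    message_threshold: 200, token_threshold: 100_000)
  return true if message_threshold && message_count > message_threshold
  return true if token_threshold && total_tokens && total_tokens > token_threshold
  false
end

compact_needed?(10, 120_000)  # => true  (tokens over threshold)
compact_needed?(250, nil)     # => true  (messages over threshold)
compact_needed?(10, 50_000)   # => false
```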