Class: LLM::Compactor
- Inherits: Object
- Defined in: lib/llm/compactor.rb
Overview
LLM::Compactor summarizes older context messages into a smaller replacement message when a context grows too large.
This work is directly inspired by the compaction approach developed by General Intelligence Systems in [Brute](github.com/general-intelligence-systems/brute).
The compactor can also use a different model from the main context by setting `model:` in the compactor config. Compaction thresholds are opt-in: provide `message_threshold:` and/or `token_threshold:` to enable policy-driven compaction. `token_threshold:` accepts either an integer token count or a percentage string like `"90%"`, which resolves against the current model's context window.
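As a concrete illustration of the percentage form, here is a standalone sketch of how a `token_threshold:` value could resolve against a model's context window. The helper name `resolve_token_threshold` and the window size are illustrative assumptions, not part of the library's API:

```ruby
# Hypothetical helper: turn a token_threshold setting into a token count.
# A percentage string like "90%" resolves against the model's context
# window; an integer threshold is used as-is.
def resolve_token_threshold(threshold, context_window)
  if threshold.is_a?(String) && threshold.end_with?("%")
    (threshold.to_i / 100.0 * context_window).round
  else
    Integer(threshold)
  end
end

resolve_token_threshold("90%", 128_000)   # => 115200
resolve_token_threshold(50_000, 128_000)  # => 50000
```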
Constant Summary collapse
- DEFAULTS =
{ retention_window: 8, model: nil }.freeze
Instance Attribute Summary collapse
- #config ⇒ Hash readonly
Instance Method Summary collapse
-
#compact!(prompt = nil) ⇒ LLM::Message?
Summarize older messages and replace them with a compact summary.
-
#compactable?(prompt = nil) ⇒ Boolean
(also: #compact?)
Returns true when the context should be compacted.
-
#initialize(ctx, config = {}) ⇒ Compactor
constructor
A new instance of Compactor.
Constructor Details
Instance Attribute Details
#config ⇒ Hash (readonly)
# File 'lib/llm/compactor.rb', line 25

def config
  @config
end
Instance Method Details
#compact!(prompt = nil) ⇒ LLM::Message?
Summarize older messages and replace them with a compact summary.
# File 'lib/llm/compactor.rb', line 69

def compact!(prompt = nil)
  return nil if ctx.functions.any? || [*prompt].grep(LLM::Function::Return).any?
  messages = ctx.messages.reject(&:system?)
  retention_window = [config[:retention_window], messages.size].min
  return nil unless messages.size > retention_window
  stream = ctx.params[:stream]
  stream.on_compaction(ctx, self) if LLM::Stream === stream
  recent = messages.last(retention_window)
  older = messages[0...(messages.size - recent.size)]
  summary = LLM::Message.new(ctx.llm.user_role, "[Previous conversation summary]\n\n#{summarize(older)}", {compaction: true})
  ctx.messages.replace([*ctx.messages.take_while(&:system?), summary, *recent])
  ctx.compacted = true
  stream.on_compaction_finish(ctx, self) if LLM::Stream === stream
  summary
end
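The retention-window split at the heart of `compact!` can be sketched standalone. `Message` and `split_for_compaction` below are stand-ins for illustration, not the library's classes:

```ruby
# Keep the most recent `retention_window` non-system messages; everything
# older becomes input to the summary. Stand-in types, illustration only.
Message = Struct.new(:role, :content)

def split_for_compaction(messages, retention_window)
  window = [retention_window, messages.size].min
  recent = messages.last(window)
  older  = messages[0...(messages.size - recent.size)]
  [older, recent]
end

messages = (1..10).map { |i| Message.new(:user, "msg #{i}") }
older, recent = split_for_compaction(messages, 8)
older.map(&:content)  # => ["msg 1", "msg 2"]
recent.size           # => 8
```

With the default `retention_window: 8`, a ten-message history keeps the last eight messages verbatim and summarizes the first two.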
#compactable?(prompt = nil) ⇒ Boolean Also known as: compact?
Returns true when the context should be compacted.
When `token_threshold:` is a percentage string such as `"90%"`, the threshold is resolved against the current context window and compared to the current total token usage.
# File 'lib/llm/compactor.rb', line 55

def compactable?(prompt = nil)
  return false if ctx.functions.any? || [*prompt].grep(LLM::Function::Return).any?
  messages = ctx.messages.reject(&:system?)
  return true if config[:message_threshold] && messages.size > config[:message_threshold]
  return true if token_threshold and ctx.usage.total_tokens > token_threshold
  false
end
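Taken together, the opt-in thresholds amount to a simple either/or policy. A minimal standalone sketch, assuming the token threshold has already been resolved to an integer and using a plain Hash in place of the compactor config (names and numbers are illustrative):

```ruby
# Compaction triggers when either configured threshold is exceeded;
# with no thresholds configured, it never triggers.
def threshold_exceeded?(config, message_count, total_tokens)
  return true if config[:message_threshold] && message_count > config[:message_threshold]
  return true if config[:token_threshold] && total_tokens > config[:token_threshold]
  false
end

threshold_exceeded?({message_threshold: 50}, 51, 0)         # => true
threshold_exceeded?({token_threshold: 115_200}, 10, 90_000) # => false
threshold_exceeded?({}, 1_000, 1_000_000)                   # => false (opt-in)
```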