Class: Clacky::MessageCompressor
- Inherits: Object
  - Object
    - Clacky::MessageCompressor
- Defined in: lib/clacky/agent/message_compressor.rb
Overview
Message compressor using the Insert-then-Compress strategy.
New Strategy: Instead of making a separate API call for compression, we insert a compression instruction into the current conversation flow. This lets us reuse the existing cache (system prompt + tools) and pay only for processing the new compression instruction.
Flow:
- Agent detects compression threshold is reached
- Compressor builds a compression instruction message
- Agent inserts this message and calls LLM (with cache reuse!)
- LLM returns compressed summary
- Compressor rebuilds message list: system + summary + recent messages
- Agent continues with new message list (cache will rebuild from here)
Benefits:
- The compression call reuses the existing cache (huge token savings)
- Only one cache rebuild after compression (vs. two with the old approach)
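The flow above can be sketched end to end. This is a minimal, self-contained illustration under stated assumptions, not Clacky's implementation: `fake_llm`, `threshold`, `keep_recent`, and the shortened instruction text are all hypothetical stand-ins invented for the example.

```ruby
# Hypothetical stand-ins (not part of Clacky).
threshold = 6    # assumed: compress once the history exceeds this many messages
keep_recent = 2  # assumed: number of trailing messages kept verbatim

# Stand-in for the LLM call; a real agent would send `msgs` to the model.
fake_llm = ->(msgs) { "<summary>Condensed #{msgs.size} messages.</summary>" }

messages = [{ role: "system", content: "You are a coding agent." }]
6.times { |i| messages << { role: i.even? ? "user" : "assistant", content: "turn #{i}" } }

if messages.size > threshold
  recent = messages.last(keep_recent)

  # Insert the instruction into the existing conversation rather than opening
  # a separate request, so the cached prefix (system prompt + tools) still applies.
  instruction = { role: "user", content: "Summarize the conversation above.", system_injected: true }
  summary = fake_llm.call(messages + [instruction])

  # Rebuild: system + summary + recent messages.
  system_msg = messages.find { |m| m[:role] == "system" }
  messages = [system_msg, { role: "assistant", content: summary, compressed_summary: true }, *recent]
end

p messages.map { |m| m[:role] }
# => ["system", "assistant", "user", "assistant"]
```

The rebuilt list starts a new cache prefix, which is why only one rebuild is needed after compression.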
Constant Summary
- COMPRESSION_PROMPT =
<<~PROMPT.freeze
  ═══════════════════════════════════════════════════════════════
  CRITICAL: TASK CHANGE - MEMORY COMPRESSION MODE
  ═══════════════════════════════════════════════════════════════

  The conversation above has ENDED. You are now in MEMORY COMPRESSION MODE.

  CRITICAL INSTRUCTIONS - READ CAREFULLY:
  1. This is NOT a continuation of the conversation
  2. DO NOT respond to any requests in the conversation above
  3. DO NOT call ANY tools or functions
  4. DO NOT use tool_calls in your response
  5. Your response MUST be PURE TEXT ONLY

  YOUR ONLY TASK: Create a comprehensive summary of the conversation above.

  REQUIRED RESPONSE FORMAT:
  Your response MUST start with <analysis> or <summary> tags. No other format is acceptable.

  Follow the detailed compression prompt structure provided earlier. Focus on:
  - User's explicit requests and intents
  - Key technical concepts and code changes
  - Files examined and modified
  - Errors encountered and fixes applied
  - Current work status and pending tasks

  Begin your summary NOW. Remember: PURE TEXT response only, starting with <analysis> or <summary> tags.
PROMPT
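Since the prompt requires the reply to open with `<analysis>` or `<summary>`, a caller might want to check that before accepting a summary. The documented code does not show such a check; `summary_format_ok?` is a hypothetical helper written only for illustration.

```ruby
# Hypothetical helper, not part of Clacky: returns true when a compression
# response opens with one of the two tags the prompt asks for.
def summary_format_ok?(text)
  text.to_s.lstrip.start_with?("<analysis>", "<summary>")
end

p summary_format_ok?("<summary>User asked for a refactor.</summary>") # => true
p summary_format_ok?("Sure, here is the summary: ...")                # => false
```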
Instance Method Summary
- #build_compression_message(messages, recent_messages: []) ⇒ Hash
  Generates the compression instruction message to be inserted into the conversation. This enables cache reuse by using the same API call with tools.
- #initialize(client, model: nil) ⇒ MessageCompressor (constructor)
  A new instance of MessageCompressor.
- #parse_compressed_result(result, chunk_path: nil) ⇒ Object
- #rebuild_with_compression(compressed_content, original_messages:, recent_messages:, chunk_path: nil) ⇒ Array<Hash>
  Parses the LLM response and rebuilds the message list with compression.
Constructor Details
#initialize(client, model: nil) ⇒ MessageCompressor
Returns a new instance of MessageCompressor.
# File 'lib/clacky/agent/message_compressor.rb', line 53

def initialize(client, model: nil)
  @client = client
  @model = model
end
Instance Method Details
#build_compression_message(messages, recent_messages: []) ⇒ Hash
Generates the compression instruction message to be inserted into the conversation. This enables cache reuse by using the same API call with tools. Returns nil when there is nothing to compress.
SIMPLIFIED APPROACH:
- Don't duplicate conversation history in the compression message
- The LLM can already see all messages, so just ask it to compress them
- Keep the instruction small for better cache efficiency
# File 'lib/clacky/agent/message_compressor.rb', line 69

def build_compression_message(messages, recent_messages: [])
  # Get messages to compress (exclude system message and recent messages)
  messages_to_compress = messages.reject { |m| m[:role] == "system" || recent_messages.include?(m) }

  # If nothing to compress, return nil
  return nil if messages_to_compress.empty?

  # Simple compression instruction - LLM can see the history already
  {
    role: "user",
    content: COMPRESSION_PROMPT,
    system_injected: true
  }
end
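To illustrate which messages count as compressible, here is a minimal stand-alone sketch of the same filter. `compressible` is a hypothetical helper that mirrors the `reject` call in the listing; it is not part of the library, and the sample history is made up.

```ruby
# Hypothetical helper mirroring the filter in build_compression_message.
def compressible(messages, recent_messages: [])
  messages.reject { |m| m[:role] == "system" || recent_messages.include?(m) }
end

history = [
  { role: "system",    content: "You are a coding agent." },
  { role: "user",      content: "Rename the Foo class." },
  { role: "assistant", content: "Done." },
  { role: "user",      content: "Now run the tests." }
]
recent = history.last(1)

p compressible(history, recent_messages: recent).map { |m| m[:content] }
# => ["Rename the Foo class.", "Done."]

# With only a system message and recent messages there is nothing left to
# compress, which is the case where build_compression_message returns nil.
p compressible([history.first, *recent], recent_messages: recent).empty?
# => true
```

Note that `Array#include?` compares hashes by value, so an older message whose contents are identical to a recent one would also be excluded from compression.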
#parse_compressed_result(result, chunk_path: nil) ⇒ Object
Wraps the LLM's compressed result in a single assistant message, appending an archive anchor when chunk_path is given. Returns an empty array when the result is blank.
# File 'lib/clacky/agent/message_compressor.rb', line 112

def parse_compressed_result(result, chunk_path: nil)
  # Return the compressed result as a single assistant message
  # Keep the <analysis> or <summary> tags as they provide semantic context
  content = result.to_s.strip

  if content.empty?
    []
  else
    # Inject chunk anchor so AI knows where to find original conversation
    if chunk_path
      anchor = "\n\n---\n📁 **Original conversation archived at:** `#{chunk_path}`\n" \
               "_Use `file_reader` tool to recall details from this chunk._"
      content = content + anchor
    end
    [{ role: "assistant", content: content, compressed_summary: true, chunk_path: chunk_path }]
  end
end
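The `compressed_summary` and `chunk_path` keys make the summary easy to locate later in the message list. A minimal sketch, assuming a list shaped like the hashes returned above; `archived_chunk_path` is a hypothetical helper and the path is made up for the example.

```ruby
# Hypothetical helper: finds the archive path recorded on the compressed
# summary message, or nil when the list contains no such message.
def archived_chunk_path(messages)
  summary = messages.find { |m| m[:compressed_summary] }
  summary && summary[:chunk_path]
end

messages = [
  { role: "system", content: "You are a coding agent." },
  { role: "assistant", content: "<summary>...</summary>", compressed_summary: true,
    chunk_path: "/tmp/example/chunk-1.md" }, # made-up path
  { role: "user", content: "Continue." }
]

p archived_chunk_path(messages)          # => "/tmp/example/chunk-1.md"
p archived_chunk_path(messages.first(1)) # => nil
```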
#rebuild_with_compression(compressed_content, original_messages:, recent_messages:, chunk_path: nil) ⇒ Array<Hash>
Parses the LLM response and rebuilds the message list with compression. Raises an error if the compressed result is empty.
# File 'lib/clacky/agent/message_compressor.rb', line 90

def rebuild_with_compression(compressed_content, original_messages:, recent_messages:, chunk_path: nil)
  # Find and preserve system message
  system_msg = original_messages.find { |m| m[:role] == "system" }

  # Parse the compressed result
  compressed_messages = parse_compressed_result(compressed_content, chunk_path: chunk_path)

  # If parsing fails or returns empty, raise error
  if compressed_messages.nil? || compressed_messages.empty?
    raise "LLM compression failed: unable to parse compressed messages"
  end

  # Return system message + compressed messages + recent messages.
  # Strip any system messages from recent_messages as a safety net;
  # get_recent_messages_with_tool_pairs already excludes them, but this
  # guard ensures we never end up with duplicate system prompts even if
  # the caller passes an unfiltered list.
  safe_recent = recent_messages.reject { |m| m[:role] == "system" }
  [system_msg, *compressed_messages, *safe_recent].compact
end
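The ordering and the system-message guard can be seen in isolation with a small stand-alone sketch. `assemble` is a hypothetical helper that mirrors only the final assembly step; it takes an already-parsed summary and is not the library method.

```ruby
# Hypothetical helper mirroring the last step of rebuild_with_compression.
def assemble(original_messages, summary_messages, recent_messages)
  system_msg = original_messages.find { |m| m[:role] == "system" }
  safe_recent = recent_messages.reject { |m| m[:role] == "system" }
  # compact drops the nil left behind when there was no system message.
  [system_msg, *summary_messages, *safe_recent].compact
end

system_msg = { role: "system", content: "You are a coding agent." }
summary = [{ role: "assistant", content: "<summary>...</summary>", compressed_summary: true }]
# An unfiltered recent list that wrongly includes the system message.
recent = [system_msg, { role: "user", content: "Now run the tests." }]

rebuilt = assemble([system_msg], summary, recent)
p rebuilt.map { |m| m[:role] }       # => ["system", "assistant", "user"]
p assemble([], summary, recent).size # => 2
```

Because the documented method calls `raise` with a plain string, an empty summary surfaces as a `RuntimeError`; a caller that prefers to continue with the uncompressed history could rescue it.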