Class: RobotLab::HistoryCompressor

Inherits:
Object
Defined in:
lib/robot_lab/history_compressor.rb

Overview

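Compresses a robot’s conversation history using term-frequency relevance scoring. The scoring is TF-IDF-style, but the implementation uses stemmed term frequencies without IDF weighting, because IDF on a topic-focused conversation would suppress the very terms that signal relevance.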

Old conversation turns are tiered against the most recent context:

  • High relevance (score >= keep_threshold) → kept verbatim

  • Medium relevance (drop_threshold <= score < keep_threshold) → summarized when a summarizer is given, otherwise dropped

  • Low relevance (score < drop_threshold) → dropped
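
The three tiers above can be sketched as a small decision function. This is only an illustration: tier_for is a hypothetical helper, not part of RobotLab, and the class's own private scoring method is not shown on this page.

```ruby
# Hypothetical helper (not part of RobotLab): maps a relevance score to a tier.
def tier_for(score, keep_threshold:, drop_threshold:, summarizer: nil)
  return :keep if score >= keep_threshold
  return :drop if score < drop_threshold

  # Medium tier: summarize only when a summarizer callable is available.
  summarizer ? :summarize : :drop
end

tiers = [0.75, 0.4, 0.1].map do |score|
  tier_for(score, keep_threshold: 0.6, drop_threshold: 0.2, summarizer: ->(text) { text })
end
# tiers is [:keep, :summarize, :drop]
```

Without a summarizer, the medium-tier score 0.4 would map to :drop instead.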

System messages and tool call/result messages are always preserved. The most recent user+assistant pairs (recent_turns of them) are also always kept.
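
As a rough illustration of the protected window, the snippet below splits a list of scorable message indices the same way the #call listing does. The index values are made up for the example.

```ruby
# Hypothetical scorable message indices, oldest first. System and tool
# messages are assumed to have been set aside as pinned already.
scorable_indices = [1, 2, 4, 5, 7, 8, 9, 10]
recent_turns     = 3

# Each turn is one user message plus one assistant message.
recent_count = recent_turns * 2

compressible = scorable_indices[0..-(recent_count + 1)] # older messages, eligible for scoring
recent       = scorable_indices[-recent_count..]        # protected window, always kept

# compressible is [1, 2]
# recent is [4, 5, 7, 8, 9, 10]
```

Only the two oldest messages are candidates for compression here; the last six are kept regardless of score.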

Requires the optional ‘classifier’ gem (~> 2.3).

Examples:

Basic usage from Robot

robot.compress_history(recent_turns: 3, keep_threshold: 0.6, drop_threshold: 0.2)

With LLM summarizer (separate robot)

summarizer_robot = RobotLab.build(name: "summarizer", system_prompt: "Summarize concisely.")
robot.compress_history(
  summarizer: ->(text) { summarizer_robot.run("One sentence: #{text}").reply }
)

Defined Under Namespace

Classes: SUMMARY_STRUCT

Constant Summary

MIN_SCORE_LENGTH = 20

Minimum text length (characters) to score; shorter messages are kept as-is.

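Instance Method Summary

  • #call ⇒ Array

    Execute compression and return the new message array.

  • #initialize(messages:, recent_turns:, keep_threshold:, drop_threshold:, summarizer:) ⇒ HistoryCompressor (constructor)

    Returns a new instance of HistoryCompressor.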

Constructor Details

#initialize(messages:, recent_turns:, keep_threshold:, drop_threshold:, summarizer:) ⇒ HistoryCompressor

Returns a new instance of HistoryCompressor.

Parameters:

  • messages (Array)

    full @chat.messages array

  • recent_turns (Integer)

    number of user+assistant turn pairs to protect

  • keep_threshold (Float)

    score >= this → keep verbatim

  • drop_threshold (Float)

    score < this → drop

  • summarizer (#call, nil)

    callable(text) -> String for medium-tier; nil means drop medium-tier

Raises:

  • (ArgumentError)

    if keep_threshold <= drop_threshold



# File 'lib/robot_lab/history_compressor.rb', line 46

def initialize(messages:, recent_turns:, keep_threshold:, drop_threshold:, summarizer:)
  if keep_threshold <= drop_threshold
    raise ArgumentError,
          "keep_threshold (#{keep_threshold}) must be greater than drop_threshold (#{drop_threshold})"
  end

  @messages        = messages
  @recent_turns    = recent_turns
  @keep_threshold  = keep_threshold
  @drop_threshold  = drop_threshold
  @summarizer      = summarizer
end

Instance Method Details

#call ⇒ Array

Execute compression and return the new message array.

Returns:

  • (Array)

    compressed message array



# File 'lib/robot_lab/history_compressor.rb', line 62

def call
  return @messages if @messages.empty?

  # Classify each message index as pinned (always keep) or scorable
  pinned_indices   = []
  scorable_indices = []

  @messages.each_with_index do |msg, idx|
    if pinned_message?(msg)
      pinned_indices << idx
    else
      scorable_indices << idx
    end
  end

  # Nothing scorable, or everything fits inside the recent window: return as-is
  return @messages if scorable_indices.empty?
  return @messages if scorable_indices.size <= @recent_turns * 2

  recent_count        = @recent_turns * 2
  compressible        = scorable_indices[0..-(recent_count + 1)]
  recent              = scorable_indices[-recent_count..]

  return @messages if compressible.nil? || compressible.empty?

  # Build reference vector from the recent window using stemmed term frequencies.
  # Term frequencies (no IDF) are used because IDF on a topic-focused corpus
  # would suppress the very terms that indicate relevance to that topic.
  recent_texts = recent.filter_map { |i| extract_text(@messages[i]) }
                       .reject { |t| t.strip.length < MIN_SCORE_LENGTH }

  # No meaningful recent text → cannot score; return unchanged
  return @messages if recent_texts.empty?

  TextAnalysis.require_classifier!

  recent_vectors = recent_texts.map { |t| TextAnalysis.l2_normalize(t.word_hash) }
  reference      = mean_vector(recent_vectors)

  # Decide action for each compressible message
  actions = {}
  compressible.each do |idx|
    actions[idx] = score_action(reference, @messages[idx])
  end

  # Reconstruct the message array in original order
  build_result(actions)
end
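
The reference-vector step can be sketched in plain Ruby. Every helper below is a hypothetical stand-in: the real code uses String#word_hash from the classifier gem (which also stems words) and TextAnalysis.l2_normalize, and the private mean_vector and score_action methods are not shown on this page. The sketch assumes a cosine-style similarity against the reference vector, which the listing does not confirm.

```ruby
# Hypothetical stand-in for word_hash: raw term counts, no stemming.
def term_frequencies(text)
  text.downcase.scan(/[a-z0-9]+/).tally.transform_values(&:to_f)
end

# Scale a sparse vector to unit length; an all-zero vector is returned unchanged.
def l2_normalize(vector)
  norm = Math.sqrt(vector.values.sum { |v| v * v })
  return vector if norm.zero?

  vector.transform_values { |v| v / norm }
end

# Average several sparse vectors term by term.
def mean_vector(vectors)
  sums = Hash.new(0.0)
  vectors.each { |vec| vec.each { |term, weight| sums[term] += weight } }
  sums.transform_values { |total| total / vectors.size }
end

# Cosine similarity between two sparse vectors (assumed scoring function).
def cosine(a, b)
  dot    = a.sum { |term, weight| weight * b.fetch(term, 0.0) }
  norm_a = Math.sqrt(a.values.sum { |v| v * v })
  norm_b = Math.sqrt(b.values.sum { |v| v * v })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

recent_texts = [
  "How do I configure the database connection pool?",
  "Set the pool size in the database configuration file."
]
reference = mean_vector(recent_texts.map { |t| l2_normalize(term_frequencies(t)) })

on_topic  = cosine(reference, l2_normalize(term_frequencies("The database pool timed out again")))
off_topic = cosine(reference, l2_normalize(term_frequencies("What should we order for lunch today?")))
```

An older message that shares terms with the recent window (on_topic) scores above one that shares none (off_topic), so the thresholds can separate them.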