Class: Rubino::Context::Compressor

Inherits:
Object
Defined in:
lib/rubino/context/compressor.rb

Overview

Orchestrates context compaction: flush memory, split messages into head/middle/tail, generate summary, create child session.
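The head/middle/tail split mentioned above can be pictured with a minimal sketch. The `split_messages` helper and its `head_size`/`tail_size` parameters are hypothetical; they are not the MessageBoundary API, whose actual boundary rules are not shown on this page.

```ruby
# Hypothetical helper: keep the first head_size and the last tail_size
# messages verbatim; everything in between is the "middle" to be summarized.
def split_messages(messages, head_size:, tail_size:)
  # Too short to have a middle: keep everything in the head.
  return [messages.dup, [], []] if messages.size <= head_size + tail_size

  head   = messages.first(head_size)
  tail   = messages.last(tail_size)
  middle = messages[head_size...(messages.size - tail_size)]
  [head, middle, tail]
end

messages = (1..10).map { |i| { role: i.odd? ? "user" : "assistant", content: "msg #{i}" } }
head, middle, tail = split_messages(messages, head_size: 2, tail_size: 3)

puts "head=#{head.size} middle=#{middle.size} tail=#{tail.size}"
```

Only the middle would be replaced by a summary in such a scheme, so an empty middle means there is nothing to compact.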

Constant Summary

INEFFECTIVE_SAVINGS_PCT = 0.10

Anti-thrashing back-off (#415a, ported from Hermes context_compressor.py should_compress). A session hovering right at the threshold re-pays for a summary call every turn even though each pass only shaves off a message or two. If the two most recent compactions in this session’s lineage each saved less than INEFFECTIVE_SAVINGS_PCT of their original tokens, auto-compaction is skipped until genuinely new work pushes savings back up (the user can still force /compact). #thrashing? returns true when compaction should be skipped.

INEFFECTIVE_STREAK = 2
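As a rough illustration of the savings test, the sketch below checks one compaction record against the 10% floor. The row shape (`original_tokens`, `saved_tokens`) and the handling of an empty original are assumptions made for the example; the private `ineffective?` helper used by the class is not shown on this page.

```ruby
INEFFECTIVE_SAVINGS_PCT = 0.10

# Hypothetical row shape: { original_tokens: Integer, saved_tokens: Integer }.
def ineffective?(row)
  original = row[:original_tokens].to_f
  # Assumption: a compaction with nothing to save from counts as ineffective.
  return true if original <= 0

  (row[:saved_tokens] / original) < INEFFECTIVE_SAVINGS_PCT
end

puts ineffective?({ original_tokens: 10_000, saved_tokens: 400 })   # 4% saved
puts ineffective?({ original_tokens: 10_000, saved_tokens: 3_000 }) # 30% saved
```

The comparison is strict, so a pass that saves exactly 10% is treated as effective in this sketch.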

Constructor Details

#initialize(session_id:, config: nil, db: nil) ⇒ Compressor

Returns a new instance of Compressor.



# File 'lib/rubino/context/compressor.rb', line 10

def initialize(session_id:, config: nil, db: nil)
  @session_id = session_id
  @config = config || Rubino.configuration
  @db = db || Rubino.database.db
  @message_store = Session::Store.new(db: @db)
  @session_repo = Session::Repository.new(db: @db)
  @summary_store = Session::SummaryStore.new(db: @db)
end

Instance Method Details

#compact! ⇒ Object

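Performs a full compaction and returns a metadata hash. Returns a no-op result instead when the session has too few messages, is below the token threshold, or has an empty middle segment.

Raises:

  • (CompactionError) when no session matches the given session_id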



# File 'lib/rubino/context/compressor.rb', line 38

def compact!
  session = @session_repo.find(@session_id)
  raise CompactionError, "Session not found: #{@session_id}" unless session

  # Resolve a SHORT id to the FULL session id before any message lookup
  # (#352): #find prefix-matches "5aebd8ce" to the row, but
  # `for_session(short_id)` matches messages by EXACT session_id and so
  # returned 0 rows — compaction then short-circuited to no_op_result and
  # the CLI printed a fake "compacted · saved 0 tok" success. Pin every
  # downstream lookup (messages, summaries, lineage) to the resolved id.
  @session_id = session[:id]
  @session_model = session[:model]

  messages = @message_store.for_session(@session_id)
  return no_op_result(:too_few_messages) if messages.size < minimum_messages

  # Below the token threshold a summary call COSTS more than it saves: the
  # generated summary (budgeted up to compression.max_summary_tokens) is
  # routinely larger than the handful of middle messages it replaces, so a
  # forced compaction on a small session GROWS context instead of shrinking
  # it (the QA "/compact on a 872-tok session → 2736 tok" bug). Both paths
  # must clear the SAME gate the auto path uses (TokenBudget#needs_compaction?)
  # — without it the manual path silently summarized, inflated, AND forked.
  # Industry norm (Claude Code / Codex): /compact below threshold is a no-op.
  return no_op_result(:below_threshold) unless needs_compaction?(messages)

  # 1. Flush memory before compaction
  flush_memory!

  # 2. Split messages into head / middle / tail
  boundary = MessageBoundary.new(messages: messages, config: @config)
  head = boundary.head
  middle = boundary.middle
  tail = boundary.tail

  return no_op_result if middle.empty?

  # 3. Sanitize tool pairs in middle
  if @config.compression_preserve_tool_pairs?
    sanitizer = ToolPairSanitizer.new
    middle = sanitizer.sanitize(middle)
  end

  # saved_tokens reports what leaves the LIVE transcript, so measure it
  # on the pre-prune middle (the pruned copy feeds only the summarizer).
  middle_tokens = estimate_tokens(middle)

  # 3b. Cheap LLM-free pre-pass (#415d): dedupe + summarize old tool
  # results in the middle BEFORE the paid summary call, so raw tool
  # noise (file reads, terminal dumps) doesn't inflate the summarizer
  # prompt. The middle is summarized then discarded, so a lossy
  # representation here is safe.
  middle = ToolResultPruner.new.prune(middle)

  # 4. Load previous summary (capture id now, before the insert below
  #    overwrites "latest" — the lineage link must point at the prior row)
  previous = @summary_store.latest(@session_id)
  previous_summary = previous&.dig(:content)
  previous_summary_id = previous&.dig(:id)

  # 5. Generate new summary
  summary_builder = SummaryBuilder.new(session_id: @session_id)
  new_summary = summary_builder.build(
    messages: middle,
    previous_summary: previous_summary
  )

  # Steps 6-8 are the irreversible state mutation; commit them atomically.
  summary_id, child_session = commit_compaction!(
    session: session, head: head, tail: tail, messages: messages,
    new_summary: new_summary, previous_summary_id: previous_summary_id
  )

  {
    source_session_id: @session_id,
    target_session_id: child_session[:id],
    original_messages: messages.size,
    compacted_messages: head.size + tail.size + 1, # +1 for summary
    saved_tokens: middle_tokens,
    summary_id: summary_id
  }
end
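The two early-return gates at the top of #compact! can be sketched in isolation. Everything named here is a hypothetical stand-in: `MIN_MESSAGES`, `TOKEN_THRESHOLD`, `rough_token_count`, and `skip_reason` are illustrative only, and the four-characters-per-token heuristic is an assumption rather than the estimator the class uses.

```ruby
# Hypothetical stand-ins for the configured minimum message count and the
# token threshold; the real values come from configuration.
MIN_MESSAGES = 4
TOKEN_THRESHOLD = 2_000

# Crude estimate (assumption): roughly four characters per token.
def rough_token_count(messages)
  messages.sum { |m| (m[:content].length / 4.0).ceil }
end

# Returns the reason compaction is skipped, or nil when it should proceed.
def skip_reason(messages)
  return :too_few_messages if messages.size < MIN_MESSAGES
  return :below_threshold if rough_token_count(messages) < TOKEN_THRESHOLD

  nil
end

small = Array.new(6) { { content: "short message" } }
large = Array.new(6) { { content: "x" * 2_000 } }

p skip_reason(small.first(2)) # too few messages
p skip_reason(small)          # enough messages, but below the threshold
p skip_reason(large)          # nil: compaction would proceed
```

Checking the cheap message-count gate first avoids estimating tokens for sessions that could never be compacted anyway.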

#thrashing? ⇒ Boolean

Returns:

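  • (Boolean) true when each of the INEFFECTIVE_STREAK most recent compactions in the lineage was ineffective, meaning auto-compaction should be skipped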


# File 'lib/rubino/context/compressor.rb', line 30

def thrashing?
  rows = recent_lineage_compactions(INEFFECTIVE_STREAK)
  return false if rows.size < INEFFECTIVE_STREAK

  rows.all? { |r| ineffective?(r) }
end
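The size guard in #thrashing? matters because `all?` on an empty collection returns true in Ruby. A minimal standalone sketch of the same streak logic follows; `streak_thrashing?`, the injected predicate, and the `saved_pct` field are hypothetical names used only for illustration.

```ruby
INEFFECTIVE_STREAK = 2

# rows: compaction records, most recent first (assumed ordering).
# ineffective: any callable that decides whether a single row saved too little.
def streak_thrashing?(rows, ineffective:)
  recent = rows.first(INEFFECTIVE_STREAK)
  # Without this guard, a session with no history would count as thrashing,
  # because [].all? is true.
  return false if recent.size < INEFFECTIVE_STREAK

  recent.all? { |row| ineffective.call(row) }
end

low_savings = ->(row) { row[:saved_pct] < 0.10 }

p streak_thrashing?([], ineffective: low_savings)
p streak_thrashing?([{ saved_pct: 0.02 }, { saved_pct: 0.05 }], ineffective: low_savings)
p streak_thrashing?([{ saved_pct: 0.02 }, { saved_pct: 0.40 }], ineffective: low_savings)
```

A single effective pass inside the window breaks the streak, so auto-compaction resumes as soon as one recent compaction saved enough.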