Class: Rubino::Agent::DegenerateResponseRecovery

Inherits:
Object
  • Object
show all
Defined in:
lib/rubino/agent/degenerate_recovery.rb

Overview

The degenerate-response recovery ladder — a faithful, rung-by-rung port of the ‘if not agent._has_content_after_think_block(final_response):` block in the reference conversation loop.

This is the load-bearing machinery that cures MiniMax’s “completed but empty” / thinking-only responses: a structurally-valid text response whose visible content is empty once the <think> reasoning is stripped. Rather than surfacing that as a finished (empty) turn, the ladder walks seven rungs IN ORDER, each a cheaper-or-smarter recovery than giving up:

1. partial-stream recovery   — content already streamed to the user before
                               the turn went degenerate? Use it.
2. prior-turn content        — the previous turn already delivered a real
                               answer alongside HOUSEKEEPING tools? Reuse it.
3. post-tool empty nudge     — empty right after a tool round? Append a
                               user-level "continue" hint and re-issue.
4. thinking-only prefill ×2  — the model reasoned (<think>) but never spoke?
                               Re-issue the SAME request with an assistant
                               PREFILL seed so it continues into visible
                               text. THE key MiniMax cure.
5. empty-content retry ×3    — truly empty (no text, no reasoning)? Retry.
6. empty → fallback          — retries exhausted? Hand to FallbackChain.
                               NOT BUILT — Slice 7 seam, falls through.
7. terminal                  — still stuck? Raise EmptyModelResponseError.
                               (We DROP the reference "(empty)" sentinel-replay
                               machinery and raise,
                               so Run::Executor maps it to FAILED, never
                               completed-but-empty.)

OWNS the two per-turn counters the reference keeps on the agent — prefill attempts (≤2) and empty-content retries (≤3). A fresh instance is built per model call (per ModelCallRunner#call!), so the counters reset exactly where the reference resets them to 0 on a successful content turn.

The ladder needs a little turn state the bare AdapterResponse does not carry (what streamed before the drop, the prior assistant turn, whether a tool round just ran). That is threaded in via RecoveryState, NOT re-derived here.

Defined Under Namespace

Classes: Directive, RecoveryState

Constant Summary collapse

DEFAULT_PREFILL_MAX =
2
DEFAULT_EMPTY_MAX =
3
NUDGE_TEXT =

The user-level hint appended after an empty post-tool turn (rung 3), verbatim from the reference implementation.

"You just executed tool calls but returned an empty response. " \
"Please process the tool results above and continue with the task."

Instance Method Summary collapse

Constructor Details

#initialize(validator: ResponseValidator.new, ui: nil, prefill_max: DEFAULT_PREFILL_MAX, empty_max: DEFAULT_EMPTY_MAX) ⇒ DegenerateResponseRecovery

Returns a new instance of DegenerateResponseRecovery.



83
84
85
86
87
88
89
90
91
92
93
# File 'lib/rubino/agent/degenerate_recovery.rb', line 83

def initialize(validator: ResponseValidator.new, ui: nil,
               prefill_max: DEFAULT_PREFILL_MAX, empty_max: DEFAULT_EMPTY_MAX)
  @validator    = validator
  @ui           = ui
  @prefill_max  = prefill_max
  @empty_max    = empty_max
  @prefill_attempts = 0
  @empty_attempts   = 0
  # _post_tool_empty_retried — the nudge fires at most once per turn.
  @nudged = false
end

Instance Method Details

#recover(state) ⇒ Object

Walk the ladder for one degenerate response and return a Directive. Mirrors the reference conversation loop rung for rung, in order.



97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
# File 'lib/rubino/agent/degenerate_recovery.rb', line 97

def recover(state)
  # ── Rung 1: partial-stream recovery ──────────────────
  # If real content was streamed to the user before the turn came back
  # degenerate, deliver it instead of wasting calls on retries.
  if content_after_think?(state.streamed_text)
    note("↻ Stream interrupted — using delivered content as final response")
    return Directive.new(kind: :use, content: strip_think(state.streamed_text))
  end

  # ── Rung 2: prior-turn content fallback ──────────────
  # The previous turn already delivered a real answer alongside
  # HOUSEKEEPING-only tools; the model has nothing more to say. Reuse it
  # rather than retrying. Guarded on all-housekeeping so mid-task
  # narration ("I'll scan the directory…") falls through to the nudge.
  if state.prior_turn_content && state.prior_tools_all_housekeeping
    note("↻ Empty response after tool calls — using earlier content as final answer")
    return Directive.new(kind: :use, content: strip_think(state.prior_turn_content))
  end

  has_inline_thinking = inline_thinking?(state.response)

  # ── Rung 3: post-tool empty nudge ────────────────────
  # Empty right after a tool round (and NOT a thinking-only response —
  # that routes to prefill below). Append the empty assistant turn then a
  # user-level nudge so the sequence stays valid (tool → assistant →
  # user), and re-issue. Fires at most once per turn.
  if prior_was_tool?(state.messages) && !@nudged && !has_inline_thinking
    @nudged = true
    note("⚠️ Model returned empty after tool calls — nudging to continue")
    append_nudge!(state.messages, state.response)
    return Directive.new(kind: :nudge)
  end

  # ── Rung 4: thinking-only prefill-to-continue ×2 ─────
  # The model produced reasoning (structured thinking field OR inline
  # <think>) but no visible text. Re-issue the SAME request seeded with an
  # assistant PREFILL so the model continues from its own reasoning into
  # the visible answer. THE MiniMax cure.
  if has_structured?(state.response) && @prefill_attempts < @prefill_max
    @prefill_attempts += 1
    note("↻ Thinking-only response — prefilling to continue " \
         "(#{@prefill_attempts}/#{@prefill_max})")
    return Directive.new(kind: :prefill, seed: prefill_seed(state.response))
  end

  # ── Rung 5: empty-content retry ×3 ───────────────────
  # Truly empty (nothing usable once <think> is stripped), OR a reasoning
  # model that has now exhausted its prefill attempts. Plain retry.
  truly_empty       = strip_think(state.response.content).empty?
  prefill_exhausted = has_structured?(state.response) && @prefill_attempts >= @prefill_max
  if truly_empty && (!has_structured?(state.response) || prefill_exhausted) &&
     @empty_attempts < @empty_max
    @empty_attempts += 1
    note("⚠️ Empty response from model — retrying (#{@empty_attempts}/#{@empty_max})")
    return Directive.new(kind: :retry, attempt: @empty_attempts)
  end

  # ── Rung 6: empty → fallback ─────────────────────────
  # SLICE-7 seam. The reference here tries _try_activate_fallback() and, on a
  # successful switch, resets _empty_content_retries to 0 and continues on
  # the new provider. FallbackChain is not built yet (Slice 7), so there
  # is no provider to switch to — fall straight through to rung 7. When
  # FallbackChain lands, attempt the switch here and return :retry on
  # success (zeroing @empty_attempts).

  # ── Rung 7: terminal ─────────────────────────────────
  # Exhausted every rung. We DROP the reference "(empty)" sentinel-replay
  # machinery: the runner raises EmptyModelResponseError so the
  # run is marked FAILED, never completed-but-empty.
  Directive.new(kind: :raise)
end