Module: Octo::Agent::NextMessageSuggester

Included in: Octo::Agent

Defined in: lib/octo/agent/next_message_suggester.rb

Overview

Background “ghost text” prediction of the user’s next message.

Fired after each main-agent task completes. Generates one short phrase (the model’s best guess at what the user will type next) and pushes it to the UI via show_next_message_suggestion. The web UI renders it as the input box’s placeholder with Tab-to-accept; in terminal and IM UIs the call is a no-op by default.

Design notes:

- NOT a forked subagent. Subagents clone history, run a full
  think/act/observe loop, and trigger hooks — overkill for "generate
  one line." We make a single +Client#send_messages+ call directly.
- Async (own thread) and fire-and-forget. +Agent#show_complete+ must
  never block on the suggestion call.
- Uses the provider's lite model when available (Claude → Haiku,
  DeepSeek pro → flash, ...), falling back to the current primary
  when no lite mapping exists.
- Reuses the main agent's system prompt + last few messages as the
  LLM input. When primary and lite share a provider, this lands on a
  warm prompt cache; otherwise the absolute cost is still small (short
  prompt, ≤40 output tokens).
- Silent on any failure. A failed suggestion call never disturbs the
  user's actual task result.
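
The lite-model fallback described in the notes above can be sketched as a simple lookup. This is only an illustration: the EXAMPLE_LITE_MODELS table, the model names in it, and the suggestion_model_for helper are all hypothetical and are not part of Octo's API.

```ruby
# Hypothetical mapping from a primary model to its cheaper "lite" sibling.
# The names are placeholders, not real provider identifiers.
EXAMPLE_LITE_MODELS = {
  "example-pro" => "example-flash"
}.freeze

# Returns the lite counterpart when one is mapped; otherwise falls back
# to the primary model itself, as the design notes describe.
def suggestion_model_for(primary)
  EXAMPLE_LITE_MODELS.fetch(primary, primary)
end

suggestion_model_for("example-pro")   # mapped: uses the lite model
suggestion_model_for("example-other") # unmapped: stays on the primary
```

Using fetch with a default keeps the fallback in one expression, so an unmapped provider never needs special handling at the call site.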

Constant Summary

MAX_SUGGESTION_CHARS = Max characters in a suggestion that we’ll forward to the UI. Anything longer is treated as a model misfire and dropped (it’s supposed to be a phrase, not a paragraph).

RECENT_HISTORY_LIMIT = How many recent message pairs to send as context. 4 is plenty for the model to read the situation; more just inflates the cache footprint.

SUGGESTION_MAX_TOKENS = Output budget — short phrases only.
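
A rough sketch of how limits like these might be applied. The constant values and both helper names below are assumptions made for illustration: the page does not show the real value of MAX_SUGGESTION_CHARS, and the sketch assumes history is a flat array in which one "pair" is two consecutive entries.

```ruby
# Placeholder values for illustration only; 80 is an assumed length cap,
# 4 matches the number of pairs mentioned in the description above.
EXAMPLE_MAX_SUGGESTION_CHARS = 80
EXAMPLE_RECENT_HISTORY_LIMIT = 4

# Returns the trimmed suggestion, or nil when it should be dropped
# (blank, or too long to be a plausible short phrase).
def sanitize_suggestion(raw, max_chars: EXAMPLE_MAX_SUGGESTION_CHARS)
  text = raw.to_s.strip
  return nil if text.empty?
  return nil if text.length > max_chars

  text
end

# Keeps only the most recent message pairs (two entries per pair).
def recent_window(history, pairs: EXAMPLE_RECENT_HISTORY_LIMIT)
  history.last(pairs * 2)
end
```

Dropping an over-long suggestion outright, rather than truncating it, matches the stated intent of treating it as a model misfire.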

Instance Method Summary

#next_message_suggestion_enabled? ⇒ Boolean

Trigger predicate.

#run_next_message_suggestion! ⇒ Object

Spawn the suggestion call in a daemon thread.

Instance Method Details

#next_message_suggestion_enabled? ⇒ `Boolean`

Trigger predicate. Cheap; called on the agent thread.

Returns:

(Boolean)

# File 'lib/octo/agent/next_message_suggester.rb', line 42

def next_message_suggestion_enabled?
  return false unless @config.respond_to?(:next_message_suggestion_enabled)
  return false unless @config.next_message_suggestion_enabled
  return false if @is_subagent
  true
end
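
To see how the three guards interact, here is a small stand-alone sketch. DemoAgent, EnabledConfig, and LegacyConfig are hypothetical stand-ins, not Octo classes; only the predicate body is copied from the listing above.

```ruby
# Stand-in for the agent: only the state the predicate reads.
class DemoAgent
  def initialize(config, is_subagent: false)
    @config = config
    @is_subagent = is_subagent
  end

  # Same logic as the listing above.
  def next_message_suggestion_enabled?
    return false unless @config.respond_to?(:next_message_suggestion_enabled)
    return false unless @config.next_message_suggestion_enabled
    return false if @is_subagent
    true
  end
end

# A config that knows about the flag, and an older one that does not.
EnabledConfig = Struct.new(:next_message_suggestion_enabled)
LegacyConfig  = Struct.new(:model)

DemoAgent.new(EnabledConfig.new(true)).next_message_suggestion_enabled?
DemoAgent.new(LegacyConfig.new("example-pro")).next_message_suggestion_enabled?
DemoAgent.new(EnabledConfig.new(true), is_subagent: true).next_message_suggestion_enabled?
```

The respond_to? guard means a config object without the flag simply disables the feature instead of raising NoMethodError.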

#run_next_message_suggestion! ⇒ `Object`

Spawn the suggestion call in a daemon thread. Returns immediately.

# File 'lib/octo/agent/next_message_suggester.rb', line 50

def run_next_message_suggestion!
  return unless next_message_suggestion_enabled?
  return unless @ui

  # Snapshot the agent state we need on the worker thread so we don't
  # race with the next user turn mutating @history / @todos in place.
  history_snapshot = recent_history_for_suggestion
  return if history_snapshot.empty?

  todos_snapshot = (@todos || []).map { |t| t.is_a?(Hash) ? t.dup : t }
  ui = @ui

  Thread.new do
    text = generate_next_message_suggestion(history_snapshot, todos_snapshot)
    ui.show_next_message_suggestion(text) if text && !text.empty?
  rescue StandardError => e
    Octo::Logger.warn(
      "next_message_suggestion.failed",
      session_id: @session_id,
      error_class: e.class.name,
      error_message: e.message
    )
  end
end
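
The fire-and-forget pattern and the UI hook can be tried in isolation with the sketch below. DemoBaseUI, DemoWebUI, and spawn_suggestion are hypothetical names, and the generator block stands in for the real LLM call. The join calls exist only so the demo can observe the result; the real method returns without waiting.

```ruby
# Hypothetical UI base class: the hook is a no-op unless a UI overrides it.
class DemoBaseUI
  def show_next_message_suggestion(_text); end
end

# Hypothetical web UI: stores the suggestion as the input placeholder.
class DemoWebUI < DemoBaseUI
  attr_reader :placeholder

  def show_next_message_suggestion(text)
    @placeholder = text
  end
end

# Runs the generator on its own thread and returns that thread at once.
# Any StandardError is reported on stderr and never reaches the caller.
def spawn_suggestion(ui, &generator)
  Thread.new do
    text = generator.call
    ui.show_next_message_suggestion(text) if text && !text.empty?
  rescue StandardError => e
    warn "suggestion failed: #{e.class}: #{e.message}"
  end
end

web = DemoWebUI.new
spawn_suggestion(web) { "run the tests" }.join
web.placeholder # the suggestion delivered by the worker thread

# A failing generator leaves the previous placeholder untouched.
spawn_suggestion(web) { raise "provider timeout" }.join
```

Because the rescue lives inside the thread body, a failed suggestion cannot surface as an exception on the agent thread.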