Class: Brute::Middleware::LLMCall

Inherits: Object
Defined in:
lib/brute/middleware/llm_call.rb

Overview

The terminal “app” in the pipeline: it performs the actual LLM call.

Builds a fresh LLM::Context per call from env, loads the existing conversation history, makes the call, extracts the newly appended messages back into env, and stashes any pending functions in env.

When streaming, on_content fires incrementally via AgentStream. When not streaming, on_content fires post-hoc with the full response text.
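A sketch of the env hash this middleware expects, with key names taken from the #call implementation below; the concrete values and the callback shape are illustrative assumptions:

```ruby
# Hypothetical env for a single pipeline pass. Key names (:input,
# :messages, :streaming, :callbacks, :pending_functions) come from
# #call; everything else is an assumption for illustration.
env = {
  input:     "What is 2 + 2?",
  messages:  [],                      # conversation history, mutated in place
  streaming: false,                   # post-hoc on_content when false
  callbacks: {
    on_content: ->(text) { print text }
  }
}
```

After a pass through the middleware, env[:messages] holds the updated history and env[:pending_functions] holds any function calls for the agent loop.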

Instance Method Summary

Instance Method Details

#call(env) ⇒ Object



# File 'lib/brute/middleware/llm_call.rb', line 18

def call(env)
  ctx = build_context(env)

  # Load existing conversation history into the ephemeral context
  ctx.messages.concat(env[:messages])

  response = ctx.talk(env[:input])

  # Extract new messages appended by talk() and store them
  new_messages = ctx.messages.to_a.drop(env[:messages].size)
  env[:messages].concat(new_messages)

  # Stash pending functions for the agent loop
  env[:pending_functions] = ctx.functions.to_a

  # Only fire on_content post-hoc when NOT streaming
  # (streaming delivers chunks incrementally via AgentStream)
  unless env[:streaming]
    if (cb = env.dig(:callbacks, :on_content)) && response
      text = safe_content(response)
      cb.call(text) if text
    end
  end

  response
end
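A minimal sketch of the non-streaming path, using a stubbed context in place of Brute's real LLM::Context; the stub, its talk return shape, and the inline callback dispatch are assumptions made for illustration:

```ruby
# Stubbed context: appends messages the way a real context would after talk().
FakeContext = Struct.new(:messages, :functions) do
  def talk(input)
    messages << { role: "user", content: input }
    messages << { role: "assistant", content: "4" }
    { content: "4" } # stand-in for the real response object
  end
end

env = {
  input:     "What is 2 + 2?",
  messages:  [{ role: "user", content: "hi" },
              { role: "assistant", content: "hello" }],
  streaming: false,
  callbacks: { on_content: ->(text) { puts text } }
}

ctx = FakeContext.new(env[:messages].dup, [])
response = ctx.talk(env[:input])

# drop(size) skips the preloaded history, keeping only what talk() appended
new_messages = ctx.messages.drop(env[:messages].size)
env[:messages].concat(new_messages)

# Post-hoc delivery: fires once with the full text, only when not streaming
env[:callbacks][:on_content].call(response[:content]) unless env[:streaming]
```

Because the context is rebuilt per call and seeded from env[:messages], drop(env[:messages].size) reliably isolates the messages added by the call itself.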