Class: Phronomy::Agent::Base

Inherits:
Object
Includes:
Concerns::BeforeCompletion, Concerns::ErrorTranslation, Concerns::Guardrailable, Concerns::Retryable, Concerns::Suspendable, Runnable
Defined in:
lib/phronomy/agent/base.rb

Overview

Base class for all Phronomy agents.

Subclass this to create a conversational agent powered by an LLM. DSL class methods configure the model, instructions, tools, memory, and retry behaviour. Instance methods handle invocation.

Examples:

Minimal agent

class GreetingAgent < Phronomy::Agent::Base
  model "gpt-4o-mini"
  instructions "You are a friendly greeter."
end
result = GreetingAgent.new.invoke("Hello!")
puts result[:output]

Agent with tools

class ResearchAgent < Phronomy::Agent::Base
  model "gpt-4o"
  instructions "You are a research assistant."
  tools WebSearchTool, CalculatorTool
  max_iterations 15
end

Direct Known Subclasses

Orchestrator, ReactAgent

Instance Attribute Summary

Attributes included from Concerns::BeforeCompletion

#before_completion

Class Method Summary

Instance Method Summary

Methods included from Concerns::Suspendable

#on_approval_required, #resume

Methods included from Concerns::BeforeCompletion

included

Methods included from Concerns::Guardrailable

#add_input_guardrail, #add_output_guardrail

Methods included from Concerns::Retryable

included

Methods included from Runnable

#batch, #trace

Class Method Details

._on_compact_callback ⇒ Proc?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns:

  • (Proc, nil)


# File 'lib/phronomy/agent/base.rb', line 379

def _on_compact_callback
  @on_compact_callback
end

._on_compaction_trigger_callback ⇒ Proc?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns:

  • (Proc, nil)


# File 'lib/phronomy/agent/base.rb', line 355

def _on_compaction_trigger_callback
  @on_compaction_trigger_callback
end

._on_trim_callback ⇒ Proc?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns:

  • (Proc, nil)


# File 'lib/phronomy/agent/base.rb', line 330

def _on_trim_callback
  @on_trim_callback
end

.cache_instructions(enabled = nil) ⇒ Object

When enabled, attaches Anthropic prompt-cache markers to the system message so that the fixed instructions are served from cache on subsequent turns, reducing input-token costs.

Only has an effect when the agent also declares provider :anthropic. The cache_control field is provider-specific (the format differs between Anthropic direct, Bedrock, etc.), so the agent must explicitly declare its provider via the DSL rather than having it inferred from the model name.

Examples:

class MyAgent < Phronomy::Agent::Base
  provider :anthropic
  cache_instructions true
end


# File 'lib/phronomy/agent/base.rb', line 399

def cache_instructions(enabled = nil)
  if enabled.nil?
    @cache_instructions
  else
    @cache_instructions = enabled
  end
end

.context_overhead(val = nil) ⇒ Object

Sets or reads the number of tokens reserved for the system prompt and tool definitions. This value is subtracted from the context window before the memory budget is computed. Defaults to 0.

Examples:

class MyAgent < Phronomy::Agent::Base
  context_overhead 500
end


# File 'lib/phronomy/agent/base.rb', line 449

def context_overhead(val = nil)
  if val.nil?
    @context_overhead || 0
  else
    @context_overhead = val.to_i
  end
end

.context_window(val = nil) ⇒ Object

Overrides the context window size used for token budget calculations. When set, this value takes precedence over the RubyLLM model registry, which is useful for locally-hosted models (e.g. LM Studio) where the actually-loaded context length may differ from the catalogue value.

Examples:

class MyAgent < Phronomy::Agent::Base
  context_window 4096
end


# File 'lib/phronomy/agent/base.rb', line 433

def context_window(val = nil)
  if val.nil?
    @context_window
  else
    @context_window = val.to_i
  end
end

.instructions(text = nil) { ... } ⇒ String, ...

Sets or reads the system instructions for this agent. Accepts a String, a PromptTemplate, or a block (Proc). When used as a reader (no argument, no block), returns the stored value.

Examples:

String instructions

class MyAgent < Phronomy::Agent::Base
  instructions "You are a helpful assistant."
end

Block instructions

class MyAgent < Phronomy::Agent::Base
  instructions { |input| "Answer in #{input[:lang]}." }
end

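Parameters:

  • text (String, PromptTemplate, nil) (defaults to: nil)

    the instructions; omit it to supply a block or to read the stored value

Yields:

  • optionally provide instructions as a block

Returns:

  • (String, PromptTemplate, Proc, nil)

    the stored instructions when used as a reader
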

# File 'lib/phronomy/agent/base.rb', line 79

def instructions(text = nil, &block)
  if text || block_given?
    @instructions = text || block
  else
    return @instructions if instance_variable_defined?(:@instructions)
    superclass.respond_to?(:instructions) ? superclass.instructions : nil
  end
end
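
The String and block forms differ in when the text is produced: a String is fixed at class-definition time, while a block is evaluated per invocation against the input. The following is a minimal standalone sketch of that distinction. `resolve_instructions` is a hypothetical helper, not a Phronomy method, and PromptTemplate handling is left out because its API is not shown on this page.

```ruby
# Hypothetical resolver for illustration only; Phronomy's internal
# resolution logic is not shown on this page.
def resolve_instructions(stored, input)
  case stored
  when Proc then stored.call(input)  # block form: evaluated per invocation
  when nil  then nil                 # nothing configured
  else stored.to_s                   # String form: used as-is
  end
end

static  = "You are a helpful assistant."
dynamic = ->(input) { "Answer in #{input[:lang]}." }

static_text  = resolve_instructions(static, {lang: "French"})
dynamic_text = resolve_instructions(dynamic, {lang: "French"})

puts static_text
puts dynamic_text
```

The block form is the natural choice when the system prompt depends on per-request values such as a language or a user role.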

.invoke_timeout(val = nil) ⇒ Numeric?

Sets or reads the per-invocation timeout (in seconds) for EventLoop-mode agent calls. When set, +invoke+ raises TimeoutError if the agent does not finish within the given number of seconds.

Has no effect when EventLoop mode is disabled (direct invoke path). Defaults to +nil+ (no timeout). Inherited by subclasses; the most-specific definition wins.

Note: +invoke_timeout+ is a wait timeout, not a cancellation. When the timeout fires, +Phronomy::TimeoutError+ is raised to the caller, but the background agent thread and any in-flight LLM or tool calls are not interrupted — they continue running until they complete naturally. The agent therefore keeps consuming threads, memory, and external API credits after the caller has already received the error. True cancellation is not yet supported.

Examples:

class MyAgent < Phronomy::Agent::Base
  invoke_timeout 30
end

Parameters:

  • val (Numeric, nil) (defaults to: nil)

Returns:

  • (Numeric, nil)


# File 'lib/phronomy/agent/base.rb', line 244

def invoke_timeout(val = nil)
  if val.nil?
    return @invoke_timeout if defined?(@invoke_timeout)
    superclass.respond_to?(:invoke_timeout) ? superclass.invoke_timeout : nil
  else
    unless val.is_a?(Numeric) && val > 0
      raise ArgumentError,
        "invoke_timeout must be a positive number, got #{val.inspect}"
    end
    @invoke_timeout = val
  end
end
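
The difference between a wait timeout and a cancellation can be reproduced with the Ruby standard library alone. The sketch below does not use Phronomy; the worker thread is a stand-in for the background agent, and the durations are arbitrary. It shows the caller giving up on `Thread::Queue#pop` while the worker keeps running to completion.

```ruby
require "timeout"

queue = Thread::Queue.new
finished = false

# Stand-in for the background agent work (LLM and tool calls).
worker = Thread.new do
  sleep 0.3
  finished = true
  queue << :result
end

timed_out = false
begin
  # The caller waits for at most 50 ms; only the wait is abandoned.
  Timeout.timeout(0.05) { queue.pop }
rescue Timeout::Error
  timed_out = true
end

puts "caller timed out: #{timed_out}"

# The worker was never interrupted, so it still finishes its work.
worker.join
puts "worker finished anyway: #{finished}"
```

If the abandoned work is expensive, one option is to bound it from the inside as well, for example with a lower max_iterations or provider-level request timeouts.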

.max_iterations(val = nil) ⇒ Integer

Sets or reads the maximum number of LLM call cycles for ReAct agents. Each tool call and follow-up counts as one iteration. Defaults to 10.

Examples:

class MyAgent < Phronomy::Agent::Base
  max_iterations 5
end

Parameters:

  • val (Integer, nil) (defaults to: nil)

Returns:

  • (Integer)


# File 'lib/phronomy/agent/base.rb', line 186

def max_iterations(val = nil)
  if val
    @max_iterations = val
  else
    @max_iterations || 10
  end
end

.max_output_tokens(val = nil) ⇒ Object

Tokens to reserve for the model's output. When nil, the model's max_output_tokens from the registry is used.

Examples:

class MyAgent < Phronomy::Agent::Base
  max_output_tokens 4096
end


# File 'lib/phronomy/agent/base.rb', line 415

def max_output_tokens(val = nil)
  if val.nil?
    @max_output_tokens
  else
    @max_output_tokens = val.to_i
  end
end
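
The three budget settings (context_window, context_overhead, max_output_tokens) can be read together as a simple subtraction. The sketch below is a worked example under that assumption; `memory_budget` is a hypothetical helper, and the exact formula Phronomy applies internally is not shown on this page.

```ruby
# Hypothetical helper, not part of Phronomy. Assumes the memory budget is
# what remains of the context window after the fixed reservations.
def memory_budget(context_window:, context_overhead: 0, max_output_tokens: 0)
  budget = context_window - context_overhead - max_output_tokens
  budget.negative? ? 0 : budget  # never report a negative budget
end

budget = memory_budget(
  context_window: 4096,     # e.g. a locally hosted model
  context_overhead: 500,    # system prompt and tool definitions
  max_output_tokens: 1024   # reserved for the reply
)
puts budget
```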

.max_parallel_tools(val = nil) ⇒ Integer

Sets or reads the maximum number of tool calls executed concurrently when the LLM returns multiple tool calls in a single response (ParallelToolChat mode, active inside an AgentFSM IO thread).

Defaults to 10. Set to 1 to force sequential execution. Inherited by subclasses; the most-specific definition wins.

Examples:

class MyAgent < Phronomy::Agent::Base
  max_parallel_tools 4
end

Parameters:

  • val (Integer, nil) (defaults to: nil)

Returns:

  • (Integer)


# File 'lib/phronomy/agent/base.rb', line 208

def max_parallel_tools(val = nil)
  if val.nil?
    @max_parallel_tools ||
      (superclass.respond_to?(:max_parallel_tools) ? superclass.max_parallel_tools : 10)
  else
    unless val.is_a?(Integer) && val >= 1
      raise ArgumentError,
        "max_parallel_tools must be a positive Integer (>= 1), got #{val.inspect}"
    end
    @max_parallel_tools = val
  end
end
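
The "most-specific definition wins" rule follows from the superclass fallback in the reader branch. The sketch below copies that reader/writer pattern into a standalone stand-in class (`SketchBase` and its subclasses are hypothetical, not Phronomy classes) so the inheritance behaviour can be tried in isolation.

```ruby
# Stand-in base class reproducing the reader/writer pattern from the
# listing above; it is not the real Phronomy::Agent::Base.
class SketchBase
  def self.max_parallel_tools(val = nil)
    if val.nil?
      # Reader: own value first, then the nearest ancestor, then the default.
      @max_parallel_tools ||
        (superclass.respond_to?(:max_parallel_tools) ? superclass.max_parallel_tools : 10)
    else
      unless val.is_a?(Integer) && val >= 1
        raise ArgumentError, "must be a positive Integer, got #{val.inspect}"
      end
      @max_parallel_tools = val
    end
  end
end

class ParentAgent < SketchBase
  max_parallel_tools 4
end

class ChildAgent < ParentAgent; end   # no own value: inherits from ParentAgent

class SequentialAgent < ParentAgent
  max_parallel_tools 1                # own value: overrides the inherited one
end

p SketchBase.max_parallel_tools
p ParentAgent.max_parallel_tools
p ChildAgent.max_parallel_tools
p SequentialAgent.max_parallel_tools
```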

.model(name = nil) ⇒ String?

Sets or reads the LLM model identifier for this agent. When called without an argument, returns the stored model or the global default from Phronomy.configuration.

Examples:

class MyAgent < Phronomy::Agent::Base
  model "gpt-4o"
end

Parameters:

  • name (String, nil) (defaults to: nil)

    model identifier (e.g. "gpt-4o", "claude-3-5-sonnet")

Returns:

  • (String, nil)

    the model name when used as a reader



# File 'lib/phronomy/agent/base.rb', line 55

def model(name = nil)
  if name
    @model = name
  else
    @model || Phronomy.configuration.default_model
  end
end

.on_compact {|ctx| ... } ⇒ Object

Registers a callback that performs the actual compaction when the +on_compaction_trigger+ callback fires. The block receives a Context::CompactionContext and should call +ctx.compact+ to specify which messages to summarise.

Examples:

Replace the first 4 messages with a short summary

on_compact do |ctx|
  ctx.compact(0..3) do |elements|
    texts = elements.map { |e| e[:message].content }.join(" | ")
    "Earlier conversation summary: #{texts}"
  end
end

Yields:

  • (ctx)

    Phronomy::Context::CompactionContext



# File 'lib/phronomy/agent/base.rb', line 373

def on_compact(&block)
  @on_compact_callback = block
end

.on_compaction_trigger {|ctx| ... } ⇒ Boolean

Registers a callback that decides whether compaction should run. Evaluated before every LLM call (after on_trim). If the block returns truthy AND an +on_compact+ callback is also registered, the compact pipeline is executed.

The block receives a read-only Context::TriggerContext.

Examples:

Trigger when messages exceed 70% of token budget

on_compaction_trigger do |ctx|
  limit = ctx.budget&.available(used: 0) || Float::INFINITY
  ctx.total_tokens > limit * 0.7
end

Yields:

  • (ctx)

    Phronomy::Context::TriggerContext

Returns:

  • (Boolean)

    truthy → run on_compact; falsy → skip



# File 'lib/phronomy/agent/base.rb', line 349

def on_compaction_trigger(&block)
  @on_compaction_trigger_callback = block
end

.on_trim {|ctx| ... } ⇒ Object

Registers a callback that is invoked before every LLM call so the application can remove stale or irrelevant messages from the conversation history.

The block receives a Context::TrimContext and may call +ctx.remove(seqs)+ to drop messages by seq number. Changes affect only the current invocation; the underlying memory store is unchanged.

Examples:

Drop the oldest message when over 80% of budget is used

on_trim do |ctx|
  limit = ctx.budget&.available(used: 0) || Float::INFINITY
  ctx.remove(ctx.message_elements.first[:seq]) if ctx.total_tokens > limit * 0.8
end

Yields:

  • (ctx)

    Phronomy::Context::TrimContext



# File 'lib/phronomy/agent/base.rb', line 324

def on_trim(&block)
  @on_trim_callback = block
end
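
The statement that trimming affects only the current invocation can be pictured with a context that works on a copy of the history. The sketch below is an assumption-laden stand-in: `SketchTrimContext` is hypothetical, it models only the calls used in the example above (message_elements, total_tokens, remove), and the `:tokens` key on each element is invented for illustration.

```ruby
# Stand-in context, loosely modelled on the calls used in the example above.
# Not the real Phronomy::Context::TrimContext.
class SketchTrimContext
  attr_reader :message_elements

  def initialize(elements)
    @message_elements = elements.dup  # work on a copy of the history
  end

  def total_tokens
    @message_elements.sum { |e| e[:tokens] }  # :tokens is an assumed key
  end

  def remove(seqs)
    seqs = Array(seqs)
    @message_elements.reject! { |e| seqs.include?(e[:seq]) }
  end
end

store = [
  {seq: 1, tokens: 400},
  {seq: 2, tokens: 300},
  {seq: 3, tokens: 200}
]
limit = 1000

# Same rule as the on_trim example: drop the oldest message above 80%.
trim = lambda do |ctx|
  ctx.remove(ctx.message_elements.first[:seq]) if ctx.total_tokens > limit * 0.8
end

ctx = SketchTrimContext.new(store)
trim.call(ctx)

p ctx.message_elements.map { |e| e[:seq] }  # view used for this call
p store.length                              # backing store is untouched
```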

.provider(name = nil) ⇒ Symbol?

Sets or reads the LLM provider for this agent. Required when using a model not registered in RubyLLM's model registry (e.g. locally-hosted models via LM Studio or Ollama).

Examples:

class MyAgent < Phronomy::Agent::Base
  model "openai/gpt-oss-20b"
  provider :openai
end

Parameters:

  • name (Symbol, nil) (defaults to: nil)

    e.g. +:openai+, +:anthropic+, +:ollama+

Returns:

  • (Symbol, nil)


# File 'lib/phronomy/agent/base.rb', line 149

def provider(name = nil)
  if name
    @provider = name
  else
    return @provider if instance_variable_defined?(:@provider)
    superclass.respond_to?(:provider) ? superclass.provider : nil
  end
end

.static_knowledge(*sources) ⇒ Object

Registers one or more static knowledge sources on the agent class. Static source content is fetched and memoized at the class level the first time +invoke+ is called. The cache persists for the lifetime of the process; call static_knowledge_refresh! to force a reload.

Examples:

class PolicyAgent < Phronomy::Agent::Base
  static_knowledge Phronomy::KnowledgeSource::StaticKnowledge.new(POLICY_TEXT)
end

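Parameters:

  • sources (Array<Phronomy::KnowledgeSource::Base>)

    one or more static knowledge sources; nested arrays are flattened
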


# File 'lib/phronomy/agent/base.rb', line 268

def static_knowledge(*sources)
  @static_knowledge_sources = sources.flatten
  # Invalidate the cached chunks so the new sources are fetched on
  # the next call to static_knowledge_chunks.
  @static_knowledge_chunks = nil
end

.static_knowledge_chunks ⇒ Array<Hash>

Returns the fetched content from all static knowledge sources. Results are cached at the class level so that each source is fetched only once regardless of how many times the agent is invoked.

Returns:

  • (Array<Hash>)


# File 'lib/phronomy/agent/base.rb', line 287

def static_knowledge_chunks
  @static_knowledge_chunks ||= static_knowledge_sources.flat_map { |ks|
    ks.fetch(query: nil)
  }
end

.static_knowledge_refresh! ⇒ nil

Clears the class-level knowledge cache so that the next +invoke+ call re-fetches content from all registered static knowledge sources.

Call this method when the underlying knowledge source has been updated at runtime (e.g. a file was rewritten, a DB record changed) and you want the agent to pick up the new content without restarting the process.

Examples:

Refresh after updating a knowledge file

MyAgent.static_knowledge_refresh!

Returns:

  • (nil)


# File 'lib/phronomy/agent/base.rb', line 305

def static_knowledge_refresh!
  @static_knowledge_chunks = nil
end
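
The fetch-once-then-refresh behaviour comes from `||=` memoization in a class-level instance variable. The sketch below reproduces the static-knowledge methods on a standalone stand-in class so the caching can be observed. `CountingSource` and `SketchAgent` are hypothetical, and the `{content: ...}` chunk shape is an assumption; only the `fetch(query:)` call is taken from the listing above.

```ruby
# Hypothetical knowledge source that counts how often it is fetched.
class CountingSource
  attr_reader :fetch_count

  def initialize(text)
    @text = text
    @fetch_count = 0
  end

  def fetch(query: nil)
    @fetch_count += 1
    [{content: @text}]  # chunk shape is an assumption
  end
end

# Stand-in for the class-level cache; not the real agent base class.
class SketchAgent
  class << self
    def static_knowledge(*sources)
      @static_knowledge_sources = sources.flatten
      @static_knowledge_chunks = nil  # invalidate the cache
    end

    def static_knowledge_sources
      @static_knowledge_sources || []
    end

    def static_knowledge_chunks
      @static_knowledge_chunks ||=
        static_knowledge_sources.flat_map { |ks| ks.fetch(query: nil) }
    end

    def static_knowledge_refresh!
      @static_knowledge_chunks = nil
    end
  end
end

source = CountingSource.new("Refunds within 30 days.")
SketchAgent.static_knowledge(source)

2.times { SketchAgent.static_knowledge_chunks }
puts source.fetch_count  # the second call was served from the cache

SketchAgent.static_knowledge_refresh!
SketchAgent.static_knowledge_chunks
puts source.fetch_count  # the refresh forced one more fetch
```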

.static_knowledge_sources ⇒ Array<Phronomy::KnowledgeSource::Base>

Returns the registered static knowledge sources.



# File 'lib/phronomy/agent/base.rb', line 278

def static_knowledge_sources
  @static_knowledge_sources || []
end

.temperature(val = nil) ⇒ Float?

Sets or reads the sampling temperature sent to the LLM. When nil, the provider's default is used.

Examples:

class MyAgent < Phronomy::Agent::Base
  temperature 0.2
end

Parameters:

  • val (Float, nil) (defaults to: nil)

    temperature (0.0 to 2.0 depending on provider)

Returns:

  • (Float, nil)


# File 'lib/phronomy/agent/base.rb', line 168

def temperature(val = nil)
  if val
    @temperature = val
  else
    @temperature
  end
end

.tool_aliases ⇒ Hash{Class => String}

Returns the alias map registered via the hash form of .tools. Merges parent class aliases so subclasses inherit their parent's mappings. Subclass-specific aliases take precedence over parent aliases.

Returns:

  • (Hash{Class => String})


# File 'lib/phronomy/agent/base.rb', line 128

def tool_aliases
  own = @tool_aliases || {}
  if superclass.respond_to?(:tool_aliases)
    superclass.tool_aliases.merge(own)
  else
    own
  end
end

.tools(*args) ⇒ Object

Registers tool classes for this agent.

Accepts either a splat of classes (backward-compatible) or a Hash mapping each class to an explicit alias name (String) or nil (use tool's own name). The alias form is useful when two tools share the same auto-generated name (e.g. two SearchTool classes from different modules).

Examples:

Splat form (no alias)

tools WeatherTool, TimeTool

Hash form (with optional per-tool alias)

tools(
  Weather::SearchTool => "weather_search",
  Places::SearchTool  => "places_search",
  CurrentTimeTool     => nil
)


# File 'lib/phronomy/agent/base.rb', line 105

def tools(*args)
  if args.empty?
    if instance_variable_defined?(:@tools)
      return @tools
    end
    return superclass.respond_to?(:tools) ? superclass.tools : []
  end

  if args.length == 1 && args.first.is_a?(Hash)
    hash = args.first
    @tools = hash.keys
    @tool_aliases = hash.transform_values { |v| v&.to_s }.reject { |_, v| v.nil? }
  else
    @tools = args
    @tool_aliases = {}
  end
end
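
The Hash form reduces to two plain Hash operations: nil values are dropped from the alias map, and a subclass's aliases are merged over its parent's. The sketch below runs the same expressions on stand-in classes. The empty tool classes and the `parent_aliases` map are hypothetical and exist only so the snippet runs without Phronomy.

```ruby
# Plain stand-in classes; real tools would be Phronomy tool classes.
module Weather; class SearchTool; end; end
module Places;  class SearchTool; end; end
class CurrentTimeTool; end

registered = {
  Weather::SearchTool => :weather_search,  # non-String values are converted
  Places::SearchTool  => "places_search",
  CurrentTimeTool     => nil               # nil: keep the tool's own name
}

# Same expressions as the hash branch of .tools above.
tools   = registered.keys
aliases = registered.transform_values { |v| v&.to_s }.reject { |_, v| v.nil? }

p tools
p aliases

# Same merge direction as .tool_aliases: the subclass entry wins.
parent_aliases = {Weather::SearchTool => "search"}
merged = parent_aliases.merge(aliases)
p merged
```

Note that calling `tools` again in the splat form resets the class's own alias map to an empty Hash, so aliases from an earlier Hash-form call on the same class are discarded.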

Instance Method Details

#_add_handoff_tool(tool_class) ⇒ self

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Registers an anonymous handoff tool class on this agent instance. Called by Runner during construction when routes are configured.

Parameters:

  • tool_class (Class)

    the handoff tool class to register

Returns:

  • (self)


# File 'lib/phronomy/agent/base.rb', line 463

def _add_handoff_tool(tool_class)
  @_handoff_tools ||= []
  @_handoff_tools << tool_class
  self
end

#_handoff_tools ⇒ Array<Class>

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns handoff tool classes registered on this instance by Runner.

Returns:

  • (Array<Class>)


# File 'lib/phronomy/agent/base.rb', line 472

def _handoff_tools
  @_handoff_tools || []
end

#context_version_cache ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns the Context::ContextVersionCache built during the most recent #invoke call on this agent instance. The thread-local cache entry is cleaned up in the +ensure+ block of #invoke, but a reference is kept in +@last_context_version_cache+ so callers can inspect it after invoke returns.

NOTE: Not thread-safe. When the same Agent instance is used concurrently, +@last_context_version_cache+ reflects the most recent +invoke+ on any thread. For per-invocation isolation, use a separate Agent instance per thread.



# File 'lib/phronomy/agent/base.rb', line 641

def context_version_cache
  @last_context_version_cache
end

#invoke(input, messages: [], thread_id: nil, config: {}) ⇒ Hash

Invokes the agent with the given input and returns a result Hash. Applies the retry policy configured via retry_policy when transient errors occur. GuardrailError is never retried.

Examples:

Normal invocation

result = MyAgent.new.invoke("What is Ruby?")
puts result[:output]

Multi-turn conversation

result1 = agent.invoke("Hi, I'm Alice.")
result2 = agent.invoke("What's my name?", messages: result1[:messages])

Suspend / resume flow

result = agent.invoke("Perform task X")
if result[:suspended]
  result = agent.resume(result[:checkpoint], approved: true)
end
puts result[:output]

Parameters:

  • input (String, Hash)

    the user message; a Hash may supply +:message+, +:query+, or +:user+ as the text key, plus any template variables consumed by the configured instructions template.

  • messages (Array<RubyLLM::Message>) (defaults to: [])

    conversation history from a previous invocation. The application owns and persists this array; pass it on every turn to maintain multi-turn context.

  • thread_id (String, nil) (defaults to: nil)

    conversation thread identifier, forwarded to the compaction context when on_compact is configured.

  • config (Hash) (defaults to: {})

    additional runtime options:

      • +:knowledge_sources+ (Array): dynamic knowledge sources for this turn
      • +:user_id+ (String, optional): caller identity forwarded to the tracer
      • +:session_id+ (String, optional): session identity forwarded to the tracer

Returns:

  • (Hash)

    +{ output: String, messages: Array, usage: Phronomy::TokenUsage }+, or +{ output: nil, suspended: true, checkpoint: Phronomy::Agent::Checkpoint, messages: Array }+ when the invocation was suspended awaiting tool approval.

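Raises:

  • (Phronomy::Error)

    in EventLoop mode, when called from the EventLoop thread itself

  • (Phronomy::TimeoutError)

    in EventLoop mode, when invoke_timeout is set and the agent does not finish in time
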

# File 'lib/phronomy/agent/base.rb', line 509

def invoke(input, messages: [], thread_id: nil, config: {})
  if Phronomy.configuration.event_loop
    # Protect against blocking the EventLoop thread itself.
    if Thread.current[:phronomy_event_loop_thread]
      raise Phronomy::Error,
        "Cannot call Agent#invoke (EventLoop mode) from within an EventLoop " \
        "entry action. Use agent.run_as_child(input, ctx: ctx) instead."
    end

    fsm = Agent::FSM.new(
      agent: self,
      input: input,
      messages: messages,
      thread_id: thread_id || SecureRandom.uuid,
      config: config
    )
    completion_queue = Phronomy::EventLoop.instance.register(fsm)
    timeout_sec = self.class.invoke_timeout
    result = if timeout_sec
      begin
        Timeout.timeout(timeout_sec) { completion_queue.pop }
      rescue Timeout::Error
        raise Phronomy::TimeoutError,
          "Agent #{self.class.name} invoke timed out after #{timeout_sec}s"
      end
    else
      completion_queue.pop
    end
    raise result if result.is_a?(Exception)
    result
  else
    _invoke_impl(input, messages: messages, thread_id: thread_id, config: config)
  end
ensure
  # Remove this agent's context cache entry from the current thread to
  # prevent unbounded growth of the thread-local hash in long-lived
  # processes (e.g. Rails servers).
  Thread.current[:phronomy_context_version_caches]&.delete(object_id)
end

#run_as_child(input, ctx:, messages: [], config: {}) {|Hash| ... } ⇒ nil

Registers this agent as a child AgentFSM inside the given Workflow context.

Use this method from a Workflow entry action (running on the EventLoop thread) instead of #invoke. In EventLoop mode +invoke+ blocks on a +Thread::Queue+, which would deadlock the EventLoop thread, so it raises Phronomy::Error when called from there.

The agent runs asynchronously in a background IO thread. When it finishes, the parent FSMSession receives a +:child_completed+ event whose payload is the result hash +{ output:, messages:, usage: }+. Declare an +on: :child_completed+ transition in your Workflow to advance to the next state.

An optional block may be provided to write the result back into the parent WorkflowContext before the +:child_completed+ event is dispatched. +Thread::Queue+ provides the happens-before guarantee, so no Mutex is needed.

Examples:

Without block (result available only as event payload)

entry :run_agent, ->(ctx) { MyAgent.new.run_as_child(ctx.query, ctx: ctx) }
transition from: :run_agent, on: :child_completed, to: :process_result

With block (writes result into context)

entry :run_agent, ->(ctx) {
  MyAgent.new.run_as_child(ctx.query, ctx: ctx) { |r| ctx.answer = r[:output] }
}
transition from: :run_agent, on: :child_completed, to: :process_result

Parameters:

  • input (String, Hash)

    user input passed to the agent

  • ctx (Object)

    a WorkflowContext that responds to +#thread_id+

  • messages (Array) (defaults to: [])

    prior conversation history

  • config (Hash) (defaults to: {})

    invocation config (forwarded to +_invoke_impl+)

Yields:

  • (Hash)

    result hash +{ output:, messages:, usage: }+ — called from the agent IO thread before +:child_completed+ is posted

Returns:

  • (nil)

    the caller must not wait on any return value; the result arrives as a +:child_completed+ event

Raises:

  • (Phronomy::Error)

    when EventLoop mode is disabled


# File 'lib/phronomy/agent/base.rb', line 584

def run_as_child(input, ctx:, messages: [], config: {}, &result_writer)
  unless Phronomy.configuration.event_loop
    raise Phronomy::Error,
      "run_as_child requires EventLoop mode. " \
      "Enable with: Phronomy.configure { |c| c.event_loop = true }"
  end

  fsm = Agent::FSM.new(
    agent: self,
    input: input,
    messages: messages,
    thread_id: "#{ctx.thread_id}_agent_#{SecureRandom.uuid}",
    config: config,
    parent_id: ctx.thread_id,
    result_writer: result_writer
  )
  Phronomy::EventLoop.instance.enqueue_child(fsm)
  nil
end
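
The ordering guarantee described above (write the result first, then post the event through a queue) can be demonstrated with plain threads. The sketch below does not use Phronomy: `SketchContext`, the `events` queue, and the payload contents are hypothetical stand-ins for the WorkflowContext, the event dispatch, and the agent result.

```ruby
# Hypothetical stand-in for a WorkflowContext with one writable field.
SketchContext = Struct.new(:answer)

ctx = SketchContext.new
events = Thread::Queue.new

# Stand-in for the agent IO thread.
worker = Thread.new do
  result = {output: "42"}
  ctx.answer = result[:output]          # result writer runs first
  events << [:child_completed, result]  # then the event is posted
end

# Stand-in for the consumer: pop blocks until the event arrives, and the
# write to ctx.answer made before the push is visible after the pop.
event, payload = events.pop
worker.join

puts "#{event}: #{ctx.answer}"
```

Because the write happens before the push and the read happens after the pop, the consumer never observes the event without the written result, which is why no extra lock is needed for this hand-off.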

#stream(input, messages: [], thread_id: nil, config: {}) {|Phronomy::Agent::StreamEvent| ... } ⇒ Hash

Streaming version of #invoke. Yields StreamEvent objects as they are produced by the underlying LLM.

Events emitted (in order):

  • :token - each content delta from the LLM
  • :tool_call - when the LLM requests a tool (ReactAgent subclasses only)
  • :tool_result - after a tool completes (ReactAgent subclasses only)
  • :done - final event carrying output, messages, and usage
  • :error - if an unrecoverable error occurs

Parameters:

  • input (String, Hash)

    same as #invoke

  • messages (Array<RubyLLM::Message>) (defaults to: [])

    same as #invoke

  • thread_id (String, nil) (defaults to: nil)

    same as #invoke

  • config (Hash) (defaults to: {})

    same as #invoke

Yields:

  • (Phronomy::Agent::StreamEvent)

    each event as it is produced

Returns:

  • (Hash)

    { output:, messages:, usage: } — same as #invoke



# File 'lib/phronomy/agent/base.rb', line 621

def stream(input, messages: [], thread_id: nil, config: {}, &block)
  return invoke(input, messages: messages, thread_id: thread_id, config: config) unless block

  _stream_impl(input, messages: messages, thread_id: thread_id, config: config, &block)
rescue => e
  block&.call(StreamEvent.new(type: :error, payload: {error: e}))
  raise
end
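
A consumer block typically dispatches on the event type. The sketch below shows one way to do that without depending on Phronomy. `StreamEvent` here is a local Struct stand-in with the `type` and `payload` fields seen in the listing above, `fake_stream` is a hypothetical event source, and the payload keys `:content` and `:output` are assumptions (only `:error` appears on this page).

```ruby
# Local stand-in for Phronomy::Agent::StreamEvent (type and payload only).
StreamEvent = Struct.new(:type, :payload, keyword_init: true)

# Hypothetical event source standing in for Agent#stream.
def fake_stream
  yield StreamEvent.new(type: :token, payload: {content: "Hel"})
  yield StreamEvent.new(type: :token, payload: {content: "lo"})
  yield StreamEvent.new(type: :done,  payload: {output: "Hello"})
  {output: "Hello"}
end

buffer = +""  # mutable String that accumulates token deltas
final = nil

result = fake_stream do |event|
  case event.type
  when :token
    buffer << event.payload[:content]
  when :tool_call, :tool_result
    nil  # ignored in this sketch
  when :done
    final = event.payload[:output]
  when :error
    warn "stream failed: #{event.payload[:error].message}"
  end
end

puts buffer
puts final
```

As the listing above shows, #stream yields the :error event and then re-raises, so a caller that needs to recover still has to rescue around the call.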