Class: Rubino::LLM::BedrockBearerClient

Inherits: Object

Defined in: lib/rubino/llm/bedrock_bearer_client.rb

Overview

Direct Bedrock runtime client that authenticates with a Bearer token. It is used when BEDROCK_API_KEY is set without BEDROCK_SECRET_KEY. It calls the Bedrock Converse API with an Authorization: Bearer header and supports tool calls via the native Bedrock Converse toolConfig format.
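To make the request shape concrete, here is a minimal sketch of a Converse request carrying a Bearer token and a toolConfig entry. It only builds the request and never sends it. The helper name `build_converse_request`, the `get_weather` tool, the model ID, and the key are all hypothetical; how Rubino's own `build_body` and `post` assemble the request is not shown in this page, so treat this as an illustration of the wire format rather than the class's implementation.

```ruby
require "net/http"
require "json"
require "uri"

# Hypothetical helper (not part of Rubino): builds, but does not send,
# a Converse request authenticated with a Bearer token.
def build_converse_request(api_key:, region:, model_id:, text:, tools: nil)
  host = format("bedrock-runtime.%s.amazonaws.com", region)
  # The documented class uses URI.encode_uri_component; the form-component
  # encoder is used here only so the sketch runs on Rubies that lack it.
  uri = URI("https://#{host}/model/#{URI.encode_www_form_component(model_id)}/converse")

  body = { messages: [{ role: "user", content: [{ text: text }] }] }
  body[:toolConfig] = { tools: tools } if tools

  request = Net::HTTP::Post.new(uri)
  request["Authorization"] = "Bearer #{api_key}"
  request["Content-Type"]  = "application/json"
  request.body = JSON.generate(body)
  request
end

# Hypothetical tool definition in the Converse toolSpec shape.
weather_tool = {
  toolSpec: {
    name: "get_weather",
    description: "Look up the weather for a city",
    inputSchema: {
      json: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"]
      }
    }
  }
}

request = build_converse_request(
  api_key:  "example-key",        # placeholder, not a real credential
  region:   "us-east-1",
  model_id: "example.model-v1",   # placeholder model ID
  text:     "What is the weather in Oslo?",
  tools:    [weather_tool]
)

puts request.path
puts request["Authorization"]
```

Because the token travels in a plain header, no SigV4 signing step is involved, which is why a bare Net::HTTP transport is sufficient here.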

Constant Summary

BEDROCK_RUNTIME_HOST = "bedrock-runtime.%s.amazonaws.com"

Instance Method Summary

#chat(messages:, tools: nil) ⇒ Object

Sends a non-streaming chat request and returns an AdapterResponse.
#initialize(api_key:, region:, model_id:, show_reasoning: false, event_bus: nil) ⇒ BedrockBearerClient constructor

A new instance of BedrockBearerClient.
#stream(messages:, tools: nil, &block) ⇒ Object

Sends a “streaming” chat request and returns an AdapterResponse, yielding chunk HASHES shaped exactly like those of every other adapter: { type: :content | :thinking, text: String, message_id: Integer }.

Constructor Details

#initialize(api_key:, region:, model_id:, show_reasoning: false, event_bus: nil) ⇒ `BedrockBearerClient`

Returns a new instance of BedrockBearerClient.

# File 'lib/rubino/llm/bedrock_bearer_client.rb', line 17

def initialize(api_key:, region:, model_id:, show_reasoning: false, event_bus: nil)
  @api_key        = api_key
  @region         = region
  @model_id       = model_id
  @host           = BEDROCK_RUNTIME_HOST % region
  @show_reasoning = show_reasoning
  @event_bus      = event_bus
end
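The constructor derives the endpoint host by interpolating the region into BEDROCK_RUNTIME_HOST with String#%. The sketch below shows that interpolation in isolation. The `runtime_host` guard is a hypothetical addition: the constructor shown above performs no validation, and a nil region would format to an empty string and yield a malformed host.

```ruby
BEDROCK_RUNTIME_HOST = "bedrock-runtime.%s.amazonaws.com"

# Same interpolation the constructor performs.
host = BEDROCK_RUNTIME_HOST % "eu-west-1"

# A nil region is formatted as an empty string, so the host is malformed
# but no error is raised.
broken = BEDROCK_RUNTIME_HOST % nil

# Hypothetical guard (not in the documented class) that rejects blank regions.
def runtime_host(region)
  raise ArgumentError, "region must not be blank" if region.nil? || region.to_s.strip.empty?

  BEDROCK_RUNTIME_HOST % region
end

puts host
puts broken
```

Validating the region at construction time would surface a misconfiguration before the first request rather than as a DNS or connection failure.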

Instance Method Details

#chat(messages:, tools: nil) ⇒ `Object`

Sends a non-streaming chat request and returns an AdapterResponse.

# File 'lib/rubino/llm/bedrock_bearer_client.rb', line 27

def chat(messages:, tools: nil)
  body     = build_body(messages, tools: tools)
  response = post("/model/#{URI.encode_uri_component(@model_id)}/converse", body)
  parse_response(response)
end
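The model ID is percent-encoded before it is placed in the path because Bedrock model identifiers commonly contain characters such as `:` and, for ARNs, `/`. A minimal sketch of that path construction follows. The helper name `converse_path` is hypothetical, and the fallback to URI.encode_www_form_component is an assumption added so the example also runs on Rubies without URI.encode_uri_component; the two differ only in how they encode spaces.

```ruby
require "uri"

# Hypothetical helper mirroring the path built inside #chat and #stream.
def converse_path(model_id)
  encoded =
    if URI.respond_to?(:encode_uri_component)
      URI.encode_uri_component(model_id)
    else
      # Fallback for older Rubies; identical for IDs without spaces.
      URI.encode_www_form_component(model_id)
    end
  "/model/#{encoded}/converse"
end

# Example model ID containing a colon.
puts converse_path("anthropic.claude-3-sonnet-20240229-v1:0")
# A slash inside the ID must not be read as a path separator.
puts converse_path("vendor/model")
```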

#stream(messages:, tools: nil, &block) ⇒ `Object`

Sends a “streaming” chat request and returns an AdapterResponse, yielding chunk HASHES shaped exactly like those of every other adapter:

{ type: :content | :thinking, text: String, message_id: Integer }

Real Bedrock ConverseStream (binary eventstream) is out of scope: bearer-token auth isn’t supported by ruby_llm’s SigV4 Bedrock provider, and this is a plain Net::HTTP transport. We buffer the non-streaming /converse response FULLY, then replay it through InlineThinkFilter in five-character slices so the SHAPE matches the streaming contract (typed deltas, :thinking channel, a single content block id, an explicit MESSAGE_COMPLETED boundary). Only the token cadence is synthetic.

INVARIANT: we buffer the entire response BEFORE the first emit. That is what makes retrying this call (now in Agent::ModelCallRunner) safe: a transport error can only fire during post(), before any chunk has reached the UI, and never mid-replay, so a retry can’t double output.
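The invariant can be illustrated with a small, self-contained sketch. Everything here is a hypothetical stand-in: `fetch` plays the role of post(), `sink` plays the role of the UI, and the retry loop is inlined for brevity even though the real retry lives in Agent::ModelCallRunner. The point is only that a failure during the fetch leaves the sink untouched, so retrying cannot duplicate output.

```ruby
# Hypothetical sketch of buffer-then-replay with a retry around the fetch.
def buffered_stream(fetch:, sink:, slice_size: 5, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    text = fetch.call # may raise; nothing has been emitted yet
  rescue IOError
    retry if attempts < max_attempts
    raise
  end

  # Replay starts only once the full response is in hand.
  text.chars.each_slice(slice_size) { |slice| sink << slice.join }
  attempts
end

calls = 0
flaky_fetch = lambda do
  calls += 1
  raise IOError, "simulated transport failure" if calls == 1

  "Hello, Bedrock!"
end

received = []
attempts = buffered_stream(fetch: flaky_fetch, sink: received)

p attempts
p received
```

If the response were emitted incrementally instead, a failure midway would leave partial text in the sink, and a retry would replay the beginning a second time.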

# File 'lib/rubino/llm/bedrock_bearer_client.rb', line 49

def stream(messages:, tools: nil, &block)
  body = build_body(messages, tools: tools)
  data = post("/model/#{URI.encode_uri_component(@model_id)}/converse", body)

  # Single buffered content block ⇒ message_id is always 0. Mirrors the
  # 2-arg emit lambda RubyLLMAdapter feeds into InlineThinkFilter.feed/flush.
  emit = lambda do |type, text|
    return if text.nil? || text.empty?
    return if type == :thinking && !@show_reasoning

    block&.call({ type: type, text: text, message_id: 0 })
  end

  think_filter = InlineThinkFilter.new
  extract_text(data).chars.each_slice(5) do |slice|
    think_filter.feed(slice.join, &emit)
  end
  think_filter.flush(&emit)

  @event_bus&.emit(Interaction::Events::MESSAGE_COMPLETED, message_id: 0)

  parse_response(data)
end
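The emit lambda's filtering rules can be exercised on their own. In the sketch below, `replay` is a hypothetical stand-in for InlineThinkFilter#feed, whose real behavior is not documented here; it simply yields (type, text) pairs. Because `emit` is a lambda, each `return` exits only the lambda, even when it is passed as a block with `&emit`.

```ruby
show_reasoning = false
chunks = []
block = ->(chunk) { chunks << chunk }

# Same filtering rules as the emit lambda in #stream.
emit = lambda do |type, text|
  return if text.nil? || text.empty?             # drop empty deltas
  return if type == :thinking && !show_reasoning # hide reasoning unless enabled

  block&.call({ type: type, text: text, message_id: 0 })
end

# Hypothetical stand-in for InlineThinkFilter#feed: yields typed deltas.
def replay(pairs)
  pairs.each { |type, text| yield type, text }
end

replay(
  [[:thinking, "considering options"], [:content, ""], [:content, "Hello"], [:content, nil]],
  &emit
)

p chunks
```

With `show_reasoning` set to true, the :thinking delta would be forwarded as well, and the consumer could route it to a separate channel by checking `chunk[:type]`.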