Module: RubyLLM::Providers::Bedrock::Embeddings

Included in:
RubyLLM::Providers::Bedrock
Defined in:
lib/legion/llm/call/bedrock_embeddings.rb

Overview

Embeddings methods for AWS Bedrock via InvokeModel.

Public methods are instance methods (not `module_function`) so the `include Embeddings` at the end of the class body properly overrides `Provider#embed` via Ruby's method-resolution order.
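The method-resolution behavior above can be sketched with toy classes (the names here are illustrative stand-ins, not the real RubyLLM classes):

```ruby
# A module included in a subclass sits between the subclass and its
# superclass in the ancestor chain, so its instance methods shadow the
# superclass's. `module_function` methods would become private copies
# and would NOT participate in this lookup.
class Provider
  def embed
    'provider'
  end
end

module Embeddings
  def embed
    'embeddings'
  end
end

class Bedrock < Provider
  include Embeddings # Embeddings now precedes Provider in the MRO
end

Bedrock.new.embed # => "embeddings" -- the module wins over the superclass
Bedrock.ancestors.first(3) # => [Bedrock, Embeddings, Provider]
```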

Constant Summary collapse

TITAN_V2_PREFIX =
'amazon.titan-embed-text-v2'
TITAN_V1_PREFIX =
'amazon.titan-embed-text-v1'
COHERE_PREFIX =
'cohere.embed'
TITAN_ALLOWED_DIMENSIONS =
[256, 512, 1024].freeze
TITAN_MAX_INPUT_BYTES =

Roughly 8k tokens; Titan rejects larger inputs with a 400 (and still bills for the request)

45_000
COHERE_MAX_INPUT_BYTES =

Cohere Embed v3 per-text byte budget

8_192
COHERE_MAX_TEXTS =

Cohere Embed v3 batch limit

96
MODEL_ID_PATTERN =

Bedrock model IDs use only alphanumerics, `.`, `-`, and `:` (e.g. `amazon.titan-embed-text-v2:0`, `cohere.embed-english-v3`, `us.anthropic.claude-sonnet-4-6-v1`). Slashes and `..` are rejected to block path injection into the `/model/<id>/invoke` URL.

/\A[a-zA-Z0-9.\-:]+\z/
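A quick check of the pattern against illustrative model-id strings:

```ruby
MODEL_ID_PATTERN = /\A[a-zA-Z0-9.\-:]+\z/

# Valid ids: alphanumerics plus '.', '-', ':'.
'amazon.titan-embed-text-v2:0'.match?(MODEL_ID_PATTERN) # => true

# '/' is outside the character class, so traversal attempts fail.
'../model/evil'.match?(MODEL_ID_PATTERN)                # => false

# \A..\z anchor the whole string, so a newline cannot smuggle a second
# "line" past the check the way ^..$ would allow.
"titan\n../x".match?(MODEL_ID_PATTERN)                  # => false
```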

Instance Method Summary collapse

Instance Method Details

#embed(text, model:, dimensions:) ⇒ RubyLLM::Embedding

Overrides the base `embed` method so signing headers are applied.

The parent `Provider#embed` calls `@connection.post(url, payload)` directly, which would skip both bearer-token and SigV4 auth for Bedrock. We go through `invoke_embedding`, which mirrors `signed_post` but parses responses with `parse_embedding_response` (not `parse_completion_response`).

Titan accepts a single text per invocation. When an Array is passed to a Titan model, we iterate via `embed_titan_batch`, which traps per-element failures so one 429 mid-batch does not lose preceding successes.

Parameters:

  • text (String, Array<String>)
  • model (String)
  • dimensions (Integer, nil)

Returns:

  • (RubyLLM::Embedding)


143
144
145
146
147
148
149
150
151
# File 'lib/legion/llm/call/bedrock_embeddings.rb', line 143

def embed(text, model:, dimensions:)
  return embed_titan_batch(text, model: model, dimensions: dimensions) \
    if text.is_a?(Array) && !model.to_s.start_with?(COHERE_PREFIX)

  payload  = render_embedding_payload(text, model: model, dimensions: dimensions)
  url      = embedding_url(model: model)
  response = invoke_embedding(url, payload)
  parse_embedding_response(response, model: model, text: text)
end
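The guard on the first line of `embed` decides which path an input takes. A sketch of that dispatch rule, extracted into a hypothetical `batch_path?` helper for illustration:

```ruby
COHERE_PREFIX = 'cohere.embed'

# Arrays go through the per-text Titan loop unless the model is Cohere,
# which accepts a native batch in a single InvokeModel call.
def batch_path?(text, model)
  text.is_a?(Array) && !model.to_s.start_with?(COHERE_PREFIX)
end

batch_path?(%w[a b], 'amazon.titan-embed-text-v2:0') # => true  (per-text loop)
batch_path?(%w[a b], 'cohere.embed-english-v3')      # => false (native batch)
batch_path?('solo',  'amazon.titan-embed-text-v2:0') # => false (single invoke)
```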

#embedding_url(model:) ⇒ String

Returns InvokeModel URL path.

Parameters:

  • model (String, Symbol)

    Bedrock model id

Returns:

  • (String)

    InvokeModel URL path

Raises:

  • (RubyLLM::Error)

    if model id contains unsafe characters



69
70
71
72
73
74
# File 'lib/legion/llm/call/bedrock_embeddings.rb', line 69

def embedding_url(model:)
  raise RubyLLM::Error.new(nil, "Invalid Bedrock model id: #{model.inspect}") \
    unless model.to_s.match?(MODEL_ID_PATTERN)

  "/model/#{model}/invoke"
end

#parse_embedding_response(response, model:, text:) ⇒ RubyLLM::Embedding

Parameters:

  • response (Faraday::Response)
  • model (String)
  • text (String, Array<String>)

    original input (used for shape decisions)

Returns:

  • (RubyLLM::Embedding)

Raises:

  • (RubyLLM::Error)

    if the response carried no vector



105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
# File 'lib/legion/llm/call/bedrock_embeddings.rb', line 105

def parse_embedding_response(response, model:, text:)
  body = response.body
  body = try_parse_json(body) if body.is_a?(String)

  vectors =
    if model.to_s.start_with?(COHERE_PREFIX)
      Array(body['embeddings'])
    else
      # Titan single-text response: the single vector lives in :embedding.
      # Batch callers are handled in `embed` via iteration.
      [body['embedding']].compact
    end

  raise RubyLLM::Error.new(response, "Empty embedding response for model #{model}") if vectors.empty?

  vectors = vectors.first if vectors.length == 1 && !text.is_a?(Array)
  input_tokens = body['inputTextTokenCount'] ||
                 body.dig('meta', 'billed_units', 'input_tokens') ||
                 0

  RubyLLM::Embedding.new(vectors: vectors, model: model, input_tokens: input_tokens)
end
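The two body shapes this parser distinguishes can be illustrated with hand-built hashes (these mimic the fields the code reads; they are not captured Bedrock responses):

```ruby
# Titan returns one vector under 'embedding' plus a token count;
# Cohere returns one vector per input text under 'embeddings' and
# reports billing via meta.billed_units.input_tokens.
titan  = { 'embedding' => [0.1, 0.2], 'inputTextTokenCount' => 3 }
cohere = { 'embeddings' => [[0.1], [0.2]],
           'meta' => { 'billed_units' => { 'input_tokens' => 5 } } }

[titan['embedding']].compact                       # => [[0.1, 0.2]]
Array(cohere['embeddings'])                        # => [[0.1], [0.2]]
cohere.dig('meta', 'billed_units', 'input_tokens') # => 5
```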

#render_embedding_payload(text, model:, dimensions:) ⇒ Hash

Returns JSON-serializable request payload.

Parameters:

  • text (String, Array<String>)
  • model (String)

    Bedrock embedding model id

  • dimensions (Integer, nil)

Titan v2 only; one of 256, 512, 1024

Returns:

  • (Hash)

    JSON-serializable request payload

Raises:

  • (RubyLLM::Error)

    on unsupported model, oversize input, or invalid dimensions



81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
# File 'lib/legion/llm/call/bedrock_embeddings.rb', line 81

def render_embedding_payload(text, model:, dimensions:)
  model_str = model.to_s

  if model_str.start_with?(TITAN_V2_PREFIX)
    titan_v2_payload(text, dimensions: dimensions)
  elsif model_str.start_with?(TITAN_V1_PREFIX)
    titan_v1_payload(text)
  elsif model_str.start_with?(COHERE_PREFIX)
    cohere_payload(text)
  else
    raise RubyLLM::Error.new(
      nil,
      "Bedrock model '#{model}' is not supported for embeddings. " \
      'Supported prefixes: amazon.titan-embed-text-v1, ' \
      'amazon.titan-embed-text-v2, cohere.embed-*.'
    )
  end
end
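The private payload helpers dispatched to above are not shown in this file view. A plausible sketch, assuming the Titan v2 `inputText`/`dimensions` and Cohere `texts`/`input_type` request fields documented for those Bedrock models (validation of byte budgets and allowed dimensions is omitted here):

```ruby
# Hypothetical helper bodies -- field names per the Bedrock request
# schemas for Titan Embed v2 and Cohere Embed v3.
def titan_v2_payload(text, dimensions:)
  { inputText: text }.tap { |p| p[:dimensions] = dimensions if dimensions }
end

def cohere_payload(texts)
  { texts: Array(texts), input_type: 'search_document' }
end

titan_v2_payload('hi', dimensions: 512) # => {inputText: "hi", dimensions: 512}
cohere_payload('hi')                    # => {texts: ["hi"], input_type: "search_document"}
```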