Module: RubyLLM::Providers::Bedrock::Embeddings
- Included in:
- RubyLLM::Providers::Bedrock
- Defined in:
- lib/legion/llm/call/bedrock_embeddings.rb
Overview
Embeddings methods for AWS Bedrock via InvokeModel.
Public methods are instance methods (not `module_function`) so that the `include Embeddings` at the end of the class body properly overrides `Provider#embed` via Ruby’s method-resolution order.
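The override described above can be illustrated with a minimal, self-contained sketch. The class and module names (`BaseProvider`, `InstanceEmbeddings`, `FunctionEmbeddings`) are hypothetical stand-ins, not the library's classes; the sketch only shows how `include` and `module_function` interact with method lookup.

```ruby
# Hypothetical stand-in for the parent provider class.
class BaseProvider
  def embed(text)
    "base:#{text}"
  end
end

# Plain instance method: stays public when included.
module InstanceEmbeddings
  def embed(text)
    "module:#{text}"
  end
end

# module_function makes the instance-level copy private and adds a
# public module-level copy (FunctionEmbeddings.embed).
module FunctionEmbeddings
  module_function

  def embed(text)
    "function:#{text}"
  end
end

class WithInstance < BaseProvider
  include InstanceEmbeddings
end

class WithFunction < BaseProvider
  include FunctionEmbeddings
end

# The included module sits between the class and its superclass,
# so its #embed is found before BaseProvider#embed.
p WithInstance.ancestors.take(3)
puts WithInstance.new.embed('hi')

begin
  # Lookup finds the private copy first, so an external call fails.
  WithFunction.new.embed('hi')
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end
```

With `module_function`, the included method would still shadow the parent's, but as a private method, so callers invoking `embed` on the provider from outside would get a `NoMethodError` rather than the Bedrock-specific behavior.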
Constant Summary
- TITAN_V2_PREFIX = 'amazon.titan-embed-text-v2'
- TITAN_V1_PREFIX = 'amazon.titan-embed-text-v1'
- COHERE_PREFIX = 'cohere.embed'
- TITAN_ALLOWED_DIMENSIONS = [256, 512, 1024].freeze
- TITAN_MAX_INPUT_BYTES = 45_000
  ~8k tokens; Titan rejects larger inputs with a 400 (and still bills).
- COHERE_MAX_INPUT_BYTES = 8_192
  Cohere Embed v3 per-text byte budget.
- COHERE_MAX_TEXTS = 96
  Cohere Embed v3 batch limit.
- MODEL_ID_PATTERN = /\A[a-zA-Z0-9.\-:]+\z/
  Bedrock model IDs use only alphanumerics, `.`, `-`, and `:` (e.g. `amazon.titan-embed-text-v2:0`, `cohere.embed-english-v3`, `us.anthropic.claude-sonnet-4-6-v1`). Slashes are rejected to block path injection into the `/model/<id>/invoke` URL, which also rules out `../` sequences. Note that a bare `..` still matches, because `.` is an allowed character.
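How the byte, batch, and dimension limits above are enforced is not shown on this page. As a rough sketch, a pre-flight check against them might look like the following. `check_titan_input!` and `check_cohere_input!` are hypothetical helper names, and `ArgumentError` stands in for whatever error class the provider actually raises.

```ruby
TITAN_ALLOWED_DIMENSIONS = [256, 512, 1024].freeze
TITAN_MAX_INPUT_BYTES = 45_000
COHERE_MAX_INPUT_BYTES = 8_192
COHERE_MAX_TEXTS = 96

# Hypothetical helper: validate a single Titan input before sending it.
def check_titan_input!(text, dimensions: nil)
  if dimensions && !TITAN_ALLOWED_DIMENSIONS.include?(dimensions)
    raise ArgumentError, "dimensions must be one of #{TITAN_ALLOWED_DIMENSIONS.inspect}"
  end
  # bytesize, not length: multi-byte UTF-8 characters count per byte.
  raise ArgumentError, 'text exceeds the Titan byte limit' if text.bytesize > TITAN_MAX_INPUT_BYTES

  true
end

# Hypothetical helper: validate a Cohere batch before sending it.
def check_cohere_input!(texts)
  texts = Array(texts)
  raise ArgumentError, 'too many texts for one Cohere batch' if texts.length > COHERE_MAX_TEXTS
  if texts.any? { |t| t.bytesize > COHERE_MAX_INPUT_BYTES }
    raise ArgumentError, 'a text exceeds the Cohere per-text byte budget'
  end

  true
end
```

Checking `bytesize` rather than `length` matters for non-ASCII input, where a string can be well under the character count yet over the byte budget.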
Instance Method Summary
- #embed(text, model:, dimensions:) ⇒ RubyLLM::Embedding
  Override the base `embed` method so signing headers are applied.
- #embedding_url(model:) ⇒ String
  InvokeModel URL path.
- #parse_embedding_response(response, model:, text:) ⇒ RubyLLM::Embedding
- #render_embedding_payload(text, model:, dimensions:) ⇒ Hash
  JSON-serializable request payload.
Instance Method Details
#embed(text, model:, dimensions:) ⇒ RubyLLM::Embedding
Override the base `embed` method so signing headers are applied.
The parent `Provider#embed` calls `@connection.post(url, payload)` directly, which would skip both bearer-token and SigV4 auth for Bedrock. We go through `invoke_embedding`, which mirrors `signed_post` but parses responses with `parse_embedding_response` (not `parse_completion_response`).
Titan accepts a single text per invocation. When an Array is passed to a Titan model, we iterate via `embed_titan_batch`, which traps per-element failures so that one 429 mid-batch does not lose the preceding successes.
# File 'lib/legion/llm/call/bedrock_embeddings.rb', line 143

def embed(text, model:, dimensions:)
  return embed_titan_batch(text, model: model, dimensions: dimensions) \
    if text.is_a?(Array) && !model.to_s.start_with?(COHERE_PREFIX)

  payload = render_embedding_payload(text, model: model, dimensions: dimensions)
  url = embedding_url(model: model)
  response = invoke_embedding(url, payload)
  parse_embedding_response(response, model: model, text: text)
end
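`embed_titan_batch` itself is not shown on this page. The sketch below illustrates the general idea of trapping per-element failures so that earlier successes survive. The method name `embed_each`, the result shape, and the block-based embedder are all assumptions made for illustration.

```ruby
# Hypothetical sketch: embed texts one at a time, recording a failure
# per element instead of aborting the whole batch.
def embed_each(texts)
  texts.map do |text|
    { text: text, vector: yield(text), error: nil }
  rescue StandardError => e
    { text: text, vector: nil, error: e }
  end
end

# Fake embedder that fails on one input, standing in for a mid-batch 429.
results = embed_each(%w[alpha beta gamma]) do |text|
  raise 'throttled' if text == 'beta'

  [text.length.to_f]
end

results.each do |r|
  status = r[:error] ? "failed (#{r[:error].message})" : r[:vector].inspect
  puts "#{r[:text]}: #{status}"
end
```

Returning one record per input keeps the positions aligned with the original array, so a caller could retry only the failed elements.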
#embedding_url(model:) ⇒ String
Returns the InvokeModel URL path.
# File 'lib/legion/llm/call/bedrock_embeddings.rb', line 69

def embedding_url(model:)
  raise RubyLLM::Error.new(nil, "Invalid Bedrock model id: #{model.inspect}") \
    unless model.to_s.match?(MODEL_ID_PATTERN)

  "/model/#{model}/invoke"
end
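To see which IDs pass the guard, here is a standalone sketch of the same check. It copies `MODEL_ID_PATTERN` from the constant summary and substitutes `ArgumentError` for `RubyLLM::Error` (an assumption, so the snippet runs without the gem).

```ruby
MODEL_ID_PATTERN = /\A[a-zA-Z0-9.\-:]+\z/

# Standalone copy of the guard; ArgumentError replaces RubyLLM::Error
# purely so this sketch has no dependencies.
def embedding_url(model:)
  raise ArgumentError, "Invalid Bedrock model id: #{model.inspect}" \
    unless model.to_s.match?(MODEL_ID_PATTERN)

  "/model/#{model}/invoke"
end

puts embedding_url(model: 'amazon.titan-embed-text-v2:0')

['cohere.embed-english-v3', 'foo/bar', '../secrets', ''].each do |id|
  puts "#{id.inspect} valid? #{id.match?(MODEL_ID_PATTERN)}"
end
```

Anchoring with `\A` and `\z` (rather than `^` and `$`) means an ID with an embedded newline cannot slip through by matching only one of its lines.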
#parse_embedding_response(response, model:, text:) ⇒ RubyLLM::Embedding
# File 'lib/legion/llm/call/bedrock_embeddings.rb', line 105

def parse_embedding_response(response, model:, text:)
  body = response.body
  body = try_parse_json(body) if body.is_a?(String)

  vectors = if model.to_s.start_with?(COHERE_PREFIX)
              Array(body['embeddings'])
            else
              # Titan single-text response: the single vector lives in 'embedding'.
              # Batch callers are handled in `embed` via iteration.
              [body['embedding']].compact
            end

  raise RubyLLM::Error.new(response, "Empty embedding response for model #{model}") if vectors.empty?

  vectors = vectors.first if vectors.length == 1 && !text.is_a?(Array)

  input_tokens = body['inputTextTokenCount'] ||
                 body.dig('meta', 'billed_units', 'input_tokens') ||
                 0

  RubyLLM::Embedding.new(vectors: vectors, model: model, input_tokens: input_tokens)
end
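The extraction logic can be exercised in isolation with hand-written bodies. In this sketch, `extract_vectors` is a hypothetical standalone function, `JSON.parse` replaces `try_parse_json`, `ArgumentError` replaces `RubyLLM::Error`, and the sample bodies are illustrative shapes inferred from the keys the method reads, not captured Bedrock responses.

```ruby
require 'json'

COHERE_PREFIX = 'cohere.embed'

# Hypothetical standalone version: returns [vectors, input_tokens].
def extract_vectors(body, model:, text:)
  body = JSON.parse(body) if body.is_a?(String)
  vectors = if model.to_s.start_with?(COHERE_PREFIX)
              Array(body['embeddings'])
            else
              [body['embedding']].compact
            end
  raise ArgumentError, "Empty embedding response for model #{model}" if vectors.empty?

  # A single-string input gets a flat vector, not a one-element list.
  vectors = vectors.first if vectors.length == 1 && !text.is_a?(Array)
  tokens = body['inputTextTokenCount'] ||
           body.dig('meta', 'billed_units', 'input_tokens') ||
           0
  [vectors, tokens]
end

# Illustrative sample bodies (assumed shapes).
titan_body = '{"embedding": [0.1, 0.2], "inputTextTokenCount": 3}'
cohere_body = {
  'embeddings' => [[0.1], [0.2]],
  'meta' => { 'billed_units' => { 'input_tokens' => 7 } }
}

p extract_vectors(titan_body, model: 'amazon.titan-embed-text-v2:0', text: 'hi')
p extract_vectors(cohere_body, model: 'cohere.embed-english-v3', text: %w[a b])
```

The `text.is_a?(Array)` check is what keeps the return shape predictable: an Array input always yields a list of vectors, even when it has one element.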
#render_embedding_payload(text, model:, dimensions:) ⇒ Hash
Returns a JSON-serializable request payload.
# File 'lib/legion/llm/call/bedrock_embeddings.rb', line 81

def render_embedding_payload(text, model:, dimensions:)
  model_str = model.to_s

  if model_str.start_with?(TITAN_V2_PREFIX)
    titan_v2_payload(text, dimensions: dimensions)
  elsif model_str.start_with?(TITAN_V1_PREFIX)
    titan_v1_payload(text)
  elsif model_str.start_with?(COHERE_PREFIX)
    cohere_payload(text)
  else
    raise RubyLLM::Error.new(
      nil,
      "Bedrock model '#{model}' is not supported for embeddings. " \
      'Supported prefixes: amazon.titan-embed-text-v1, ' \
      'amazon.titan-embed-text-v2, cohere.embed-*.'
    )
  end
end
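The `titan_v2_payload`, `titan_v1_payload`, and `cohere_payload` helpers are not shown on this page. The sketch below illustrates the prefix dispatch with stand-in payloads. The field names (`inputText`, `dimensions`, `texts`, `input_type`) reflect my understanding of the public Bedrock request formats and should be treated as assumptions, as should the `sketch_payload` name and the use of `ArgumentError`.

```ruby
TITAN_V2_PREFIX = 'amazon.titan-embed-text-v2'
TITAN_V1_PREFIX = 'amazon.titan-embed-text-v1'
COHERE_PREFIX = 'cohere.embed'

# Hypothetical dispatch sketch; payload shapes are assumptions,
# not the output of the library's private helpers.
def sketch_payload(text, model:, dimensions: nil)
  model_str = model.to_s

  if model_str.start_with?(TITAN_V2_PREFIX)
    payload = { inputText: text }
    payload[:dimensions] = dimensions if dimensions
    payload
  elsif model_str.start_with?(TITAN_V1_PREFIX)
    # Assumed: Titan v1 has no dimensions option, so it is ignored here.
    { inputText: text }
  elsif model_str.start_with?(COHERE_PREFIX)
    { texts: Array(text), input_type: 'search_document' }
  else
    raise ArgumentError, "Bedrock model '#{model}' is not supported for embeddings."
  end
end

p sketch_payload('hello', model: 'amazon.titan-embed-text-v2:0', dimensions: 512)
p sketch_payload(%w[a b], model: 'cohere.embed-english-v3')
```

Matching on prefixes rather than exact IDs lets versioned IDs such as `amazon.titan-embed-text-v2:0` route to the right payload builder without listing every suffix.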