Class: Parse::Embeddings::Cohere

Inherits:
Provider
  • Object
Defined in:
lib/parse/embeddings/cohere.rb

Overview

Cohere embeddings provider. Wraps `POST /v1/embed`.

Supported models:

  • v4: `embed-v4.0` (1536 native; Matryoshka widths 256, 512, 1024, 1536; 128k-token context). Unified text + image model at the network boundary. This provider exposes the text-input path only; image inputs will land in v5.1 alongside the Provider#embed_image hook.

  • v3: `embed-english-v3.0`, `embed-multilingual-v3.0` (both 1024-dim), `embed-english-light-v3.0`, `embed-multilingual-light-v3.0` (both 384-dim). Text-only.

Asymmetric input types

Cohere is one of the providers that does distinguish queries from documents at the wire level, via the `input_type` request field. Sending `input_type: "search_query"` for a query and `"search_document"` for a corpus item is required for good recall on Cohere’s v3 models: using the same type for both halves of a retrieval pair noticeably degrades nDCG, according to Cohere’s own benchmarks. `Provider#supports_input_type?` returns `true` here so that callers and cache-keying middleware can branch on it.

The accepted Symbol values map to the Cohere wire strings:

  • `:search_query` → `"search_query"`

  • `:search_document` → `"search_document"`

  • `:classification` → `"classification"`

  • `:clustering` → `"clustering"`
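
The mapping above can be sketched as a strict lookup. This is a standalone illustration, not the provider's own code: `SKETCH_INPUT_TYPES` and `wire_input_type_for` are hypothetical names, and the table is copied by hand from the list above.

```ruby
# Hypothetical copy of the symbol-to-wire-string table, for illustration only.
SKETCH_INPUT_TYPES = {
  search_query:    "search_query",
  search_document: "search_document",
  classification:  "classification",
  clustering:      "clustering",
}.freeze

# Translate a Symbol into its wire string. Unknown Symbols raise instead of
# falling back to a default, so a typo cannot silently change the request.
def wire_input_type_for(input_type)
  SKETCH_INPUT_TYPES.fetch(input_type) do
    raise ArgumentError,
          "input_type #{input_type.inspect} not in #{SKETCH_INPUT_TYPES.keys.inspect}"
  end
end

puts wire_input_type_for(:search_query)     # wire value for the query side
puts wire_input_type_for(:search_document)  # wire value for the corpus side
```

With the real provider, the same distinction is made by passing `input_type: :search_query` or `input_type: :search_document` to `#embed_text`.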

Security

  • The Faraday connection refuses `proxy:` unless the caller opts in via `allow_faraday_proxy: true`. Env-proxy autodiscovery (`HTTPS_PROXY` etc.) is suppressed by default, the same model as `Parse::Client` and OpenAI.

  • `#inspect` (inherited from Provider) never surfaces `@api_key`.

  • `Authorization` and `Cohere-Api-Key` are in Middleware::BodyBuilder::REDACTED_HEADERS.
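
To illustrate what header redaction amounts to, here is a minimal sketch. The actual Middleware::BodyBuilder implementation is not shown on this page, so everything here is an assumption: `SKETCH_REDACTED_HEADERS`, `redact_headers`, the `"[REDACTED]"` placeholder, and the case-insensitive name matching are all hypothetical.

```ruby
# Hypothetical list mirroring the two header names called out above.
SKETCH_REDACTED_HEADERS = %w[authorization cohere-api-key].freeze

# Return a copy of +headers+ that is safe to log: values of sensitive
# headers are replaced, all other entries pass through untouched.
# Matching on the downcased name is an assumption of this sketch.
def redact_headers(headers)
  headers.each_with_object({}) do |(name, value), safe|
    sensitive = SKETCH_REDACTED_HEADERS.include?(name.to_s.downcase)
    safe[name] = sensitive ? "[REDACTED]" : value
  end
end

loggable = redact_headers({
  "Authorization" => "Bearer secret-key",
  "Content-Type"  => "application/json",
})
puts loggable.inspect
```

The input hash is left unmodified, so the real credentials remain available to the request while only the logged copy is scrubbed.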

Examples:

registration

Parse::Embeddings.register(:cohere,
  Parse::Embeddings::Cohere.new(
    api_key: ENV.fetch("COHERE_API_KEY"),
    model:   "embed-english-v3.0",
  ))

Defined Under Namespace

Classes: AuthenticationError, BadRequestError, RateLimitError, TransientError

Constant Summary

DEFAULT_BASE_URL =
"https://api.cohere.com/v1"
DEFAULT_MODEL =
"embed-english-v3.0"
DEFAULT_TIMEOUT =
30
DEFAULT_OPEN_TIMEOUT =
5
DEFAULT_MAX_RETRIES =
3
DEFAULT_BATCH_SIZE =

Cohere documents a hard cap of 96 inputs per `/embed` call.

96
MAX_RESPONSE_BYTES =
16 * 1024 * 1024
MODEL_DEFAULT_DIMENSIONS =
{
  "embed-v4.0"                     => 1536,
  "embed-english-v3.0"             => 1024,
  "embed-multilingual-v3.0"        => 1024,
  "embed-english-light-v3.0"       => 384,
  "embed-multilingual-light-v3.0"  => 384,
}.freeze
MODEL_MAX_INPUT_TOKENS =
{
  "embed-v4.0"                     => 128_000,
  "embed-english-v3.0"             => 512,
  "embed-multilingual-v3.0"        => 512,
  "embed-english-light-v3.0"       => 512,
  "embed-multilingual-light-v3.0"  => 512,
}.freeze
MATRYOSHKA_MODELS =

Models that accept Cohere’s `output_dimension` Matryoshka truncation parameter. v4.0 is the only such row today; v3 models reject the field with a 400.

%w[embed-v4.0].freeze
MATRYOSHKA_WIDTHS =

Allowed Matryoshka widths per model (Cohere quantizes the available truncations rather than accepting any integer ≤ native). A model with an empty allowlist accepts any integer ≤ native; for v4.0, Cohere documents exactly these four widths.

{
  "embed-v4.0" => [256, 512, 1024, 1536].freeze,
}.freeze
INPUT_TYPE_WIRE_VALUES =

Map SDK-canonical input_type symbols to Cohere wire strings. Symbols outside this set raise ArgumentError: silently downgrading `:unknown_type` to `"search_document"` would mask cache-key bugs in higher layers (the value participates in cache keys).

{
  search_query:    "search_query",
  search_document: "search_document",
  classification:  "classification",
  clustering:      "clustering",
}.freeze
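
The interaction between MODEL_DEFAULT_DIMENSIONS, MATRYOSHKA_MODELS and MATRYOSHKA_WIDTHS can be sketched as a single resolution step. The provider's own `validate_dimensions!` is not shown on this page, so this is only one plausible reading: `resolve_dimensions` and the `SKETCH_*` tables are hypothetical, and the sketch assumes that requesting a model's native width is always accepted.

```ruby
# Hypothetical, trimmed-down copies of the constants documented above.
SKETCH_NATIVE_DIMENSIONS = {
  "embed-v4.0"         => 1536,
  "embed-english-v3.0" => 1024,
}.freeze
SKETCH_MATRYOSHKA_WIDTHS = {
  "embed-v4.0" => [256, 512, 1024, 1536].freeze,
}.freeze

# Resolve the effective output width for +model+.
# nil means "use the native width"; any other value must either equal the
# native width or appear in the model's allowlist of truncation widths.
def resolve_dimensions(model, requested)
  native = SKETCH_NATIVE_DIMENSIONS.fetch(model)
  return native if requested.nil? || requested == native

  allowed = SKETCH_MATRYOSHKA_WIDTHS[model]
  unless allowed&.include?(requested)
    raise ArgumentError, "#{model} does not support #{requested} dimensions"
  end
  requested
end

puts resolve_dimensions("embed-v4.0", 512)
puts resolve_dimensions("embed-english-v3.0", nil)
```

Rejecting unsupported widths at construction time keeps the failure local, rather than surfacing later as a 400 from the API.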

Constants inherited from Provider

Provider::AS_NOTIFICATION_NAME

Instance Method Summary

Methods inherited from Provider

#embed_image, #embed_text_batched, #inspect, #instrument_embed, #modalities, #validate_response!

Constructor Details

#initialize(api_key:, model: DEFAULT_MODEL, dimensions: nil, base_url: DEFAULT_BASE_URL, timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, max_retries: DEFAULT_MAX_RETRIES, embed_batch_size: DEFAULT_BATCH_SIZE, allow_faraday_proxy: false, allow_insecure_base_url: false, connection: nil) ⇒ Cohere

Returns a new instance of Cohere.

Parameters:

  • api_key (String)

    required. Sent as `Authorization: Bearer …`.

  • model (String) (defaults to: DEFAULT_MODEL)

    one of the keys of MODEL_DEFAULT_DIMENSIONS.

  • dimensions (Integer, nil) (defaults to: nil)

    output width. `nil` selects the model's native width from MODEL_DEFAULT_DIMENSIONS.

  • base_url (String) (defaults to: DEFAULT_BASE_URL)

    override. Must be HTTPS unless `allow_insecure_base_url: true`.

  • timeout (Integer) (defaults to: DEFAULT_TIMEOUT)

    read timeout, seconds.

  • open_timeout (Integer) (defaults to: DEFAULT_OPEN_TIMEOUT)

    connect timeout, seconds.

  • max_retries (Integer) (defaults to: DEFAULT_MAX_RETRIES)

    retry attempts on 429/5xx/timeouts.

  • embed_batch_size (Integer) (defaults to: DEFAULT_BATCH_SIZE)

    inputs per request (max 96).

  • allow_faraday_proxy (Boolean) (defaults to: false)

    opt in to proxy / env-proxy autodiscovery. Defaults to `false`.

  • allow_insecure_base_url (Boolean) (defaults to: false)

    permit an `http://` base (local proxies). Defaults to `false`.

  • connection (Faraday::Connection, nil) (defaults to: nil)

    injection seam for tests.
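
The HTTPS requirement on `base_url` can be sketched with the standard library's URI parser. The provider's `validate_base_url!` is not shown on this page; `check_base_url` below is a hypothetical stand-in that only illustrates the scheme check, and the real method may sanitize the URL further.

```ruby
require "uri"

# Accept https:// always, http:// only with an explicit opt-in, and
# reject every other scheme. Returns the URL unchanged when accepted.
def check_base_url(base_url, allow_insecure: false)
  scheme = URI.parse(base_url).scheme
  return base_url if scheme == "https"
  return base_url if scheme == "http" && allow_insecure

  raise ArgumentError, "base_url must be HTTPS (got scheme #{scheme.inspect})"
end

puts check_base_url("https://api.cohere.com/v1")
puts check_base_url("http://localhost:8080/v1", allow_insecure: true)
```

Making the insecure path an explicit keyword keeps a mistyped `http://` URL from silently sending the API key in cleartext.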



# File 'lib/parse/embeddings/cohere.rb', line 130

def initialize(
  api_key:,
  model: DEFAULT_MODEL,
  dimensions: nil,
  base_url: DEFAULT_BASE_URL,
  timeout: DEFAULT_TIMEOUT,
  open_timeout: DEFAULT_OPEN_TIMEOUT,
  max_retries: DEFAULT_MAX_RETRIES,
  embed_batch_size: DEFAULT_BATCH_SIZE,
  allow_faraday_proxy: false,
  allow_insecure_base_url: false,
  connection: nil
)
  validate_api_key!(api_key)
  validate_model!(model)
  validate_dimensions!(model, dimensions)
  sanitized_base_url = validate_base_url!(base_url, allow_insecure_base_url)
  validate_positive_integer!(:timeout, timeout)
  validate_positive_integer!(:open_timeout, open_timeout)
  validate_non_negative_integer!(:max_retries, max_retries)
  validate_positive_integer!(:embed_batch_size, embed_batch_size)
  if embed_batch_size > 96
    raise ArgumentError,
          "Parse::Embeddings::Cohere: embed_batch_size #{embed_batch_size} exceeds Cohere's per-request cap (96)."
  end

  @api_key = api_key
  @model = model
  @dimensions = dimensions || MODEL_DEFAULT_DIMENSIONS.fetch(model)
  @base_url = sanitized_base_url
  @timeout = timeout
  @open_timeout = open_timeout
  @max_retries = max_retries
  @embed_batch_size = embed_batch_size
  @allow_faraday_proxy = allow_faraday_proxy
  @connection = connection || build_connection
end

Instance Method Details

#dimensions ⇒ Object



# File 'lib/parse/embeddings/cohere.rb', line 168

def dimensions
  @dimensions
end

#embed_batch_size ⇒ Object



# File 'lib/parse/embeddings/cohere.rb', line 176

def embed_batch_size
  @embed_batch_size
end
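
As a rough illustration of how a batch size interacts with the 96-input cap, the sketch below splits a corpus into request-sized slices. It is not the inherited `#embed_text_batched`, whose implementation is not shown on this page; `split_into_batches` and `SKETCH_MAX_INPUTS_PER_CALL` are hypothetical names.

```ruby
# Per-request input cap documented for the embed endpoint.
SKETCH_MAX_INPUTS_PER_CALL = 96

# Split +strings+ into consecutive slices of at most +batch_size+ items,
# preserving order so vectors can be re-aligned with their inputs.
def split_into_batches(strings, batch_size: SKETCH_MAX_INPUTS_PER_CALL)
  unless batch_size.between?(1, SKETCH_MAX_INPUTS_PER_CALL)
    raise ArgumentError,
          "batch_size must be between 1 and #{SKETCH_MAX_INPUTS_PER_CALL}"
  end
  strings.each_slice(batch_size).to_a
end

batches = split_into_batches(Array.new(200) { |i| "doc #{i}" })
puts batches.map(&:length).inspect  # prints [96, 96, 8]
```

Each slice would then be sent as one `#embed_text` call, and the resulting vector arrays concatenated in order.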

#embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>

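Returns vectors aligned 1:1 with `strings`.

Parameters:

  • strings (Array<String>)

    non-empty strings to embed.

  • input_type (Symbol) (defaults to: :search_document)

    one of the keys of INPUT_TYPE_WIRE_VALUES.

Returns:

  • (Array<Array<Float>>)

    vectors aligned 1:1 with `strings`.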



# File 'lib/parse/embeddings/cohere.rb', line 196

def embed_text(strings, input_type: :search_document)
  unless strings.is_a?(Array)
    raise ArgumentError,
          "Parse::Embeddings::Cohere#embed_text expects Array<String> (got #{strings.class})."
  end
  return [] if strings.empty?
  strings.each_with_index do |s, i|
    unless s.is_a?(String)
      raise ArgumentError,
            "Parse::Embeddings::Cohere#embed_text strings[#{i}] is not a String (#{s.class})."
    end
    if s.empty?
      raise ArgumentError,
            "Parse::Embeddings::Cohere#embed_text strings[#{i}] is empty; Cohere rejects empty inputs."
    end
  end
  wire_input_type = INPUT_TYPE_WIRE_VALUES[input_type]
  unless wire_input_type
    raise ArgumentError,
          "Parse::Embeddings::Cohere#embed_text input_type #{input_type.inspect} not in " \
          "#{INPUT_TYPE_WIRE_VALUES.keys.inspect}."
  end

  body = {
    texts: strings,
    model: @model,
    input_type: wire_input_type,
    embedding_types: ["float"],
  }
  # Forward `output_dimension` only for Matryoshka-capable models
  # whose active width differs from native. Sending it to a v3
  # row would yield a 400 from Cohere.
  if MATRYOSHKA_MODELS.include?(@model) &&
     @dimensions != MODEL_DEFAULT_DIMENSIONS.fetch(@model)
    body[:output_dimension] = @dimensions
  end

  instrument_embed(strings.length, input_type) do |emit_payload|
    payload = post_embeddings(body)
    # Cohere's response carries `meta.billed_units.input_tokens`
    # (and `output_tokens`, though for embeddings it's 0). Forward
    # input_tokens as the operator-facing cost number on the AS::N
    # payload so cost subscribers can budget across providers.
    if payload.is_a?(Hash) && payload["meta"].is_a?(Hash) &&
       payload["meta"]["billed_units"].is_a?(Hash)
      tt = payload["meta"]["billed_units"]["input_tokens"]
      emit_payload[:total_tokens] = tt if tt.is_a?(Integer) && tt >= 0
    end
    vectors = extract_vectors!(payload, strings.length)
    validate_response!(strings.length, vectors)
  end
end
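
To make the conditional `output_dimension` field in the listing above concrete, the following standalone sketch builds the request body for two configurations and prints the JSON. `sketch_embed_body` and the `SKETCH_*` tables are hypothetical and trimmed down; the logic mirrors the body construction in `#embed_text` but does not call the provider or the network.

```ruby
require "json"

# Hypothetical, trimmed-down copies of the provider's lookup tables.
SKETCH_NATIVE_WIDTHS = {
  "embed-v4.0"         => 1536,
  "embed-english-v3.0" => 1024,
}.freeze
SKETCH_TRUNCATABLE = %w[embed-v4.0].freeze

# Build the JSON body for one embed request. output_dimension is added
# only for truncation-capable models running at a non-native width.
def sketch_embed_body(texts:, model:, dimensions:, input_type:)
  body = {
    texts: texts,
    model: model,
    input_type: input_type,
    embedding_types: ["float"],
  }
  truncated = SKETCH_TRUNCATABLE.include?(model) &&
              dimensions != SKETCH_NATIVE_WIDTHS.fetch(model)
  body[:output_dimension] = dimensions if truncated
  body
end

puts JSON.generate(sketch_embed_body(texts: ["hello"], model: "embed-v4.0",
                                     dimensions: 512, input_type: "search_query"))
puts JSON.generate(sketch_embed_body(texts: ["hello"], model: "embed-english-v3.0",
                                     dimensions: 1024, input_type: "search_document"))
```

Only the first body carries `output_dimension`; the v3 body omits it, which avoids the 400 mentioned in the source comment.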

#inspect_attrs ⇒ Object



# File 'lib/parse/embeddings/cohere.rb', line 249

def inspect_attrs
  super.merge(base: safe_base_host, retries: @max_retries)
end

#max_input_tokens ⇒ Object



# File 'lib/parse/embeddings/cohere.rb', line 180

def max_input_tokens
  MODEL_MAX_INPUT_TOKENS[@model]
end

#model_name ⇒ Object



# File 'lib/parse/embeddings/cohere.rb', line 172

def model_name
  @model
end

#normalize? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/parse/embeddings/cohere.rb', line 184

def normalize?
  # Cohere v3 embeddings are documented unit-normalized.
  true
end

#supports_input_type? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/parse/embeddings/cohere.rb', line 189

def supports_input_type?
  true
end