Class: Woods::Embedding::Provider::Ollama

Inherits: Object

Includes: Interface

Defined in: lib/woods/embedding/provider.rb

Overview

Ollama adapter for local embeddings via the Ollama HTTP API.

Uses the `/api/embed` endpoint to generate embeddings. Requires a running Ollama instance (default: localhost:11434) with the specified model already pulled.

Examples:

provider = Woods::Embedding::Provider::Ollama.new
vector = provider.embed("class User < ApplicationRecord; end")
vectors = provider.embed_batch(["text1", "text2"])

Constant Summary

DEFAULT_MODEL = 'nomic-embed-text'

DEFAULT_HOST = 'http://localhost:11434'

MODEL_CONTEXT_LENGTHS =

Ollama enforces the model's native context length on `/api/embed` regardless of the `num_ctx` override. We've validated this against 0.15.x for nomic-embed-text (rejects >2048) and bge-m3 (accepts up to 8192, silently truncates above). Advertise the native ceiling so the chunker can size inputs correctly. Models outside this registry fall back to Ollama's conservative 2048 default. See `docs/EMBEDDING_MODELS.md` for the tradeoff matrix and instructions for adding a new model here.

{
  'nomic-embed-text' => 2048,
  'bge-m3' => 8192,
  'mxbai-embed-large' => 512,
  'snowflake-arctic-embed' => 512,
  'snowflake-arctic-embed2' => 8192,
  # all-minilm: 512 is the model's context length, NOT the 384
  # embedding dimension and NOT the 256 some sources confuse with
  # the dimension. With a 256-token budget the chunker formula
  # produces a negative max_chars and silently drops every chunk.
  'all-minilm' => 512
}.freeze
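The lookup this registry supports can be sketched as follows. This is a minimal illustration rather than the library's code: the `ContextRegistrySketch` module and its `resolve_num_ctx` helper are hypothetical names, the registry is trimmed to three entries, and the fallback of 2048 is taken from the description above.

```ruby
module ContextRegistrySketch
  # Trimmed copy of the registry shown above.
  MODEL_CONTEXT_LENGTHS = {
    'nomic-embed-text' => 2048,
    'bge-m3' => 8192,
    'all-minilm' => 512
  }.freeze

  # Assumption: 2048, the conservative default named in the description.
  FALLBACK_NUM_CTX = 2048

  module_function

  # Hypothetical helper: an explicit override wins, then the registry
  # entry, then the fallback for models the registry does not know.
  def resolve_num_ctx(model, override = nil)
    override || MODEL_CONTEXT_LENGTHS.fetch(model, FALLBACK_NUM_CTX)
  end
end

puts ContextRegistrySketch.resolve_num_ctx('bge-m3')
puts ContextRegistrySketch.resolve_num_ctx('some-unlisted-model')
```

Using `Hash#fetch` with a default keeps the fallback explicit at the call site, so an unlisted model never resolves to `nil`.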

FALLBACK_NUM_CTX =

Fallback context length, in tokens, used when the configured model isn't in the registry (2048, per the registry notes).

DEFAULT_READ_TIMEOUT =

Default read timeout for `/api/embed`, in seconds. The previous 30s default was too short for batched embed calls on cold models: Ollama has to load the model on first call, and an N-item batch can easily exceed 30s on a CPU-only host. 120s leaves headroom without wedging the whole pipeline on a genuinely dead server.
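As a rough illustration of where such a timeout would apply, here is a sketch that configures a `Net::HTTP` client without opening a connection. The `EmbedClientSketch` module is hypothetical, and the 10-second open timeout is an assumption made for the example; only the 120-second read timeout comes from the note above.

```ruby
require 'net/http'
require 'uri'

module EmbedClientSketch
  module_function

  # Hypothetical helper: builds (but does not start) an HTTP client
  # pointed at the embed endpoint.
  def build(host: 'http://localhost:11434', read_timeout: 120)
    uri = URI("#{host}/api/embed")
    http = Net::HTTP.new(uri.host, uri.port)
    http.open_timeout = 10           # assumption: fail fast if nothing is listening
    http.read_timeout = read_timeout # covers cold model load plus the batch itself
    http
  end
end

client = EmbedClientSketch.build(read_timeout: 300)
puts client.read_timeout
```

Keeping the connect timeout short and the read timeout long separates "server is down" from "server is busy loading a model".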

Instance Method Summary

#dimensions ⇒ Integer

Return the dimensionality of vectors produced by this model.

#embed(text) ⇒ Array<Float>

Embed a single text string.

#embed_batch(texts) ⇒ Array<Array<Float>>

Embed multiple texts in a single request.

#initialize(model: DEFAULT_MODEL, host: DEFAULT_HOST, num_ctx: nil, read_timeout: DEFAULT_READ_TIMEOUT) ⇒ Ollama

Constructor. Returns a new instance of Ollama.

#max_input_tokens ⇒ Integer

Maximum input length Ollama will accept; tracks the configured context window.

#model_name ⇒ String

Return the model name.

Constructor Details

#initialize(model: DEFAULT_MODEL, host: DEFAULT_HOST, num_ctx: nil, read_timeout: DEFAULT_READ_TIMEOUT) ⇒ `Ollama`

Returns a new instance of Ollama.

Parameters:

model (String) (defaults to: DEFAULT_MODEL) —

Ollama model name (default: nomic-embed-text). Set to `"bge-m3"` or `"snowflake-arctic-embed2"` for an 8192-token context, which avoids most chunking for dense Rails units.
host (String) (defaults to: DEFAULT_HOST) —

Ollama server URL (default: localhost:11434)
num_ctx (Integer, nil) (defaults to: nil) —

Ollama context window in tokens. When `nil` (the default), the provider picks the model's native context from `MODEL_CONTEXT_LENGTHS`, falling back to 2048 for unknown models. Set explicitly only if running a model with a known-larger native context that isn't in the registry yet.
read_timeout (Integer) (defaults to: DEFAULT_READ_TIMEOUT) —

HTTP read timeout in seconds. Bump this for slow / cold-start hosts or very large batches.

# File 'lib/woods/embedding/provider.rb', line 123

def initialize(model: DEFAULT_MODEL, host: DEFAULT_HOST, num_ctx: nil,
               read_timeout: DEFAULT_READ_TIMEOUT)
  @model = model
  @host = host
  @num_ctx = num_ctx || MODEL_CONTEXT_LENGTHS.fetch(model, FALLBACK_NUM_CTX)
  @read_timeout = read_timeout
  @uri = URI("#{host}/api/embed")
end

Instance Method Details

#dimensions ⇒ `Integer`

Return the dimensionality of vectors produced by this model.

Determined dynamically by embedding a test string on first call.

Returns:

(Integer) —

number of dimensions



# File 'lib/woods/embedding/provider.rb', line 166

def dimensions
  @dimensions ||= embed('test').length
end
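The `||=` memoization above means the probe request runs only once per provider instance. A small sketch with a stand-in class shows the effect; `FakeEmbedder` and its 768-element vector are hypothetical and exist only to count calls without a running Ollama server.

```ruby
# Stand-in for the provider: counts how often the embed call runs.
class FakeEmbedder
  attr_reader :embed_calls

  def initialize(vector_size)
    @vector_size = vector_size
    @embed_calls = 0
  end

  # Pretend to embed: returns a zero vector of the configured size.
  def embed(_text)
    @embed_calls += 1
    Array.new(@vector_size, 0.0)
  end

  # Same shape as the method above: probe once, then reuse the result.
  def dimensions
    @dimensions ||= embed('test').length
  end
end

embedder = FakeEmbedder.new(768)
embedder.dimensions
embedder.dimensions
puts embedder.embed_calls
```

One consequence worth noting is that the first call to `dimensions` pays the full cost of an embed request, including any cold model load.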

#embed(text) ⇒ `Array<Float>`

Embed a single text string.

Parameters:

text (String) —

the text to embed

Returns:

(Array<Float>) —

the embedding vector

Raises:

(Woods::Error) —

if the API returns an error
(ArgumentError) —

if the text is nil, empty, or whitespace-only (avoids an HTTP 400 from the provider)

# File 'lib/woods/embedding/provider.rb', line 138

def embed(text)
  raise ArgumentError, 'embed(text) requires a non-empty string' if text.nil? || text.to_s.strip.empty?

  response = post_request(build_body(text))
  response['embeddings'].first
end
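The private `build_body` and `post_request` helpers are not shown on this page. The sketch below is one plausible shape for them, assuming the documented Ollama `/api/embed` request fields (`model`, `input`, `options`) and the `embeddings` array in the response. The `EmbedRequestSketch` module and both of its method names are hypothetical, and passing `num_ctx` through `options` is an assumption based on the registry notes.

```ruby
require 'json'

module EmbedRequestSketch
  module_function

  # Hypothetical stand-in for build_body: input may be a String or an
  # Array of Strings, matching the two call sites on this page.
  def build_body(input, model: 'nomic-embed-text', num_ctx: 2048)
    { model: model, input: input, options: { num_ctx: num_ctx } }
  end

  # Decode a raw /api/embed response and return the first vector.
  def first_embedding(raw_json)
    parsed = JSON.parse(raw_json)
    unless parsed['embeddings'].is_a?(Array)
      raise ArgumentError, "unexpected response: #{raw_json}"
    end

    parsed['embeddings'].first
  end
end

puts JSON.generate(EmbedRequestSketch.build_body('class User; end'))
p EmbedRequestSketch.first_embedding('{"embeddings":[[0.1,0.2]]}')
```

Because the response always carries an array of vectors, the single-text path simply takes the first element.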

#embed_batch(texts) ⇒ `Array<Array<Float>>`

Embed multiple texts in a single request.

Parameters:

texts (Array<String>) —

the texts to embed

Returns:

(Array<Array<Float>>) —

array of embedding vectors

Raises:

(Woods::Error) —

if the API returns an error
(ArgumentError) —

if the array is nil or empty, or any element is nil, empty, or whitespace-only

# File 'lib/woods/embedding/provider.rb', line 151

def embed_batch(texts)
  raise ArgumentError, 'embed_batch(texts) requires a non-empty array' if texts.nil? || texts.empty?
  if texts.any? { |t| t.nil? || t.to_s.strip.empty? }
    raise ArgumentError, 'embed_batch(texts) rejects nil/empty entries'
  end

  response = post_request(build_body(texts))
  response['embeddings']
end
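Since `embed_batch` sends every text in one request, callers with large inputs may want to slice them first so that no single request runs past the read timeout. The sketch below is one way to do that and is not part of the library: `BatchSketch.embed_in_batches`, the default `batch_size` of 32, and the `EchoProvider` stand-in are all hypothetical.

```ruby
module BatchSketch
  module_function

  # Hypothetical helper: embeds texts in slices of batch_size and
  # concatenates the per-slice results, preserving input order.
  def embed_in_batches(provider, texts, batch_size: 32)
    texts.each_slice(batch_size).flat_map { |slice| provider.embed_batch(slice) }
  end
end

# Stand-in provider: records batch sizes and returns one-element
# vectors derived from each text's length.
class EchoProvider
  attr_reader :batch_sizes

  def initialize
    @batch_sizes = []
  end

  def embed_batch(texts)
    @batch_sizes << texts.length
    texts.map { |t| [t.length.to_f] }
  end
end

provider = EchoProvider.new
vectors = BatchSketch.embed_in_batches(provider, %w[a bb ccc dddd eeeee], batch_size: 2)
p provider.batch_sizes
p vectors.length
```

`flat_map` flattens only one level, so the result stays an array of vectors rather than a single flat array of floats.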

#max_input_tokens ⇒ `Integer`

Maximum input length Ollama will accept; tracks the configured context window. Always populated: the constructor resolves `num_ctx` to the model's registry entry or `FALLBACK_NUM_CTX`, so this method never returns nil for an Ollama provider.

Returns:

(Integer)



# File 'lib/woods/embedding/provider.rb', line 183

def max_input_tokens
  @num_ctx
end

#model_name ⇒ `String`

Return the model name.

Returns:

(String) —

the Ollama model name



# File 'lib/woods/embedding/provider.rb', line 173

def model_name
  @model
end
