Class: Woods::Embedding::Provider::Ollama
- Inherits: Object
  - Object
  - Woods::Embedding::Provider::Ollama
- Includes: Interface
- Defined in: lib/woods/embedding/provider.rb
Overview
Ollama adapter for local embeddings via the Ollama HTTP API.
Uses the `/api/embed` endpoint to generate embeddings. Requires a running Ollama instance (default: localhost:11434) with the specified model pulled.
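As a rough illustration of the request described above, the sketch below builds (but does not send) a POST to `/api/embed`. The helper name `build_embed_request` is hypothetical and not part of Woods, and the `model`/`input` body shape follows Ollama's public API rather than the provider's private `build_body`, whose contents are not shown on this page.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper (not part of Woods): build a POST request for
# Ollama's /api/embed endpoint without sending it.
def build_embed_request(host:, model:, input:)
  uri = URI("#{host}/api/embed")
  request = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json')
  # Assumed body shape: `input` may be a single string or an array of strings.
  request.body = JSON.generate(model: model, input: input)
  [uri, request]
end

uri, request = build_embed_request(
  host: 'http://localhost:11434',
  model: 'nomic-embed-text',
  input: 'hello world'
)

# Sending needs a running Ollama instance, so it is left commented out:
# response = Net::HTTP.start(uri.host, uri.port, read_timeout: 120) do |http|
#   http.request(request)
# end
# vector = JSON.parse(response.body)['embeddings'].first
```

The response is expected to carry an `embeddings` array with one vector per input, which matches how `#embed` and `#embed_batch` read it.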
Constant Summary
- DEFAULT_MODEL =
  'nomic-embed-text'
- DEFAULT_HOST =
  'http://localhost:11434'
- MODEL_CONTEXT_LENGTHS =
  Ollama enforces the model’s native context length on `/api/embed` regardless of the `num_ctx` override. We’ve validated this against 0.15.x for nomic-embed-text (rejects >2048) and bge-m3 (accepts up to 8192, silently truncates above). Advertise the native ceiling so the chunker can size inputs correctly. Models outside this registry fall back to Ollama’s conservative 2048 default.
  See `docs/EMBEDDING_MODELS.md` for the tradeoff matrix and instructions for adding a new model here.
  {
    'nomic-embed-text' => 2048,
    'bge-m3' => 8192,
    'mxbai-embed-large' => 512,
    'snowflake-arctic-embed' => 512,
    'snowflake-arctic-embed2' => 8192,
    # all-minilm: 512 is the model's context length, NOT the 384
    # embedding dimension and NOT the 256 some sources confuse with
    # the dimension. With a 256-token budget the chunker formula
    # produces a negative max_chars and silently drops every chunk.
    'all-minilm' => 512
  }.freeze
- FALLBACK_NUM_CTX =
  Fallback when the configured model isn’t in the registry.
  2048
- DEFAULT_READ_TIMEOUT =
  Default read timeout for /api/embed. The previous 30s default was too short for batched embed calls on cold models: Ollama has to load the model on first call, and an N-item batch can easily exceed 30s on a CPU-only host. 120s leaves headroom without wedging the whole pipeline on a genuinely dead server.
  120
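The lookup-with-fallback behaviour that these constants support can be sketched in isolation. `CONTEXT_LENGTHS` below is a partial copy of the registry, and `resolve_num_ctx` is a hypothetical helper that mirrors the constructor's resolution order (explicit override, then registry entry, then fallback); neither name exists in Woods.

```ruby
# Partial copy of the registry, for illustration only.
CONTEXT_LENGTHS = {
  'nomic-embed-text' => 2048,
  'bge-m3' => 8192,
  'all-minilm' => 512
}.freeze

FALLBACK_CTX = 2048

# Hypothetical helper: an explicit override wins, then the registry entry,
# then the conservative fallback for unknown models.
def resolve_num_ctx(model, override = nil)
  override || CONTEXT_LENGTHS.fetch(model, FALLBACK_CTX)
end

known    = resolve_num_ctx('bge-m3')            # registry entry
unknown  = resolve_num_ctx('some-other-model')  # not in the registry
explicit = resolve_num_ctx('bge-m3', 4096)      # caller-supplied override
```

Because `Hash#fetch` takes a default, an unregistered model never yields nil, which is why `#max_input_tokens` can always return an integer.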
Instance Method Summary
- #dimensions ⇒ Integer
  Return the dimensionality of vectors produced by this model.
- #embed(text) ⇒ Array<Float>
  Embed a single text string.
- #embed_batch(texts) ⇒ Array<Array<Float>>
  Embed multiple texts in a single request.
- #initialize(model: DEFAULT_MODEL, host: DEFAULT_HOST, num_ctx: nil, read_timeout: DEFAULT_READ_TIMEOUT) ⇒ Ollama (constructor)
  A new instance of Ollama.
- #max_input_tokens ⇒ Integer
  Maximum input length Ollama will accept; tracks the configured context window.
- #model_name ⇒ String
  Return the model name.
Constructor Details
#initialize(model: DEFAULT_MODEL, host: DEFAULT_HOST, num_ctx: nil, read_timeout: DEFAULT_READ_TIMEOUT) ⇒ Ollama
Returns a new instance of Ollama.
# File 'lib/woods/embedding/provider.rb', line 123
def initialize(model: DEFAULT_MODEL, host: DEFAULT_HOST, num_ctx: nil, read_timeout: DEFAULT_READ_TIMEOUT)
  @model = model
  @host = host
  @num_ctx = num_ctx || MODEL_CONTEXT_LENGTHS.fetch(model, FALLBACK_NUM_CTX)
  @read_timeout = read_timeout
  @uri = URI("#{host}/api/embed")
end
Instance Method Details
#dimensions ⇒ Integer
Return the dimensionality of vectors produced by this model.
Determined dynamically by embedding a test string on first call.
# File 'lib/woods/embedding/provider.rb', line 166
def dimensions
  @dimensions ||= embed('test').length
end
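The "determined dynamically on first call" behaviour is plain memoization with `||=`. The sketch below uses a stand-in class, `FakeProvider`, which is hypothetical and returns a fixed-size vector instead of calling Ollama; it only illustrates that the probe embedding runs once.

```ruby
# Stand-in for illustration only; the real provider calls Ollama over HTTP.
class FakeProvider
  attr_reader :embed_calls

  def initialize(vector_size)
    @vector_size = vector_size
    @embed_calls = 0
  end

  # Pretend to embed: count the call and return a zero vector.
  def embed(_text)
    @embed_calls += 1
    Array.new(@vector_size, 0.0)
  end

  # Same shape as the documented method: probe once, then reuse the result.
  def dimensions
    @dimensions ||= embed('test').length
  end
end

provider = FakeProvider.new(768)
first  = provider.dimensions
second = provider.dimensions
```

With a real Ollama provider, the first `dimensions` call therefore costs one HTTP round trip (and possibly a model load), while later calls are free.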
#embed(text) ⇒ Array<Float>
Embed a single text string.
# File 'lib/woods/embedding/provider.rb', line 138
def embed(text)
  raise ArgumentError, 'embed(text) requires a non-empty string' if text.nil? || text.to_s.strip.empty?

  response = post_request(build_body(text))
  response['embeddings'].first
end
#embed_batch(texts) ⇒ Array<Array<Float>>
Embed multiple texts in a single request.
# File 'lib/woods/embedding/provider.rb', line 151
def embed_batch(texts)
  raise ArgumentError, 'embed_batch(texts) requires a non-empty array' if texts.nil? || texts.empty?

  if texts.any? { |t| t.nil? || t.to_s.strip.empty? }
    raise ArgumentError, 'embed_batch(texts) rejects nil/empty entries'
  end

  response = post_request(build_body(texts))
  response['embeddings']
end
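To see which inputs the guard clauses reject, they can be exercised without a server. `validate_batch!` and `rejected?` below are hypothetical helpers that copy only the validation logic from the listing above; they are not methods on the provider.

```ruby
# Hypothetical helper: the guard clauses from embed_batch, extracted so they
# can run without an Ollama server.
def validate_batch!(texts)
  raise ArgumentError, 'embed_batch(texts) requires a non-empty array' if texts.nil? || texts.empty?

  if texts.any? { |t| t.nil? || t.to_s.strip.empty? }
    raise ArgumentError, 'embed_batch(texts) rejects nil/empty entries'
  end

  texts
end

# Hypothetical helper: true when the batch would raise ArgumentError.
def rejected?(texts)
  validate_batch!(texts)
  false
rescue ArgumentError
  true
end

results = {
  nil_batch: rejected?(nil),
  empty_batch: rejected?([]),
  blank_entry: rejected?(['alpha', '   ']),
  nil_entry: rejected?(['alpha', nil]),
  valid_batch: rejected?(%w[alpha beta])
}
```

Note that a single blank or nil entry fails the whole batch, so callers may want to filter inputs before calling `embed_batch`.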
#max_input_tokens ⇒ Integer
Maximum input length Ollama will accept; tracks the configured context window. Always populated: the constructor resolves `num_ctx` to the model’s registry entry or FALLBACK_NUM_CTX, so this method never returns nil for an Ollama provider.
# File 'lib/woods/embedding/provider.rb', line 183
def max_input_tokens
  @num_ctx
end
#model_name ⇒ String
Return the model name.
# File 'lib/woods/embedding/provider.rb', line 173
def model_name
  @model
end