Module: Parse::Embeddings
- Defined in:
- lib/parse/embeddings.rb,
lib/parse/embeddings/jina.rb,
lib/parse/embeddings/qwen.rb,
lib/parse/embeddings/cohere.rb,
lib/parse/embeddings/openai.rb,
lib/parse/embeddings/voyage.rb,
lib/parse/embeddings/fixture.rb,
lib/parse/embeddings/provider.rb,
lib/parse/embeddings/local_http.rb
Overview
Pluggable embedding-provider registry for `:vector` properties and the upcoming `find_similar(text:)` / `Parse::Retrieval.retrieve` surfaces.
Text-only providers shipped:
- Fixture: deterministic, zero-network. Auto-registered as `:fixture` so tests can call `Parse::Embeddings.provider(:fixture)` with no setup.
- OpenAI: text-embedding-3-{small,large} and ada-002.
- Cohere: embed-{english,multilingual}-v3.0 and `*-light-v3.0`. Distinguishes `:search_query` / `:search_document` at the wire.
- Voyage: voyage-4 family (incl. open-weight `voyage-4-nano`), voyage-3 family, voyage-code-3, voyage-finance-2, voyage-law-2. Distinguishes input types.
- Jina: jina-embeddings-v3/v4/v5 (text + omni-text mode), jina-code-embeddings-{0.5b,1.5b}. Matryoshka via `dimensions:`.
- Qwen: qwen3-embedding-{0.6b,4b,8b} via Alibaba Cloud DashScope compatible-mode. All Matryoshka. The same checkpoints are open-weight on Hugging Face (Apache 2.0) for self-hosting behind LocalHTTP.
- LocalHTTP: generic OpenAI-compatible client for Ollama, LM Studio, vLLM, etc. Configure-time SSRF gate; requires `allow_private_endpoint: true` to talk to localhost.
Image / multimodal embedding (`embed_image`) is a forthcoming feature: the Provider#embed_image hook is defined, but only the multimodal-capable providers will override it.
Registration
Two equivalent forms. Embeddings.register is the canonical one-liner and what every example in the gem uses; Embeddings.configure is the block form for registering several providers at once or for Rails-style initializers. Both end up at the same ProviderRegistry, so pick whichever reads better in context.
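The two forms can be sketched side by side. To keep the example runnable without the gem, it uses a hypothetical stand-in module, `DemoEmbeddings`, whose method bodies mirror the ones listed under Class Method Details. Assigning through `config.providers[...]` inside the block is an assumption: this page only shows that `configuration.providers` behaves like a Hash, not what the block form is meant to call.

```ruby
require "monitor"

# Hypothetical stand-in mirroring the method bodies shown on this page.
# None of these names are the gem's own classes.
module DemoEmbeddings
  Provider = Class.new
  Fixture  = Class.new(Provider)

  class Configuration
    attr_reader :providers

    def initialize
      @providers = {}
    end
  end

  LOCK = Monitor.new

  class << self
    def configuration
      @configuration || LOCK.synchronize { @configuration ||= Configuration.new }
    end

    def configure
      yield configuration if block_given?
      configuration
    end

    def register(name, provider)
      raise ArgumentError, "expected a Provider instance" unless provider.is_a?(Provider)

      LOCK.synchronize { configuration.providers[name.to_sym] = provider }
    end
  end
end

# One-liner form: a single provider under a single name.
DemoEmbeddings.register(:docs, DemoEmbeddings::Fixture.new)

# Block form: several providers in one initializer-style block.
# (Writing to the hash directly is an assumption of this sketch.)
DemoEmbeddings.configure do |config|
  config.providers[:search] = DemoEmbeddings::Fixture.new
  config.providers[:tests]  = DemoEmbeddings::Fixture.new
end

p DemoEmbeddings.configuration.providers.keys # => [:docs, :search, :tests]
```

In this stand-in, writes made directly to the hash inside the block skip the `is_a?(Provider)` check that `register` performs, which is one reason to prefer the one-liner when only a single provider is involved.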
Defined Under Namespace
Classes: Cohere, Configuration, Error, Fixture, InvalidResponseError, Jina, LocalHTTP, OpenAI, Provider, ProviderNotRegistered, ProviderRegistry, Qwen, Voyage
Constant Summary
- CONFIG_MUTEX = Monitor.new
Monitor guarding configuration memoization and register writes. MRI’s GVL would normally absorb the race on `@configuration ||= …`, but on JRuby and TruffleRuby two threads racing at boot can produce two `Configuration` instances, and any provider written to the losing instance is lost. A Monitor (rather than a Mutex) is used so that `register`, which holds the lock and then calls `configuration`, can re-enter without deadlocking on the first-touch allocation path.
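The re-entrancy difference can be checked with the standard library alone. The sketch below uses a hypothetical helper, `nested_lock`, that takes a lock and then takes it again from the same thread, which is the shape of `register` calling `configuration` while already holding the lock.

```ruby
require "monitor"

# Hypothetical helper: acquire `lock`, then acquire it again on the same thread.
def nested_lock(lock)
  lock.synchronize do
    lock.synchronize { :reentered }
  end
rescue ThreadError
  # Mutex refuses recursive locking by the owning thread.
  :thread_error
end

p nested_lock(Monitor.new) # => :reentered
p nested_lock(Mutex.new)   # => :thread_error
```

A Monitor records its owning thread and a nesting count, so the inner `synchronize` succeeds. A Mutex raises `ThreadError` instead of blocking on itself.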
Class Method Summary
- .configuration ⇒ Configuration
The singleton configuration object.
- .configure {|config| ... } ⇒ Configuration
Block form for registering multiple providers at once.
- .provider(name) ⇒ Provider
Look up a registered provider.
- .register(name, provider) ⇒ Provider
Canonical one-liner: register a single provider under `name`.
- .registered_provider_names ⇒ Array<Symbol>
Names of currently-registered providers (does NOT include the implicit `:fixture` fallback unless it’s been instantiated).
- .reset! ⇒ void
Reset the entire registry; intended for test teardown only.
Class Method Details
.configuration ⇒ Configuration
Returns the singleton configuration object.
# File 'lib/parse/embeddings.rb', line 137

def configuration
  # Double-checked memoization. The fast path is a single ivar
  # read; the slow path enters the mutex only when the
  # configuration is unallocated.
  @configuration || CONFIG_MUTEX.synchronize { @configuration ||= Configuration.new }
end
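The double-checked memoization can be exercised in isolation. This is a minimal sketch using a hypothetical stand-in, `DemoRegistry`, with a plain `Object` in place of the real `Configuration`. Several threads race on first access and should all observe the same instance.

```ruby
require "monitor"

# Hypothetical stand-in; only the locking pattern matches the method above.
module DemoRegistry
  LOCK = Monitor.new

  class << self
    def configuration
      # Fast path: plain ivar read. Slow path: allocate once under the lock.
      @configuration || LOCK.synchronize { @configuration ||= Object.new }
    end

    def reset!
      LOCK.synchronize { @configuration = nil }
    end
  end
end

# Race eight threads on the first-touch allocation path.
ids = Array.new(8) { Thread.new { DemoRegistry.configuration.object_id } }
           .map(&:value)
           .uniq

p ids.size # => 1
```

The `||=` inside the lock is the second check: a thread that lost the race to enter the lock finds the ivar already set and reuses it rather than allocating a second object.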
.configure {|config| ... } ⇒ Configuration
Block form for registering multiple providers at once. Prefer the one-liner register when adding a single provider; this form pays off when an initializer needs to set several or to mutate the registry conditionally.
# File 'lib/parse/embeddings.rb', line 131

def configure
  yield configuration if block_given?
  configuration
end
.provider(name) ⇒ Provider
Look up a registered provider.
**Zero-config fallback:** `:fixture` returns a default Fixture instance (64-dim, deterministic) when nothing is registered. Every other name raises ProviderNotRegistered. Tests can rely on `provider(:fixture)` working out of the box; production code must register what it uses.
# File 'lib/parse/embeddings.rb', line 173

def provider(name)
  # Avoid blindly `to_sym`-ing the caller's input. An LLM tool or
  # webhook handler that pipes its `name:` argument through here
  # would otherwise let a remote caller grow the symbol table at
  # will. Ruby 3.2+ GCs symbols so the practical impact is small,
  # but a string-matched lookup costs nothing and closes the gap.
  if name.is_a?(Symbol)
    return configuration.providers[name] if configuration.providers.key?(name)
    key_string = name.to_s
  else
    key_string = name.to_s
    found = configuration.providers.keys.find { |k| k.to_s == key_string }
    return configuration.providers[found] if found
  end
  if key_string == "fixture"
    CONFIG_MUTEX.synchronize do
      return configuration.providers[:fixture] ||= Fixture.new
    end
  end
  raise ProviderNotRegistered,
        "Parse::Embeddings.provider(#{name.inspect}): no provider registered. " \
        "Register one via Parse::Embeddings.register(#{name.inspect}, …)."
end
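The lookup rules can be sketched against a plain Hash. `demo_lookup`, `DemoFixture`, `DemoNotRegistered`, and `DEMO_LOCK` are hypothetical names introduced for illustration. The branches are condensed slightly compared with the method above, but the observable behavior is intended to match: String and Symbol names resolve to the same entry, `fixture` is created lazily, and anything else raises.

```ruby
require "monitor"

# Hypothetical stand-ins, not the gem's classes.
DemoFixture       = Class.new
DemoNotRegistered = Class.new(StandardError)
DEMO_LOCK         = Monitor.new

def demo_lookup(providers, name)
  # A Symbol is already interned, so a direct hash hit costs nothing extra.
  return providers[name] if name.is_a?(Symbol) && providers.key?(name)

  # Otherwise compare as strings rather than calling to_sym on caller input.
  key_string = name.to_s
  found = providers.keys.find { |k| k.to_s == key_string }
  return providers[found] if found

  # Zero-config fallback: build the fixture on first request, under the lock.
  if key_string == "fixture"
    return DEMO_LOCK.synchronize { providers[:fixture] ||= DemoFixture.new }
  end

  raise DemoNotRegistered, "no provider registered for #{name.inspect}"
end

providers = { openai: :openai_stand_in }

p demo_lookup(providers, "openai") # => :openai_stand_in
p providers.keys                   # => [:openai]

demo_lookup(providers, "fixture")
p providers.keys                   # => [:openai, :fixture]
```

The last two lines illustrate the note on `registered_provider_names`: the implicit fixture only appears among the keys after something has asked for it.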
.register(name, provider) ⇒ Provider
Canonical one-liner: register a single provider under `name`. Overwrites any previous registration. Use configure for multi-provider blocks.
# File 'lib/parse/embeddings.rb', line 151

def register(name, provider)
  unless provider.is_a?(Provider)
    raise ArgumentError,
          "Parse::Embeddings.register: #{name.inspect} expects a Parse::Embeddings::Provider " \
          "instance (got #{provider.class})."
  end
  CONFIG_MUTEX.synchronize do
    configuration.providers[name.to_sym] = provider
  end
end
.registered_provider_names ⇒ Array<Symbol>
Names of currently-registered providers (does NOT include the implicit `:fixture` fallback unless it’s been instantiated).
# File 'lib/parse/embeddings.rb', line 201

def registered_provider_names
  configuration.providers.keys
end
.reset! ⇒ void
This method returns an undefined value.
Reset the entire registry — intended for test teardown only. Production code should never call this; use register to override a single provider.
# File 'lib/parse/embeddings.rb', line 210

def reset!
  CONFIG_MUTEX.synchronize { @configuration = nil }
end