Class: Woods::Cache::CachedEmbeddingProvider
- Inherits:
-
Object
- Object
- Woods::Cache::CachedEmbeddingProvider
- Includes:
- Embedding::Provider::Interface
- Defined in:
- lib/woods/cache/cache_middleware.rb
Overview
Decorator that wraps an embedding provider with cache-through logic.
Implements the same Embedding::Provider::Interface so it can be injected transparently in place of the real provider. On cache hit, the expensive API call (OpenAI, Ollama) is skipped entirely.
Instance Method Summary collapse
-
#dimensions ⇒ Integer
Delegate dimensions to the underlying provider.
-
#embed(text) ⇒ Array<Float>
Embed a single text, returning a cached vector when available.
-
#embed_batch(texts) ⇒ Array<Array<Float>>
Embed a batch of texts, using cached vectors for any previously seen texts.
-
#initialize(provider:, cache_store:, ttl: DEFAULT_TTLS[:embeddings]) ⇒ CachedEmbeddingProvider
constructor
A new instance of CachedEmbeddingProvider.
-
#max_input_tokens ⇒ Integer?
Delegate the per-provider input cap so Builder’s chunker / text preparer wiring keeps working when the cache wrapper is in front of the provider.
-
#model_name ⇒ String
Delegate model_name to the underlying provider.
Constructor Details
#initialize(provider:, cache_store:, ttl: DEFAULT_TTLS[:embeddings]) ⇒ CachedEmbeddingProvider
Returns a new instance of CachedEmbeddingProvider.
109 110 111 112 113 114 115 |
# File 'lib/woods/cache/cache_middleware.rb', line 109 def initialize(provider:, cache_store:, ttl: DEFAULT_TTLS[:embeddings]) @provider = provider @cache_store = cache_store @ttl = ttl @inflight = {} @inflight_mutex = Mutex.new end |
Instance Method Details
#dimensions ⇒ Integer
Delegate dimensions to the underlying provider.
158 159 160 |
# File 'lib/woods/cache/cache_middleware.rb', line 158 def dimensions @provider.dimensions end |
#embed(text) ⇒ Array<Float>
Embed a single text, returning a cached vector when available.
Shares the per-text single-flight map with #embed_batch, so concurrent ‘embed(“x”)` / `embed_batch([“x”, …])` misses for the same text all attach to the same in-flight entry and produce exactly one provider call.
125 126 127 128 129 130 |
# File 'lib/woods/cache/cache_middleware.rb', line 125 def (text) cached = @cache_store.read((text)) return cached unless cached.nil? with_single_flight(text) { @provider.(text) } end |
#embed_batch(texts) ⇒ Array<Array<Float>>
Embed a batch of texts, using cached vectors for any previously seen texts.
Only texts that are not already cached are sent to the real provider. Results are merged back in original order.
Uses per-text single-flight to prevent cache-miss stampedes: when N threads concurrently miss on the same text, exactly one calls the provider while the others attach to its InflightEntry and wait. See issue #88.
143 144 145 146 147 148 149 150 151 152 153 |
# File 'lib/woods/cache/cache_middleware.rb', line 143 def (texts) results, misses, miss_indices = partition_cached(texts) return results if misses.empty? to_fetch, to_fetch_positions, our_entries, awaiting = claim_inflight(misses) fetch_and_fulfill(to_fetch, to_fetch_positions, our_entries, results, miss_indices) await_others(awaiting, results, miss_indices) results end |
#max_input_tokens ⇒ Integer?
Delegate the per-provider input cap so Builder’s chunker / text preparer wiring keeps working when the cache wrapper is in front of the provider. Without this, ‘respond_to?(:max_input_tokens)` returns true (inherited from Interface) but the call raises NotImplementedError.
176 177 178 179 180 |
# File 'lib/woods/cache/cache_middleware.rb', line 176 def max_input_tokens return @provider.max_input_tokens if @provider.respond_to?(:max_input_tokens) nil end |
#model_name ⇒ String
Delegate model_name to the underlying provider.
165 166 167 |
# File 'lib/woods/cache/cache_middleware.rb', line 165 def model_name @provider.model_name end |