Class: Parse::Embeddings::Cache::MonetaStore

Inherits:
Object
  • Object
Defined in:
lib/parse/embeddings/cache.rb

Overview

Adapter that exposes any Moneta-compatible key/value store ([] / []=, optionally store(key, value, expires:)) through the get/set duck type that enable! expects. This is the persistent-L2 option: point it at the same Redis your Parse.cache uses, and query-embed cache entries survive process restarts and are shared across processes:

require "moneta"

moneta = Moneta.new(:Redis, url: ENV["REDIS_URL"], value_serializer: nil)
Parse::Embeddings::Cache.enable!(
  store: Parse::Embeddings::Cache::MonetaStore.new(moneta, ttl: 30 * 24 * 3600),
)

Keys are namespaced (emb: by default) so the entries are recognizable next to other application keys; values are JSON-encoded vector Arrays (see #get/#set).

SECURITY: build the Moneta store with value_serializer: nil (as above). Moneta's default value serializer is Marshal, so a cache read would Marshal.load whatever bytes are in the backing store. That is an arbitrary-code-execution primitive if the store is shared, unauthenticated, or reachable over plaintext redis:// (where a man-in-the-middle can tamper with responses), and the cache key is derived from (often user-supplied) embedded text. MonetaStore JSON-(de)serializes values itself, but that only closes the vector if Moneta is not also Marshaling on top; value_serializer: nil ensures it is not. MonetaStore emits a one-time warning if it is handed a Marshal-serializing store.

TTL is forwarded via Moneta's expires: option when the backend supports it, and ignored otherwise.
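The JSON-only value handling described above can be illustrated with a minimal sketch. The encode_vector and decode_vector methods below are hypothetical stand-ins written for this example; the real private helpers are not shown on this page and may differ. The point is that JSON parsing yields only plain data (arrays, numbers, strings, hashes), so a tampered value is rejected instead of being instantiated as an object:

```ruby
require "json"

# Hypothetical stand-ins for MonetaStore's private helpers (assumption:
# the real implementations are not shown in this page and may differ).
def encode_vector(vector)
  # Coerce every element to Float so only numbers are ever written.
  JSON.generate(vector.map { |x| Float(x) })
end

def decode_vector(raw)
  return nil if raw.nil? # nothing stored under this key

  parsed = JSON.parse(raw)
  # Accept only a flat array of numbers; anything else is treated as a miss.
  return nil unless parsed.is_a?(Array) && parsed.all? { |x| x.is_a?(Numeric) }

  parsed.map(&:to_f)
end

encoded = encode_vector([0.25, -1, 3.5])
decoded = decode_vector(encoded)

# A value that is valid JSON but not a vector is rejected as plain data.
hostile = decode_vector('{"json_class":"Object"}')
```

With Marshal in the read path, the equivalent of the last call would instead reconstruct whatever object graph the stored bytes describe, which is why value_serializer: nil matters.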

Fail-open by design: a backend error (Redis down, serialization hiccup) degrades to a cache miss / dropped write — the embed path must never fail because the CACHE is unhealthy.
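A minimal sketch of the fail-open behavior, using only the standard library. BrokenBackend and FailOpenStore are hypothetical names introduced for this illustration and are not part of the library; they mirror the rescue-and-degrade shape of #get and #set shown further down:

```ruby
# Hypothetical backend that is always unreachable (illustration only).
class BrokenBackend
  def [](_key)
    raise IOError, "connection refused"
  end

  def []=(_key, _value)
    raise IOError, "connection refused"
  end
end

# Hypothetical wrapper showing the fail-open pattern: a read error becomes
# a cache miss (nil) and a write error becomes a dropped write.
class FailOpenStore
  def initialize(backend)
    @backend = backend
  end

  def get(key)
    @backend[key]
  rescue StandardError
    nil
  end

  def set(key, value)
    @backend[key] = value
    value
  rescue StandardError
    value # the caller still receives the vector it just computed
  end
end

store = FailOpenStore.new(BrokenBackend.new)
miss = store.get("emb:abc")
written = store.set("emb:abc", [0.1, 0.2])
```

Neither call raises, so the embed path proceeds as if the cache were simply empty.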

Unlike the in-process LRU, this store is subject to a cross-process race: two processes that miss the same key concurrently both call the provider and both write. That is correct (embeddings are deterministic per key) and bounded, so no locking is attempted.

Constructor Details

#initialize(moneta, ttl: nil, namespace: "emb:") ⇒ MonetaStore

Returns a new instance of MonetaStore.

Parameters:

  • moneta (#[], #[]=)

    a Moneta store (or anything with the same indexing duck).

  • ttl (Numeric, nil) (defaults to: nil)

    per-entry lifetime in seconds, forwarded as expires: when the backend supports store(key, value, expires:). nil = no expiry.

  • namespace (String) (defaults to: "emb:")

    key prefix.



# File 'lib/parse/embeddings/cache.rb', line 130

def initialize(moneta, ttl: nil, namespace: "emb:")
  unless moneta.respond_to?(:[]) && moneta.respond_to?(:[]=)
    raise ArgumentError,
          "Parse::Embeddings::Cache::MonetaStore expects a Moneta-compatible " \
          "store responding to #[] and #[]= (got #{moneta.class})."
  end
  if marshaling_value_store?(moneta)
    warn "[Parse::Embeddings::Cache::MonetaStore] SECURITY: the supplied Moneta " \
         "store deserializes values with Marshal. A cache read Marshal.loads bytes " \
         "from the backing store, which is a remote-code-execution vector when the " \
         "store is shared/untrusted. Rebuild it with value_serializer: nil, e.g. " \
         "Moneta.new(:Redis, url: ..., value_serializer: nil)."
  end
  @moneta = moneta
  @ttl = ttl && Float(ttl)
  @namespace = namespace.to_s
end
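The two constructor checks can be tried in isolation. This is a small sketch with hypothetical helper names (indexing_duck? and normalize_ttl are not library methods); it only reproduces the respond_to? test and the ttl && Float(ttl) coercion from the listing above:

```ruby
# Hypothetical helper mirroring the constructor's duck-type check.
def indexing_duck?(store)
  store.respond_to?(:[]) && store.respond_to?(:[]=)
end

# Hypothetical helper mirroring `ttl && Float(ttl)`: nil stays nil,
# numerics and numeric strings become Float, anything else raises.
def normalize_ttl(ttl)
  ttl && Float(ttl)
end

hash_ok = indexing_duck?({})            # Hash has both #[] and #[]=
object_ok = indexing_duck?(Object.new)  # a bare Object has neither

no_expiry = normalize_ttl(nil)
thirty_days = normalize_ttl(30 * 24 * 3600)
from_string = normalize_ttl("60")

bad_ttl_raised =
  begin
    normalize_ttl("soon")
    false
  rescue ArgumentError
    true
  end
```

Because Float() is strict, a malformed ttl surfaces as an ArgumentError at construction time rather than later on the write path.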

Instance Method Details

#get(key) ⇒ Array<Float>?

Returns:

  • (Array<Float>, nil)

    the cached vector, or nil on a cache miss or backend error.

# File 'lib/parse/embeddings/cache.rb', line 149

def get(key)
  decode_vector(@moneta[@namespace + key])
rescue StandardError
  nil
end

#set(key, vector) ⇒ Array<Float>

Returns the vector, unchanged.

Returns:

  • (Array<Float>)

    the vector, unchanged.



# File 'lib/parse/embeddings/cache.rb', line 156

def set(key, vector)
  k = @namespace + key
  encoded = encode_vector(vector)
  if @ttl && @moneta.respond_to?(:store)
    begin
      @moneta.store(k, encoded, expires: @ttl)
    rescue ArgumentError
      # Hash-like backends define #store(key, value) with no
      # options arg, so the expires: form raises ArgumentError.
      # Fall back to a plain write (no expiry) rather than letting
      # the fail-open rescue below silently drop every vector.
      @moneta[k] = encoded
    end
  else
    @moneta[k] = encoded
  end
  vector
rescue StandardError
  vector
end
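The TTL fallback in #set can be exercised without Moneta. In this sketch, write_with_optional_ttl is a hypothetical free function that copies the branching above, and TtlRecordingStore is a hypothetical backend that accepts expires:. A plain Hash stands in for a Hash-like backend whose #store takes exactly two arguments:

```ruby
require "json"

# Hypothetical copy of the write path in #set, extracted for illustration.
def write_with_optional_ttl(store, key, value, ttl)
  if ttl && store.respond_to?(:store)
    begin
      store.store(key, value, expires: ttl)
    rescue ArgumentError
      # Hash#store(key, value) takes no options, so fall back to a plain write.
      store[key] = value
    end
  else
    store[key] = value
  end
  value
end

# Hypothetical TTL-aware backend that records the expiry it was given.
class TtlRecordingStore
  attr_reader :data, :expiries

  def initialize
    @data = {}
    @expiries = {}
  end

  def [](key)
    @data[key]
  end

  def []=(key, value)
    @data[key] = value
  end

  def store(key, value, expires: nil)
    @data[key] = value
    @expiries[key] = expires
    value
  end
end

payload = JSON.generate([0.1, 0.2])

plain = {}
write_with_optional_ttl(plain, "emb:abc", payload, 60.0)

ttl_aware = TtlRecordingStore.new
write_with_optional_ttl(ttl_aware, "emb:abc", payload, 60.0)
```

The Hash ends up holding the value with no expiry, while the TTL-aware store receives expires: 60.0. Without the inner rescue, the outer fail-open rescue would swallow the ArgumentError and every write to a Hash-like backend would be silently dropped.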