Module: Parse::Core::EmbedManaged
- Included in:
- Object
- Defined in:
- lib/parse/model/core/embed_managed.rb
Overview
Class-level embed macro for :vector properties.
Lets a model declare which scalar fields feed into a managed embedding, and arranges for that embedding to be computed automatically on save whenever the source fields change.
== Mechanics
The class macro:
- Validates that into: names a declared :vector property with provider: metadata.
- Auto-declares a <into>_digest :string sibling property (override with digest_field:).
- Registers a before_save callback that re-computes the embedding whenever the SHA-256 of the concatenated source fields differs from the stored digest. On first save the digest is blank and the embedding is always populated. On a save where no source field changed the digest matches and the callback is a no-op (zero provider calls).
- Prepends a guard module that raises ProtectedFieldError on direct body_embedding= assignment from user code. The guard lifts only inside the managed write path (the before_save callback itself).
Provider calls flow through Embeddings.provider — the provider is resolved by name at save time, so registering a provider can happen any time before the first save. Declaration never makes a network call.
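The digest check and the late provider lookup described above can be illustrated with a standalone sketch. This is plain Ruby, not the SDK's implementation: Note, CountingProvider, PROVIDERS, and refresh_embedding! are hypothetical names invented for the example, and the join here omits the blank-skipping rule for brevity.

```ruby
require "digest"

# Hypothetical registry standing in for provider lookup by name.
PROVIDERS = {}

# Fake provider that counts calls and returns a one-element toy vector.
class CountingProvider
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def embed(text)
    @calls += 1
    [text.length.to_f]
  end
end

# Toy record; a plain Ruby object, not a Parse::Object.
class Note
  SOURCES = %i[title body].freeze
  attr_accessor :title, :body, :body_embedding, :body_embedding_digest

  def initialize(title:, body:)
    @title = title
    @body = body
  end

  # Stand-in for the before_save step: recompute only on a digest miss.
  def refresh_embedding!(provider_name)
    text = SOURCES.map { |f| public_send(f).to_s }.join("\n\n")
    digest = Digest::SHA256.hexdigest(text)
    return false if digest == body_embedding_digest # unchanged: zero provider calls

    provider = PROVIDERS.fetch(provider_name) # resolved by name only at this point
    self.body_embedding = provider.embed(text)
    self.body_embedding_digest = digest
    true
  end
end

note = Note.new(title: "Hello", body: "World") # built before any provider exists
PROVIDERS[:fake] = CountingProvider.new        # registered later, before first refresh
note.refresh_embedding!(:fake) # first run: digest blank, embeds
note.refresh_embedding!(:fake) # sources unchanged: no-op
note.body = "World!"
note.refresh_embedding!(:fake) # source changed: embeds again
```

In this sketch the provider is called twice across the three refreshes, because the second refresh sees a matching digest and returns early.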
== Single vector per record
embed produces exactly one vector per record. All declared
source fields are concatenated (joined with "\n\n", blank values
skipped) and sent to the provider as a single string. This
directive is one-vector-per-record by design: long source text
whose concatenation exceeds the provider's per-call token budget
is truncated provider-side, and the stored vector represents only
the leading portion of the document.
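As a rough illustration of the join rule above, the following sketch builds the single provider input from several field values. The helper name embed_input is hypothetical, and treating nil and whitespace-only strings as "blank" is an assumption; the SDK's own blank test may differ.

```ruby
# Hypothetical helper mirroring the documented join rule:
# blank values are skipped, the rest are joined with a blank line.
def embed_input(*values)
  values.map(&:to_s)
        .reject { |v| v.strip.empty? } # assumption: nil and whitespace-only count as blank
        .join("\n\n")
end

text = embed_input("Release notes", nil, "   ", "Fixed the login bug.")
puts text
```

Because only one string is sent, anything past the provider's token budget is simply not reflected in the stored vector.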
Chunking happens at RETRIEVAL time, not embed time. As of v5.2 the
SDK ships Retrieval.retrieve and the semantic_search
agent tool, which fetch the top-k whole records and split each
record's text field into overlapping chunks for presentation
(every chunk inherits its parent record's single score). That is
presentation chunking — it does not change how embeddings are
computed here.
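Presentation chunking of this kind can be sketched as follows. This is not the SDK's Retrieval.retrieve code: the Chunk struct, the hash shape of a hit, the character-based window, and the helper names are all assumptions made for the illustration.

```ruby
# Hypothetical result shape; not the SDK's return type.
Chunk = Struct.new(:record_id, :text, :score)

# Split text into character windows of `size` that overlap by `overlap`.
def overlapping_chunks(text, size:, overlap:)
  raise ArgumentError, "overlap must be smaller than size" unless overlap < size

  step = size - overlap
  chunks = []
  start = 0
  while start < text.length
    chunks << text[start, size]
    break if start + size >= text.length # last window reached the end

    start += step
  end
  chunks
end

# Every chunk inherits the single score of its parent record.
def present_hits(hits, size:, overlap:)
  hits.flat_map do |hit|
    overlapping_chunks(hit[:text], size: size, overlap: overlap).map do |piece|
      Chunk.new(hit[:id], piece, hit[:score])
    end
  end
end

hits = [{ id: "r1", text: "abcdefghij", score: 0.9 }]
present_hits(hits, size: 4, overlap: 2).each { |c| puts "#{c.record_id} #{c.score} #{c.text}" }
```

The score on each chunk says nothing about that particular passage; it is the whole record's score repeated.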
If you instead want each passage to have its OWN embedding (true embed-time chunking), keep one of these patterns:
- Pre-chunk client-side and write each chunk as its own Parse::Object record with its own embed declaration.
- Maintain a dedicated chunk subclass that belongs_to the parent record, with embed :content, into: :embedding on the chunk class itself.
Defined Under Namespace
Modules: ClassMethods
Classes: EmbedDirective, InvalidEmbedDeclaration, ProtectedFieldError
Constant Summary
- WRITER_KEY = :parse_embed_managed_writer
Internal: name of the Thread-local key under which the managed writer marks the symbol of the field it is currently writing. The guard module's setter checks this key to permit a single field write; the guard is otherwise closed.
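The guard mechanism can be pictured with a minimal standalone sketch. Everything here is illustrative rather than the module's actual code: ProtectedFieldError is redefined locally as a stand-in, and ManagedWriteGuard, Article, and write_managed are hypothetical names.

```ruby
# Local stand-in for the library's error class (assumption, defined here for the sketch).
ProtectedFieldError = Class.new(StandardError)
WRITER_KEY = :parse_embed_managed_writer

# Prepended so its setter runs before the class's own writer.
module ManagedWriteGuard
  def body_embedding=(value)
    unless Thread.current[WRITER_KEY] == :body_embedding
      raise ProtectedFieldError, "body_embedding is managed; change the source fields instead"
    end
    super
  end
end

class Article
  attr_accessor :body_embedding
  prepend ManagedWriteGuard

  # Hypothetical managed write path: open the guard for one field, then restore it.
  def write_managed(field, value)
    previous = Thread.current[WRITER_KEY]
    Thread.current[WRITER_KEY] = field
    public_send("#{field}=", value)
  ensure
    Thread.current[WRITER_KEY] = previous
  end
end

article = Article.new
article.write_managed(:body_embedding, [0.1, 0.2]) # allowed: key names this field

begin
  article.body_embedding = [9.9] # direct assignment from user code
rescue ProtectedFieldError => e
  puts "blocked: #{e.message}"
end
```

Restoring the previous value in ensure keeps the guard closed even if the write raises.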
Instance Method Summary
- #compute_embedding!(field: nil) ⇒ self
Recompute this record's managed embedding(s) in-place, NOW, without a save.
Instance Method Details
#compute_embedding!(field: nil) ⇒ self
Recompute this record's managed embedding(s) in-place, NOW,
without a save. Runs the same digest-tracked recompute the
before_save callback runs: a provider call happens only when the
source text/URL changed since the last embed (digest miss). Useful
to populate the vector before inspecting it, or to force a refresh
in a console.
# File 'lib/parse/model/core/embed_managed.rb', line 143

  def compute_embedding!(field: nil)
    directives = self.class. # accessor name missing from the extracted source
    if directives.empty?
      raise ArgumentError, "#{self.class}#compute_embedding!: no `embed` directives declared."
    end

    # Narrow to one directive when a field is given; otherwise process all of them.
    selected =
      if field
        d = directives[field.to_sym]
        unless d
          raise ArgumentError,
                "#{self.class}#compute_embedding!: :#{field} is not an embed target " \
                "(have #{directives.keys.inspect})."
        end
        [d]
      else
        directives.values
      end

    # Method name after `EmbedManaged.` is missing from the extracted source.
    selected.each { |directive| Parse::Core::EmbedManaged.(self, directive) }
    self
  end