Module: Parse::Core::EmbedManaged

Included in:
Object
Defined in:
lib/parse/model/core/embed_managed.rb

Overview

Class-level embed macro for :vector properties.

Lets a model declare which scalar fields feed into a managed embedding, and arranges for that embedding to be computed automatically on save whenever the source fields change.

== Mechanics

The class macro:

  1. Validates that into: names a declared :vector property with provider: metadata.
  2. Auto-declares a <into>_digest :string sibling property (override with digest_field:).
  3. Registers a before_save callback that re-computes the embedding whenever the SHA-256 of the concatenated source fields differs from the stored digest. On first save the digest is blank and the embedding is always populated. On a save where no source field changed the digest matches and the callback is a no-op (zero provider calls).
  4. Prepends a guard module that raises ProtectedFieldError when user code assigns the target field directly through its <into>= setter (for example, body_embedding=). The guard lifts only inside the managed write path (the before_save callback itself).

Provider calls flow through Embeddings.provider — the provider is resolved by name at save time, so registering a provider can happen any time before the first save. Declaration never makes a network call.
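
The digest gate in step 3 can be sketched in plain Ruby, without the SDK. This is an illustration under stated assumptions, not the module's implementation: CountingProvider, Record, and refresh_embedding are hypothetical names, and the blank-value handling of the real concatenation is omitted for brevity.

```ruby
require "digest"

# Hypothetical stand-in for a real embedding provider; it only counts calls.
class CountingProvider
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def embed(text)
    @calls += 1
    [text.length.to_f] # placeholder vector, not a real embedding
  end
end

# Plain-Ruby record holding the source fields, the vector, and the digest sibling.
Record = Struct.new(:title, :body, :body_embedding, :body_embedding_digest,
                    keyword_init: true)

# Recompute the vector only when the SHA-256 of the source text differs
# from the stored digest. Returns true when the provider was called.
def refresh_embedding(record, provider)
  text = [record.title, record.body].join("\n\n")
  digest = Digest::SHA256.hexdigest(text)
  return false if digest == record.body_embedding_digest

  record.body_embedding = provider.embed(text)
  record.body_embedding_digest = digest
  true
end

provider = CountingProvider.new
record = Record.new(title: "hello", body: "world")

first  = refresh_embedding(record, provider) # blank digest: provider called
second = refresh_embedding(record, provider) # digest matches: no-op
record.body = "world, revised"
third  = refresh_embedding(record, provider) # source changed: provider called
```

Storing the digest next to the vector means the comparison needs only the record itself, so an unchanged save costs one hash and no provider call.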

== Single vector per record

embed produces exactly one vector per record. All declared source fields are concatenated (joined with "\n\n", blank values skipped) and sent to the provider as a single string. This is by design: when the concatenated text exceeds the provider's per-call token budget, it is truncated provider-side, and the stored vector represents only the leading portion of the document.
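
The concatenation rule can be sketched as follows. The helper name embedding_source_text is hypothetical, and treating nil, empty, and whitespace-only values as "blank" is an assumption about what the module skips.

```ruby
# Join source values with "\n\n", skipping blank ones.
# Assumption: "blank" means nil, empty, or whitespace-only.
def embedding_source_text(values)
  values
    .reject { |value| value.nil? || value.to_s.strip.empty? }
    .map(&:to_s)
    .join("\n\n")
end

text = embedding_source_text(["Quarterly report", nil, "   ", "Revenue grew."])
```

Skipping blank values keeps an unset field from adding an empty paragraph to the string sent to the provider.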

Chunking happens at RETRIEVAL time, not embed time. As of v5.2 the SDK ships Retrieval.retrieve and the semantic_search agent tool, which fetch the top-k whole records and split each record's text field into overlapping chunks for presentation (every chunk inherits its parent record's single score). That is presentation chunking — it does not change how embeddings are computed here.
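
A minimal sketch of presentation chunking, to show why every chunk carries the same score. It does not use Retrieval.retrieve; the helper names, the hash shape of a hit, and the character-based windows are all assumptions made for illustration.

```ruby
# Split text into windows of `size` characters, each overlapping the
# previous window by `overlap` characters.
def overlapping_chunks(text, size:, overlap:)
  unless overlap >= 0 && overlap < size
    raise ArgumentError, "overlap must be >= 0 and smaller than size"
  end
  return [] if text.empty?

  step = size - overlap
  chunks = []
  start = 0
  loop do
    chunks << text[start, size]
    break if start + size >= text.length

    start += step
  end
  chunks
end

# Attach the parent record's single score to every chunk of its text.
def present_chunks(hit, size:, overlap:)
  overlapping_chunks(hit[:text], size: size, overlap: overlap).map do |chunk|
    { record_id: hit[:id], score: hit[:score], chunk: chunk }
  end
end

hit = { id: "doc1", score: 0.83, text: "abcdefghij" }
rows = present_chunks(hit, size: 4, overlap: 2)
```

The score is computed once per record from its single stored vector, so splitting the text afterwards cannot rank one passage above another within the same record.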

If you instead want each passage to have its OWN embedding (true embed-time chunking), use one of these patterns:

  1. Pre-chunk client-side and write each chunk as its own Parse::Object record with its own embed declaration.
  2. Maintain a dedicated chunk subclass that belongs_to the parent record, with embed :content, into: :embedding on the chunk class itself.

Examples:

class Document < Parse::Object
  property :title, :string
  property :body,  :string
  property :body_embedding, :vector, dimensions: 1536, provider: :openai
  embed :title, :body, into: :body_embedding
end

doc = Document.new(title: "hello", body: "world")
doc.save   # provider :openai is called once; body_embedding populated

Defined Under Namespace

Modules: ClassMethods
Classes: EmbedDirective, InvalidEmbedDeclaration, ProtectedFieldError

Constant Summary

WRITER_KEY =

Internal: name of the Thread-local key under which the managed writer marks the symbol of the field it is currently writing. The guard module's setter checks this key to permit a single field write; the guard is otherwise closed.

:parse_embed_managed_writer
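
The guard technique described for WRITER_KEY can be sketched in plain Ruby. Every name here (GUARD_KEY, ProtectedWriteError, EmbeddingGuard, Doc, managed_write) is hypothetical; only the idea of a prepended setter that checks a Thread-local key comes from the description above.

```ruby
# Hypothetical key; the real module uses WRITER_KEY.
GUARD_KEY = :sketch_managed_writer

class ProtectedWriteError < StandardError; end

# Prepended so this setter runs before the class's own setter.
module EmbeddingGuard
  def body_embedding=(value)
    unless Thread.current[GUARD_KEY] == :body_embedding
      raise ProtectedWriteError, "body_embedding is managed; edit its source fields instead"
    end
    super
  end
end

class Doc
  attr_accessor :body_embedding
  prepend EmbeddingGuard

  # Managed write path: open the guard for one field, then always close it.
  def managed_write(vector)
    previous = Thread.current[GUARD_KEY]
    Thread.current[GUARD_KEY] = :body_embedding
    self.body_embedding = vector
  ensure
    Thread.current[GUARD_KEY] = previous
  end
end

doc = Doc.new
blocked =
  begin
    doc.body_embedding = [0.1, 0.2] # direct assignment from user code
    false
  rescue ProtectedWriteError
    true
  end
doc.managed_write([0.1, 0.2])
```

Restoring the key in an ensure clause closes the guard even when the write raises, so a failed managed write cannot leave the field open to direct assignment.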

Instance Method Summary

Instance Method Details

#compute_embedding!(field: nil) ⇒ self

Recompute this record's managed embedding(s) in-place, NOW, without a save. Runs the same digest-tracked recompute the before_save callback runs: a provider call happens only when the source text/URL changed since the last embed (digest miss). Useful to populate the vector before inspecting it, or to force a refresh in a console.

Parameters:

  • field (Symbol, nil) (defaults to: nil)

    limit to one embed target; nil recomputes every declared directive.

Returns:

  • (self)

Raises:

  • (ArgumentError)

    when field: names no embed target, or the class declares no embed directives.



# File 'lib/parse/model/core/embed_managed.rb', line 143

def compute_embedding!(field: nil)
  directives = self.class.embed_directives
  if directives.empty?
    raise ArgumentError, "#{self.class}#compute_embedding!: no `embed` directives declared."
  end
  selected =
    if field
      d = directives[field.to_sym]
      unless d
        raise ArgumentError,
              "#{self.class}#compute_embedding!: :#{field} is not an embed target " \
              "(have #{directives.keys.inspect})."
      end
      [d]
    else
      directives.values
    end
  selected.each { |directive| Parse::Core::EmbedManaged.recompute_embedding!(self, directive) }
  self
end