Module: Parse::Core::VectorSearchable

Included in:
Object
Defined in:
lib/parse/model/core/vector_searchable.rb

Overview

Class-level `find_similar` wrapper around VectorSearch.search for any Parse::Object subclass that has declared at least one `:vector` property.

The wrapper handles three things the low-level entry point doesn’t:

  1. **Field resolution.** Defaults to the subclass’s single `:vector` property; raises if the class has none, and requires an explicit `field:` if it has more than one.

  2. **Declared-dimension validation.** Compares the query vector’s length against the `dimensions:` declared on the property, so callers get “expected 1536, got 768” instead of an Atlas-side error after a round-trip.

  3. **Index auto-discovery.** Looks up the Atlas vectorSearch index covering the field via AtlasSearch::IndexManager.find_vector_index when no explicit `index:` kwarg is given.
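
The first two steps above can be sketched in plain Ruby. This is a minimal illustration, not the module's implementation: `resolve_field`, `check_dimensions!`, the error messages, and the shape of the `props` hash are all assumptions made for the example, and the two error classes are local stand-ins for the ones namespaced under this module.

```ruby
# Local stand-ins (assumption): the real classes live under
# Parse::Core::VectorSearchable and may differ in hierarchy and message.
class NoVectorProperty < StandardError; end
class AmbiguousVectorField < StandardError; end

# Hypothetical helper. props maps each declared :vector property name
# to its metadata, e.g. { body_embedding: { dimensions: 1536 } }.
def resolve_field(props, field = nil)
  raise NoVectorProperty, "no :vector property declared" if props.empty?

  if field.nil?
    # Exactly one candidate: use it. More than one: the caller must choose.
    return props.keys.first if props.size == 1
    raise AmbiguousVectorField, "pass field: (one of #{props.keys.inspect})"
  end

  key = field.to_sym
  raise ArgumentError, "#{key.inspect} is not a :vector property" unless props.key?(key)
  key
end

# Hypothetical helper. Fails locally, before any round-trip, when the
# query vector's length does not match the declared dimensions.
def check_dimensions!(vector, expected)
  return if expected.nil? || vector.length == expected
  raise ArgumentError, "expected #{expected}, got #{vector.length}"
end
```

Checking the length on the client keeps the error message specific and avoids spending a network round-trip on a query that cannot succeed.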

ACL/CLP enforcement is inherited from VectorSearch.search (which routes through MongoDB; REST `/aggregate` is master-key-only and bypasses ACL/CLP, see CLAUDE.md). The full scope-kwarg surface (`session_token:`, `master:`, `acl_user:`, `acl_role:`) is forwarded as-is.

Examples:

default field, default index

WikiArticle.find_similar(vector: query_embedding, k: 5)

explicit field + post-filter, scoped to a session

Document.find_similar(
  vector: embed.call("ruby parse"),
  field: :body_embedding,
  k: 10,
  filter: { tag: "ruby" },
  session_token: user.session_token,
)

Defined Under Namespace

Classes: AmbiguousVectorField, EmbedderNotConfigured, IndexNotResolved, NoVectorProperty

Instance Method Summary

Instance Method Details

#find_similar(vector: nil, text: nil, k: 10, field: nil, filter: nil, vector_filter: nil, index: nil, num_candidates: nil, max_time_ms: nil, raw: false, **scope_opts) ⇒ Array<Parse::Object>, Array<Hash>

Note:

When `text:` is given, the text is sent over the wire to the embedding provider (e.g. OpenAI). Operators that enable global Faraday request logging on the embedding connection will capture the full query text in the JSON request body. Treat `text:` as user-visible content for log-handling purposes.

Note:

The provider is responsible for bounding its own request timeout. Embeddings::OpenAI self-bounds at 30 s read / 5 s connect with capped retries. Custom providers MUST self-bound: `find_similar` does not impose a wall-clock deadline on the embed step.
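
One way a custom provider could self-bound is sketched below, using only Ruby's standard `Timeout` module. The class name `BoundedEmbedder`, the method name `embed_text`, and its signature are assumptions for illustration; they are not the provider contract expected by Embeddings.provider.

```ruby
require "timeout"

# Hypothetical wrapper (assumption): the method name and signature are
# illustrative and do not reflect the real embedding-provider interface.
class BoundedEmbedder
  # deadline_s: wall-clock budget in seconds for one embed call.
  # backend:    block that takes (texts, input_type) and returns vectors.
  def initialize(deadline_s:, &backend)
    @deadline_s = deadline_s
    @backend = backend
  end

  # Raises Timeout::Error if the backend does not return in time.
  def embed_text(texts, input_type: :search_query)
    Timeout.timeout(@deadline_s) { @backend.call(texts, input_type) }
  end
end
```

`Timeout.timeout` is a blunt instrument because it interrupts the running thread; for HTTP-backed providers, setting connect and read timeouts on the HTTP client itself is usually the safer choice, with a wrapper like this only as a backstop.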

Find documents whose declared `:vector` property is closest to `vector:` under the Atlas vectorSearch index’s similarity function.

Parameters:

  • vector (Array<Float>, Parse::Vector, nil) (defaults to: nil)

    the query embedding. Mutually exclusive with `text:`; exactly one of the two must be given.

  • text (String, nil) (defaults to: nil)

    natural-language query. When given, the resolved field’s declared `provider:` is looked up via Embeddings.provider, used to embed `[text]` with `input_type: :search_query`, and the resulting vector is used in place of `vector:`. Requires the property to have been declared with `provider:` metadata.

  • k (Integer) (defaults to: 10)

    number of hits to return. Default 10.

  • field (Symbol, String, nil) (defaults to: nil)

    the `:vector` property to search. Auto-resolves when the class has exactly one `:vector` property.

  • filter (Hash, nil) (defaults to: nil)

    post-`$vectorSearch` `$match` filter.

  • vector_filter (Hash, nil) (defaults to: nil)

    Atlas-native pre-search filter (fields must be declared `type: "filter"` in the index).

  • index (String, nil) (defaults to: nil)

    explicit vectorSearch index name. Skips auto-discovery when given.

  • num_candidates (Integer, nil) (defaults to: nil)

    HNSW search width.

  • max_time_ms (Integer, nil) (defaults to: nil)

    server-side timeout.

  • raw (Boolean) (defaults to: false)

    when true, return the raw Mongo documents (each enriched with `_vscore`); when false (default), build instances of the calling class and attach `vector_score`.

  • scope_opts (Hash)

    ACL/CLP scope kwargs forwarded to VectorSearch.search: `session_token:`, `master:`, `acl_user:`, `acl_role:`.
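
To make the difference between `vector_filter:` and `filter:` concrete, the sketch below builds an aggregation pipeline of the general shape an Atlas `$vectorSearch` query takes. `sketch_pipeline` is a hypothetical helper, the `k * 10` fallback for `numCandidates` is an assumed default, and the actual pipeline assembled by VectorSearch.search may differ.

```ruby
# Hypothetical helper (assumption): illustrates stage placement only;
# it is not the builder used by Parse::VectorSearch.search.
def sketch_pipeline(index:, field:, query_vector:, k: 10, num_candidates: nil,
                    vector_filter: nil, filter: nil)
  stage = {
    "index" => index,
    "path" => field.to_s,
    "queryVector" => query_vector,
    "numCandidates" => num_candidates || k * 10, # assumed fallback
    "limit" => k,
  }
  # Pre-search filter: lives inside the $vectorSearch stage and only
  # works on fields indexed with type "filter".
  stage["filter"] = vector_filter if vector_filter

  pipeline = [{ "$vectorSearch" => stage }]
  # Surface the similarity score on each document.
  pipeline << { "$addFields" => { "_vscore" => { "$meta" => "vectorSearchScore" } } }
  # Post-search filter: an ordinary $match applied to the k hits.
  pipeline << { "$match" => filter } if filter
  pipeline
end
```

Because a post-search `$match` runs after the `limit` inside `$vectorSearch`, it can only remove hits, so a call with `filter:` may return fewer than `k` results; a pre-search `vector_filter:` narrows the candidates before the limit is applied.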

Returns:

  • (Array<Parse::Object>, Array<Hash>)

    hits in descending-similarity order. Each instance responds to `vector_score` (the Atlas `vectorSearchScore`).

Raises:

  • (ArgumentError)

    if neither `vector:` nor `text:` is given, or if both are given.

# File 'lib/parse/model/core/vector_searchable.rb', line 129

def find_similar(vector: nil, text: nil, k: 10, field: nil, filter: nil,
                 vector_filter: nil, index: nil,
                 num_candidates: nil, max_time_ms: nil, raw: false,
                 **scope_opts)
  if vector.nil? && text.nil?
    raise ArgumentError,
          "#{self}.find_similar: must pass either `vector:` or `text:`."
  end
  if !vector.nil? && !text.nil?
    raise ArgumentError,
          "#{self}.find_similar: pass either `vector:` or `text:`, not both."
  end

  resolved_field = resolve_vector_field!(field)
  declared_dims = vector_properties.dig(resolved_field, :dimensions)

  query_vector =
    if text.nil?
      coerce_query_vector(vector)
    else
      embed_query_text!(text, resolved_field)
    end
  Parse::VectorSearch.validate_query_vector!(query_vector, dimensions: declared_dims)

  index_name = resolve_vector_index!(resolved_field, index)

  raw_hits = Parse::VectorSearch.search(
    parse_class,
    field: resolved_field,
    query_vector: query_vector,
    k: k,
    num_candidates: num_candidates,
    filter: filter,
    vector_filter: vector_filter,
    index: index_name,
    max_time_ms: max_time_ms,
    **scope_opts,
  )

  return raw_hits if raw
  build_vector_hits(raw_hits)
end