Module: Parse::Retrieval::AgentTool

Defined in: lib/parse/retrieval/agent_tool.rb

Overview

The semantic_search agent tool: the agent-aware wrapper around retrieve. It applies the agent security envelope that retrieve (a model-layer method) is deliberately kept free of:

Class allowlist via Agent::MetadataRegistry.resolve_searchable! (agent_searchable opt-in, hidden-class refusal, tenant-scope gate).
Recursive underscore-key refusal + filter-field allowlist on caller-supplied filter: / vector_filter:.
Tenant scope merged into the Atlas pre-filter AND re-asserted on every returned source record (NEW-TOOLS-3 guard).
field_allowlist projection of each source record on the way out.
Score quantization in non-admin contexts.
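
The two filter checks in the list above can be sketched in plain Ruby. This is a minimal illustration, not the library's implementation: `FilterRefused`, `refuse_underscore_keys!`, `refuse_unlisted_fields!`, and `check_filter!` are hypothetical names, and the real helpers may raise different error classes or treat operators differently.

```ruby
# Hypothetical error class used only by this sketch.
class FilterRefused < StandardError; end

# Walk hashes and arrays at any depth and refuse any key starting with "_".
def refuse_underscore_keys!(node, path = [])
  case node
  when Hash
    node.each do |key, value|
      if key.to_s.start_with?("_")
        raise FilterRefused, "reserved key: #{(path + [key]).join('.')}"
      end
      refuse_underscore_keys!(value, path + [key])
    end
  when Array
    node.each_with_index { |item, i| refuse_underscore_keys!(item, path + [i]) }
  end
  node
end

# Only top-level keys are compared against the per-class allowlist.
def refuse_unlisted_fields!(filter, allowed)
  return filter if filter.nil?
  extra = filter.keys.map(&:to_s) - allowed
  raise FilterRefused, "fields not allowlisted: #{extra.join(', ')}" unless extra.empty?
  filter
end

# Run both checks in the order the overview lists them.
def check_filter!(filter, allowed)
  refuse_underscore_keys!(filter) unless filter.nil?
  refuse_unlisted_fields!(filter, allowed)
end
```

Under these assumptions, `check_filter!({ "category" => "news" }, %w[category status])` returns the filter unchanged, while a filter that hides `_rperm` inside a nested array or names a field outside the allowlist raises `FilterRefused`.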

ACLs are enforced on the direct MongoDB path inside find_similar, using the agent's acl_scope_kwargs (session_token: / acl_user: / acl_role: / master:). This is why the tool is registered with client_safe: true: a session-token client is routed through the one path that has first-class SDK-side _rperm enforcement.

Constant Summary

MAX_K = Upper bound on k (mirrors the registered parameter schema).

DEFAULT_K = Default neighbour count for the agent tool. Intentionally lower than Parse::Retrieval.retrieve's library default of 10: an LLM tool result is paid for in context tokens, so the agent surface defaults conservatively. Callers/LLMs can raise it up to MAX_K per call.
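
As a rough sketch of how a requested k might be bounded, the snippet below falls back to a default and clamps to a maximum. The numeric values and the `example_clamp_k` name are placeholders chosen for illustration; the actual MAX_K and DEFAULT_K values and the behaviour of the library's clamp_k on invalid input are not shown in this page.

```ruby
# Placeholder values for illustration only, not the library's constants.
EXAMPLE_DEFAULT_K = 5
EXAMPLE_MAX_K = 20

# Coerce the requested k to an Integer, fall back to the default when it
# cannot be coerced, then clamp into the range 1..EXAMPLE_MAX_K.
def example_clamp_k(k)
  value = Integer(k, exception: false) || EXAMPLE_DEFAULT_K
  value.clamp(1, EXAMPLE_MAX_K)
end
```

With these placeholder bounds, an LLM that asks for 1000 neighbours gets 20, and a missing k falls back to 5.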

DEFAULT_MAX_TOTAL_TOKENS = Default ceiling on total returned chunk-content tokens (estimated as chars/4). The retrieve count caps (k * max_chunks_per_document) bound the NUMBER of chunks but not their total size, so a few long documents could silently blow the context window. This budget trims the (score-ordered) chunk list and reports budget_truncated so the truncation is never silent. Pass max_total_tokens: 0 to disable.

20_000
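
The budget described above can be sketched as a single pass over the score-ordered chunks. This is an assumption-laden illustration: `estimate_tokens` and `trim_to_budget` are hypothetical names, rounding up in the chars/4 estimate is a guess, and the sketch stops at the first chunk that would overflow the budget, which may differ from how the library's apply_token_budget behaves.

```ruby
# Estimate tokens as chars/4, rounded up (the rounding rule is an assumption).
def estimate_tokens(text)
  (text.to_s.length / 4.0).ceil
end

# chunks: score-ordered Array of Hashes with a :content key.
# budget: Integer ceiling on estimated tokens; nil or 0 disables trimming here.
# Returns [kept_chunks, dropped_count].
def trim_to_budget(chunks, budget)
  return [chunks, 0] if budget.nil? || budget <= 0

  kept = []
  used = 0
  chunks.each do |chunk|
    cost = estimate_tokens(chunk[:content])
    break if used + cost > budget # everything ranked lower is dropped too
    kept << chunk
    used += cost
  end
  [kept, chunks.length - kept.length]
end
```

A non-zero dropped count is what a caller would surface as budget_truncated and budget_dropped, so the truncation is visible rather than silent.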

PARAMETERS = JSON Schema for the registered tool's parameters.

{
  "type" => "object",
  "properties" => {
    "class_name"    => { "type" => "string", "description" => "Parse class name (must be agent_searchable)." },
    "query"         => { "type" => "string", "description" => "Natural-language query." },
    "k"             => { "type" => "integer", "default" => DEFAULT_K, "minimum" => 1, "maximum" => MAX_K },
    "filter"        => { "type" => "object", "description" => "Post-search field filter (allowlisted fields only)." },
    "vector_filter" => { "type" => "object", "description" => "Atlas pre-search filter (allowlisted fields only)." },
    "text_field"    => { "type" => "string", "description" => "Which embedded text source to chunk and return as content. Required only when the class embeds more than one text field; must name one of those sources." },
    "chunk_size"    => { "type" => "integer", "description" => "Override chunk window size." },
    "chunk_overlap" => { "type" => "integer", "description" => "Override chunk overlap." },
    "chunk_by"      => { "type" => "string", "enum" => %w[chars tokens], "description" => "Chunk unit." },
    "max_chunks_per_document" => { "type" => "integer", "minimum" => 1, "description" => "Cap on chunks emitted per matched document." },
    "max_total_tokens" => { "type" => "integer", "minimum" => 0, "description" => "Ceiling on total returned chunk-content tokens (approx chars/4). Trims lowest-ranked chunks first and sets budget_truncated. 0 disables." },
  },
  "required" => %w[class_name query],
}.freeze

OUTPUT_SCHEMA = MCP outputSchema, mirrored as structuredContent on results. The parent record of each chunk is hoisted into documents (keyed by objectId) rather than duplicated inline on every chunk; to map a chunk to its source record, look up metadata.object_id in documents.

{
  "type" => "object",
  "properties" => {
    "chunks" => {
      "type" => "array",
      "items" => {
        "type" => "object",
        "properties" => {
          "id"      => { "type" => "string" },
          "score"   => { "type" => %w[number null] },
          "content" => { "type" => "string" },
          "metadata" => { "type" => "object" },
        },
      },
    },
    "documents" => {
      "type" => "object",
      "description" => "objectId => projected source record (sent once per matched document).",
    },
    "count" => { "type" => "integer" },
    "budget_truncated" => { "type" => "boolean", "description" => "Present when the token budget dropped lowest-ranked chunks." },
    "budget_dropped" => { "type" => "integer", "description" => "Number of chunks dropped by the token budget." },
  },
}.freeze
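
A consumer that wants the source record next to each chunk can re-join the two parts of the result. The sketch below uses a hand-written result shaped like OUTPUT_SCHEMA; every value in it is invented, `sample_result` and `chunks_with_sources` are hypothetical helper names, and the use of symbol keys for the envelope with string objectId keys in documents is an assumption about how the result reaches the caller.

```ruby
# A made-up result shaped like OUTPUT_SCHEMA (all values are invented).
def sample_result
  {
    chunks: [
      { id: "c1", score: 0.9, content: "First passage",  metadata: { object_id: "abc123" } },
      { id: "c2", score: 0.8, content: "Second passage", metadata: { object_id: "abc123" } },
      { id: "c3", score: 0.7, content: "Other document", metadata: { object_id: "xyz789" } },
    ],
    documents: {
      "abc123" => { "objectId" => "abc123", "title" => "Guide" },
      "xyz789" => { "objectId" => "xyz789", "title" => "FAQ" },
    },
    count: 3,
  }
end

# Attach each chunk's parent record by looking up metadata.object_id
# in the documents map. Chunks without a match get source: nil.
def chunks_with_sources(result)
  result[:chunks].map do |chunk|
    oid = chunk.dig(:metadata, :object_id)
    chunk.merge(source: result[:documents][oid])
  end
end
```

Both chunks of document abc123 resolve to the same hoisted record, which is the duplication the documents map is meant to avoid on the wire.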

Class Method Summary

.register! ⇒ Object
Register the tool.
.semantic_search(agent, class_name: nil, query: nil, k: DEFAULT_K, filter: nil, vector_filter: nil, text_field: nil, chunk_size: nil, chunk_overlap: nil, chunk_by: nil, max_chunks_per_document: nil, max_total_tokens: nil, klass: nil, size: nil, overlap: nil, by: nil, **rest) ⇒ Hash
Returns { chunks: Array<Hash>, documents: Hash, count: Integer }. Each chunk's parent record is hoisted once into documents (keyed by objectId) instead of being duplicated on every chunk.

Class Method Details

.register! ⇒ `Object`

Registers the tool. Requiring the file again is a no-op because require caches loaded files; to re-register after reset_registry!, call register! explicitly.

# File 'lib/parse/retrieval/agent_tool.rb', line 346

def register!
  Parse::Agent::Tools.register(
    name: :semantic_search,
    description: "Find documents semantically similar to a natural-language query and " \
                 "return scored text chunks. Use when keyword matching is unlikely to " \
                 "work or the question needs synthesizing across documents. The target " \
                 "class must be declared `agent_searchable`.",
    parameters: PARAMETERS,
    permission: :readonly,
    timeout: 30,
    output_schema: OUTPUT_SCHEMA,
    client_safe: true,
    handler: ->(agent, **args) { Parse::Retrieval::AgentTool.semantic_search(agent, **args) },
  )
end

.semantic_search(agent, class_name: nil, query: nil, k: DEFAULT_K, filter: nil, vector_filter: nil, text_field: nil, chunk_size: nil, chunk_overlap: nil, chunk_by: nil, max_chunks_per_document: nil, max_total_tokens: nil, klass: nil, size: nil, overlap: nil, by: nil, **rest) ⇒ `Hash`

Returns { chunks: Array<Hash>, documents: Hash, count: Integer } — each chunk's parent record is hoisted once into documents (keyed by objectId) instead of being duplicated on every chunk. When the token budget trims the result, budget_truncated: true and budget_dropped: <n> are added.

Parameters:

agent (Parse::Agent)
text_field (String, Symbol, nil) (defaults to: nil) —
which embedded text source to chunk and return as content. Required only for models with more than one embed text source (otherwise inferred). Must name one of the class's declared embed sources — an arbitrary field is refused so the chunk content can't disclose a non-embedded field.
max_chunks_per_document (Integer, nil) (defaults to: nil) —
cap on chunks emitted per matched document (forwarded to the chunker).
max_total_tokens (Integer, nil) (defaults to: nil) —
ceiling on total returned chunk-content tokens (estimated chars/4). nil uses DEFAULT_MAX_TOTAL_TOKENS; 0 disables the budget.
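
The text_field rule described above (inferred when there is a single embed source, required and validated otherwise) might look roughly like the following. This is only a sketch: `TextFieldRefused` and `pick_text_field` are hypothetical names, the list of embed sources is passed in directly rather than read from a model class, and the library's normalize_text_field! may differ in its error class and return type.

```ruby
# Hypothetical error class used only by this sketch.
class TextFieldRefused < StandardError; end

# requested: String, Symbol, or nil. embed_sources: the class's declared
# embed text sources. Returns the chosen source name as a String.
def pick_text_field(requested, embed_sources)
  sources = embed_sources.map(&:to_s)

  if requested.nil?
    # A single source can be inferred; otherwise the caller must choose.
    return sources.first if sources.length == 1
    raise TextFieldRefused, "text_field is required; choose one of: #{sources.join(', ')}"
  end

  name = requested.to_s
  # Refuse arbitrary fields so chunk content cannot disclose a non-embedded field.
  raise TextFieldRefused, "#{name} is not an embedded text source" unless sources.include?(name)
  name
end
```

Refusing any name outside the declared sources is the point of the check: without it, a caller could ask the chunker to return the content of a field that was never meant to be searchable.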

Returns:

(Hash) —
{ chunks: Array<Hash>, documents: Hash, count: Integer } — each chunk's parent record is hoisted once into documents (keyed by objectId) instead of being duplicated on every chunk. When the token budget trims the result, budget_truncated: true and budget_dropped: <n> are added.

# File 'lib/parse/retrieval/agent_tool.rb', line 62

def semantic_search(agent, class_name: nil, query: nil, k: DEFAULT_K,
                    filter: nil, vector_filter: nil, text_field: nil,
                    chunk_size: nil, chunk_overlap: nil, chunk_by: nil,
                    max_chunks_per_document: nil, max_total_tokens: nil,
                    # Back-compat / ergonomic aliases for direct callers:
                    # `klass:`/`class:` for class_name, and the chunker's
                    # own `size:`/`overlap:`/`by:` names.
                    klass: nil, size: nil, overlap: nil, by: nil,
                    **rest)
  class_name    ||= klass || rest.delete(:class)
  chunk_size    ||= size
  chunk_overlap ||= overlap
  chunk_by      ||= by

  klass = Parse::Agent::MetadataRegistry.resolve_searchable!(class_name)
  cname = klass.parse_class

  unless query.is_a?(String) && !query.strip.empty?
    raise Parse::Agent::ValidationError, "semantic_search: `query` must be a non-empty String."
  end

  resolved_text_field = normalize_text_field!(text_field, klass)

  # Reject reserved underscore keys at any depth, then enforce the
  # per-class filter-field allowlist on top-level keys.
  Parse::Retrieval.assert_no_underscore_keys!(filter) unless filter.nil?
  Parse::Retrieval.assert_no_underscore_keys!(vector_filter) unless vector_filter.nil?
  allowed = Parse::Agent::MetadataRegistry.searchable_filter_fields(cname).map(&:to_s)
  assert_filter_fields_allowed!(filter, allowed)
  assert_filter_fields_allowed!(vector_filter, allowed)

  # Tenant scope (nil for unscoped classes / bypassed admins; raises
  # AccessDenied for an un-bound agent on a scoped class).
  scope = Parse::Agent::Tools.resolve_tenant_scope!(agent, cname)

  # Non-admin agents get quantized scores (membership-inference
  # defense); admin agents get full precision. Keyed on the
  # permission tier, not master-key posture.
  score_quantize = (agent.permissions != :admin)
  vector_field = Parse::Agent::MetadataRegistry.searchable_field(cname)

  chunks = Parse::Retrieval.retrieve(
    query: query,
    klass: klass,
    field: vector_field,
    text_field: resolved_text_field,
    k: clamp_k(k),
    filter: filter,
    vector_filter: vector_filter,
    chunker: build_chunker(chunk_size, chunk_overlap, chunk_by, max_chunks_per_document),
    tenant_scope: scope,
    score_quantize: score_quantize,
    source_transform: source_projector(agent, cname, scope),
    **agent.acl_scope_kwargs,
  )

  # Token budget (B4): trim the score-ordered chunk list before
  # building the envelope so `documents` only carries parents whose
  # chunks survived.
  kept, dropped = apply_token_budget(chunks, resolve_token_budget(max_total_tokens))

  # Source dedup (A3): a document's (projected) source record is
  # identical across all its chunks. Hoist it into a `documents` map
  # keyed by objectId and drop the inline `source` from each chunk —
  # ~46 tok/chunk saved for every chunk past the first of a document.
  documents = {}
  chunk_hashes = kept.map do |chunk|
    h = chunk.to_h
    oid = h.dig(:metadata, :object_id)
    if oid && !oid.to_s.empty?
      documents[oid] ||= h[:source]
      h = h.reject { |key, _| key == :source }
    end
    h
  end
  stamp_chunk_provenance!(chunk_hashes, cname) if Parse::Agent.include_source_provenance?

  envelope = { chunks: chunk_hashes, documents: documents, count: chunk_hashes.length }
  if dropped > 0
    envelope[:budget_truncated] = true
    envelope[:budget_dropped] = dropped
  end
  envelope
end