Module: Parse::Retrieval::AgentTool
Defined in: lib/parse/retrieval/agent_tool.rb
Overview
The semantic_search agent tool: the agent-aware wrapper around
retrieve. It applies the agent security
envelope that retrieve (a model-layer method) is
deliberately kept free of:
- Class allowlist via Agent::MetadataRegistry.resolve_searchable! (agent_searchable opt-in, hidden-class refusal, tenant-scope gate).
- Recursive underscore-key refusal + filter-field allowlist on caller-supplied filter:/vector_filter:.
- Tenant scope merged into the Atlas pre-filter AND re-asserted on every returned source record (NEW-TOOLS-3 guard).
- field_allowlist projection of each source record on the way out.
- Score quantization in non-admin contexts.
ACL is enforced mongo-direct inside find_similar via the agent's
acl_scope_kwargs (session_token: / acl_user: / acl_role: /
master:), which is why the tool is client_safe: true: a
session-token client routes through the one path with first-class
SDK-side _rperm enforcement.
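The two filter checks named in the list above (underscore-key refusal at any depth, then an allowlist on top-level keys) can be sketched in plain Ruby. This is only an illustration of the idea: refuse_underscore_keys!, refuse_unlisted_fields!, and FilterRejected are hypothetical names, and the library's own helpers may behave differently.

```ruby
# Hypothetical stand-ins for the filter checks described above; the method
# names and the error class are illustrative, not the library's own.
class FilterRejected < StandardError; end

# Walk hashes and arrays at any depth and refuse any key starting with "_".
def refuse_underscore_keys!(value, path = [])
  case value
  when Hash
    value.each do |key, nested|
      here = path + [key.to_s]
      if key.to_s.start_with?("_")
        raise FilterRejected, "reserved key at #{here.join('.')}"
      end
      refuse_underscore_keys!(nested, here)
    end
  when Array
    value.each_with_index do |nested, i|
      refuse_underscore_keys!(nested, path + [i.to_s])
    end
  end
  nil
end

# Only the top-level keys are compared against the per-class allowlist.
def refuse_unlisted_fields!(filter, allowed)
  return if filter.nil?
  extra = filter.keys.map(&:to_s) - allowed
  raise FilterRejected, "fields not allowed: #{extra.join(', ')}" unless extra.empty?
  nil
end

allowed = %w[category published]
ok_filter = { "category" => "guides", "published" => true }
refuse_underscore_keys!(ok_filter)
refuse_unlisted_fields!(ok_filter, allowed)

nested_bad = { "category" => { "tags" => [{ "_rperm" => "*" }] } }
begin
  refuse_underscore_keys!(nested_bad)
rescue FilterRejected => e
  puts e.message # prints "reserved key at category.tags.0._rperm"
end
```

Running the underscore check before the allowlist means a reserved key nested under an allowlisted field is still refused.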
Constant Summary
- MAX_K =
Upper bound on k (mirrors the registered parameter schema).
20
- DEFAULT_K =
Default neighbour count for the agent tool. Intentionally lower than Parse::Retrieval.retrieve's library default of 10: an LLM tool result is paid for in context tokens, so the agent surface defaults conservatively. Callers/LLMs can raise it up to MAX_K per call.
5
- DEFAULT_MAX_TOTAL_TOKENS =
Default ceiling on total returned chunk-content tokens (estimated as chars/4). The retrieve count caps (k * max_chunks_per_document) bound the NUMBER of chunks but not their total size, so a few long documents could silently blow the context window. This budget trims the (score-ordered) chunk list and reports budget_truncated so the truncation is never silent. Pass max_total_tokens: 0 to disable.
20_000
- PARAMETERS =
JSON Schema for the registered tool's parameters.
{
  "type" => "object",
  "properties" => {
    "class_name" => { "type" => "string", "description" => "Parse class name (must be agent_searchable)." },
    "query" => { "type" => "string", "description" => "Natural-language query." },
    "k" => { "type" => "integer", "default" => DEFAULT_K, "minimum" => 1, "maximum" => MAX_K },
    "filter" => { "type" => "object", "description" => "Post-search field filter (allowlisted fields only)." },
    "vector_filter" => { "type" => "object", "description" => "Atlas pre-search filter (allowlisted fields only)." },
    "text_field" => { "type" => "string",
                      "description" => "Which embedded text source to chunk and return as content. Required only when the class embeds more than one text field; must name one of those sources." },
    "chunk_size" => { "type" => "integer", "description" => "Override chunk window size." },
    "chunk_overlap" => { "type" => "integer", "description" => "Override chunk overlap." },
    "chunk_by" => { "type" => "string", "enum" => %w[chars tokens], "description" => "Chunk unit." },
    "max_chunks_per_document" => { "type" => "integer", "minimum" => 1, "description" => "Cap on chunks emitted per matched document." },
    "max_total_tokens" => { "type" => "integer", "minimum" => 0,
                            "description" => "Ceiling on total returned chunk-content tokens (approx chars/4). Trims lowest-ranked chunks first and sets budget_truncated. 0 disables." },
  },
  "required" => %w[class_name query],
}.freeze
- OUTPUT_SCHEMA =
MCP outputSchema → mirrored as structuredContent on results. The parent record of each chunk is hoisted into documents (keyed by objectId) rather than duplicated inline on every chunk; map a chunk to its source via metadata.object_id.
{
  "type" => "object",
  "properties" => {
    "chunks" => {
      "type" => "array",
      "items" => {
        "type" => "object",
        "properties" => {
          "id" => { "type" => "string" },
          "score" => { "type" => %w[number null] },
          "content" => { "type" => "string" },
          "metadata" => { "type" => "object" },
        },
      },
    },
    "documents" => {
      "type" => "object",
      "description" => "objectId => projected source record (sent once per matched document).",
    },
    "count" => { "type" => "integer" },
    "budget_truncated" => { "type" => "boolean", "description" => "Present when the token budget dropped lowest-ranked chunks." },
    "budget_dropped" => { "type" => "integer", "description" => "Number of chunks dropped by the token budget." },
  },
}.freeze
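The token budget described under DEFAULT_MAX_TOTAL_TOKENS (a chars/4 estimate, trimming the score-ordered list, 0 disabling the cap) can be sketched as follows. estimate_tokens and trim_to_budget are hypothetical names, and two details are assumptions that the text above does not settle: the estimate rounds up, and trimming stops at the first chunk that would exceed the budget.

```ruby
# Hypothetical sketch of a chars/4 token budget; not the library's
# apply_token_budget. Rounding up is an assumption.
def estimate_tokens(text)
  (text.to_s.length / 4.0).ceil
end

# `chunks` must already be ordered best-first. Returns [kept, dropped_count].
# A budget of nil or 0 disables trimming. Assumption: everything from the
# first over-budget chunk onward is dropped.
def trim_to_budget(chunks, max_total_tokens)
  return [chunks, 0] if max_total_tokens.nil? || max_total_tokens.zero?

  used = 0
  kept = chunks.take_while do |chunk|
    used += estimate_tokens(chunk[:content])
    used <= max_total_tokens
  end
  [kept, chunks.length - kept.length]
end

# Three 40-character chunks cost about 10 tokens each.
chunks = [
  { content: "a" * 40, score: 0.9 },
  { content: "b" * 40, score: 0.8 },
  { content: "c" * 40, score: 0.7 },
]

kept, dropped = trim_to_budget(chunks, 25)
puts "kept=#{kept.length} dropped=#{dropped}" # prints "kept=2 dropped=1"
```

Because the list is ordered by score, cutting from the tail removes the lowest-ranked chunks first, and the dropped count is what lets the caller report the truncation.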
Class Method Summary
- .register! ⇒ Object
Register the tool.
- .semantic_search(agent, class_name: nil, query: nil, k: DEFAULT_K, filter: nil, vector_filter: nil, text_field: nil, chunk_size: nil, chunk_overlap: nil, chunk_by: nil, max_chunks_per_document: nil, max_total_tokens: nil, klass: nil, size: nil, overlap: nil, by: nil, **rest) ⇒ Hash
Returns { chunks: Array<Hash>, documents: Hash, count: Integer }; each chunk's parent record is hoisted once into documents (keyed by objectId) instead of being duplicated on every chunk.
Class Method Details
.register! ⇒ Object
Register the tool. Re-requiring the file is a no-op because require caches loaded files; to register again after reset_registry!, call register! explicitly.
# File 'lib/parse/retrieval/agent_tool.rb', line 346

def register!
  Parse::Agent::Tools.register(
    name: :semantic_search,
    description: "Find documents semantically similar to a natural-language query and " \
                 "return scored text chunks. Use when keyword matching is unlikely to " \
                 "work or the question needs synthesizing across documents. The target " \
                 "class must be declared `agent_searchable`.",
    parameters: PARAMETERS,
    permission: :readonly,
    timeout: 30,
    output_schema: OUTPUT_SCHEMA,
    client_safe: true,
    handler: ->(agent, **args) { Parse::Retrieval::AgentTool.semantic_search(agent, **args) },
  )
end
.semantic_search(agent, class_name: nil, query: nil, k: DEFAULT_K, filter: nil, vector_filter: nil, text_field: nil, chunk_size: nil, chunk_overlap: nil, chunk_by: nil, max_chunks_per_document: nil, max_total_tokens: nil, klass: nil, size: nil, overlap: nil, by: nil, **rest) ⇒ Hash
Returns { chunks: Array<Hash>, documents: Hash, count: Integer }
— each chunk's parent record is hoisted once into documents (keyed
by objectId) instead of being duplicated on every chunk. When the
token budget trims the result, budget_truncated: true and
budget_dropped: <n> are added.
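A caller consuming this return value has to join each chunk back to its parent through metadata.object_id. The sketch below does that against a hand-built result. The field values are made up, chunks_with_sources is a hypothetical helper, and the symbol-keyed envelope with string-keyed document records is an assumption about the shape a direct Ruby caller would see.

```ruby
# Hand-built result in the documented shape; all values are invented.
result = {
  chunks: [
    { id: "c1", score: 0.9, content: "Reset your password from the login page.",
      metadata: { object_id: "abc123" } },
    { id: "c2", score: 0.85, content: "Password resets expire after one hour.",
      metadata: { object_id: "abc123" } },
    { id: "c3", score: 0.7, content: "Contact support to unlock an account.",
      metadata: { object_id: "def456" } },
  ],
  documents: {
    "abc123" => { "objectId" => "abc123", "title" => "Password help" },
    "def456" => { "objectId" => "def456", "title" => "Account lockouts" },
  },
  count: 3,
}

# Hypothetical helper: re-attach each chunk's parent record from `documents`.
# A chunk without a usable object id simply gets `source: nil`.
def chunks_with_sources(result)
  documents = result[:documents] || {}
  result[:chunks].map do |chunk|
    oid = chunk.dig(:metadata, :object_id)
    chunk.merge(source: documents[oid])
  end
end

chunks_with_sources(result).each do |chunk|
  title = chunk[:source] ? chunk[:source]["title"] : "(no source)"
  puts "#{chunk[:score].inspect}  #{title}: #{chunk[:content]}"
end

# The truncation keys are only present when the budget dropped something.
if result[:budget_truncated]
  warn "#{result[:budget_dropped]} chunk(s) dropped by the token budget"
end
```

Here two chunks share the parent "abc123", so its record appears once in documents rather than once per chunk.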
# File 'lib/parse/retrieval/agent_tool.rb', line 62

def semantic_search(agent, class_name: nil, query: nil, k: DEFAULT_K,
                    filter: nil, vector_filter: nil, text_field: nil,
                    chunk_size: nil, chunk_overlap: nil, chunk_by: nil,
                    max_chunks_per_document: nil, max_total_tokens: nil,
                    # Back-compat / ergonomic aliases for direct callers:
                    # `klass:`/`class:` for class_name, and the chunker's
                    # own `size:`/`overlap:`/`by:` names.
                    klass: nil, size: nil, overlap: nil, by: nil, **rest)
  class_name ||= klass || rest.delete(:class)
  chunk_size ||= size
  chunk_overlap ||= overlap
  chunk_by ||= by

  klass = Parse::Agent::MetadataRegistry.resolve_searchable!(class_name)
  cname = klass.parse_class

  unless query.is_a?(String) && !query.strip.empty?
    raise Parse::Agent::ValidationError,
          "semantic_search: `query` must be a non-empty String."
  end

  resolved_text_field = normalize_text_field!(text_field, klass)

  # Reject reserved underscore keys at any depth, then enforce the
  # per-class filter-field allowlist on top-level keys.
  Parse::Retrieval.assert_no_underscore_keys!(filter) unless filter.nil?
  Parse::Retrieval.assert_no_underscore_keys!(vector_filter) unless vector_filter.nil?
  allowed = Parse::Agent::MetadataRegistry.searchable_filter_fields(cname).map(&:to_s)
  assert_filter_fields_allowed!(filter, allowed)
  assert_filter_fields_allowed!(vector_filter, allowed)

  # Tenant scope (nil for unscoped classes / bypassed admins; raises
  # AccessDenied for an un-bound agent on a scoped class).
  scope = Parse::Agent::Tools.resolve_tenant_scope!(agent, cname)

  # Non-admin agents get quantized scores (membership-inference
  # defense); admin agents get full precision. Keyed on the
  # permission tier, not master-key posture.
  score_quantize = (agent. != :admin)

  vector_field = Parse::Agent::MetadataRegistry.searchable_field(cname)
  chunks = Parse::Retrieval.retrieve(
    query: query,
    klass: klass,
    field: vector_field,
    text_field: resolved_text_field,
    k: clamp_k(k),
    filter: filter,
    vector_filter: vector_filter,
    chunker: build_chunker(chunk_size, chunk_overlap, chunk_by, max_chunks_per_document),
    tenant_scope: scope,
    score_quantize: score_quantize,
    source_transform: source_projector(agent, cname, scope),
    **agent.acl_scope_kwargs,
  )

  # Token budget (B4): trim the score-ordered chunk list before
  # building the envelope so `documents` only carries parents whose
  # chunks survived.
  kept, dropped = apply_token_budget(chunks, resolve_token_budget(max_total_tokens))

  # Source dedup (A3): a document's (projected) source record is
  # identical across all its chunks. Hoist it into a `documents` map
  # keyed by objectId and drop the inline `source` from each chunk:
  # ~46 tok/chunk saved for every chunk past the first of a document.
  documents = {}
  chunk_hashes = kept.map do |chunk|
    h = chunk.to_h
    oid = h.dig(:metadata, :object_id)
    if oid && !oid.to_s.empty?
      documents[oid] ||= h[:source]
      h = h.reject { |key, _| key == :source }
    end
    h
  end

  stamp_chunk_provenance!(chunk_hashes, cname) if Parse::Agent.include_source_provenance?

  envelope = { chunks: chunk_hashes, documents: documents, count: chunk_hashes.length }
  if dropped > 0
    envelope[:budget_truncated] = true
    envelope[:budget_dropped] = dropped
  end
  envelope
end
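The listing above only decides whether scores are quantized; the quantization itself happens inside retrieve and is not shown on this page. As a rough illustration of why coarser scores blunt membership-inference probing, the sketch below snaps scores to a fixed grid. quantize_score and the 0.05 step are assumptions made for the example, not the library's method or values.

```ruby
# Hypothetical score quantizer. The 0.05 step is an assumed value chosen
# only for illustration. nil scores pass through, since the output schema
# allows a null score.
def quantize_score(score, step: 0.05)
  return nil if score.nil?
  ((score / step).round * step).round(4)
end

# Two nearby raw scores collapse to the same reported value, so a caller
# cannot use tiny score differences to probe for specific stored documents.
puts quantize_score(0.8734) # prints 0.85
puts quantize_score(0.8612) # prints 0.85
puts quantize_score(nil).inspect # prints nil
```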