Module: Parse::Agent::Tools

Extended by:
Tools
Included in:
Tools
Defined in:
lib/parse/agent/tools.rb

Overview

The Tools module contains all the executable tool implementations for the Parse Agent. Each tool is a class method that takes an agent instance and keyword arguments.

Tools are divided into categories:

  • Schema tools: get_all_schemas, get_schema
  • Query tools: query_class, count_objects, get_object, get_sample_objects, get_objects
  • Analysis tools: aggregate, explain_query

== Custom Tool Registration

Third-party apps may register additional tools:

Parse::Agent::Tools.register( name: :breakdown_posts, description: "Count posts grouped by author/...", parameters: { type: "object", properties: ..., required: [...] }, permission: :readonly, timeout: 30, handler: ->(agent, **args) { { result: "..." } } )

Registering a name that matches an existing registration replaces it (idempotent on name). Call reset_registry! to clear all registrations (useful in test suites).

Constant Summary collapse

BLOCKED_METHODS =

Methods that are dangerous and should never be invoked via tools. Defined here (rather than MCPServer) so it's always available.

%w[
  eval exec system ` send __send__ public_send
  instance_eval class_eval module_eval
  instance_exec class_exec module_exec
  define_method define_singleton_method remove_method undef_method
  singleton_class
  open fork spawn syscall load require require_relative
  const_get const_set remove_const method binding
  instance_variable_set instance_variable_get
].freeze
DEFAULT_TIMEOUT =

Default timeout for tool operations (seconds)

30
TOOL_TIMEOUTS =

Per-tool timeout overrides for long-running operations. Frozen — do not mutate. Use Tools.timeout_for(name) to resolve timeouts that overlay registered-tool values on top of this table.

{
  aggregate: 60,
  query_class: 30,
  explain_query: 30,
  call_method: 60,
  get_all_schemas: 15,
  get_schema: 10,
  count_objects: 20,
  get_object: 10,
  get_objects: 20,
  get_sample_objects: 15,
  export_data: 60,
  group_by: 45,
  group_by_date: 45,
  distinct: 30,
  list_tools: 5,
  atlas_text_search: 30,
  atlas_autocomplete: 15,
  atlas_faceted_search: 45,
}.freeze
ATLAS_LIMIT_DEFAULT =

Per-tool clamps for Atlas Search result counts. The hard cap (+ATLAS_LIMIT_MAX+) keeps response sizes predictable when an LLM agent asks for "everything"; the default keeps token spend reasonable when an agent forgets to set +limit:+.

10
ATLAS_LIMIT_MAX =
20
TOOL_DEFINITIONS =

Tool definitions in OpenAI function calling format Optimized for token efficiency - LLMs understand from context

{
  list_tools: {
    category: "discovery",
    name: "list_tools",
    description: "Return a lightweight catalog of tools available to this agent — name, category, and a " \
                 "one-line description per tool. Use this for discovery; call tools/list (or a specific " \
                 "tool's documentation) for full input schemas. Pass category: to narrow the catalog to a " \
                 "single category (e.g. 'schema', 'query', 'aggregate', 'mutation', 'export'). The " \
                 "response also includes a `categories` map summarizing what each built-in category covers.",
    parameters: {
      type: "object",
      properties: {
        category: {
          type: "string",
          description: "Optional. Restrict the catalog to tools in this category (case-insensitive). " \
                       "Built-in categories: schema, query, aggregate, mutation, export.",
        },
      },
      required: [],
    },
    output_schema: {
      type: "object",
      properties: {
        tools: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name:        { type: "string" },
              category:    { type: "string" },
              description: { type: "string" },
            },
            required: %w[name category description],
          },
        },
        categories: {
          type: "object",
          additionalProperties: { type: "string" },
        },
      },
      required: %w[tools categories],
    },
  },

  get_all_schemas: {
    category: "schema",
    name: "get_all_schemas",
    description: "List every Parse class the agent can see, with field counts and optional descriptions. " \
                 "Call this first when exploring an unfamiliar database. Use get_schema next for a specific class. " \
                 "Pass names: to fetch only a known subset of classes, or prefix: to filter to class names starting " \
                 "with a given string — both reduce response size on large catalogs.",
    parameters: {
      type: "object",
      properties: {
        names:  { type: "array", items: { type: "string" },
                  description: "Optional. Restrict the output to these class names (exact match)." },
        prefix: { type: "string",
                  description: "Optional. Restrict the output to class names that start with this prefix (case-sensitive)." },
      },
      required: [],
    },
    output_schema: {
      type: "object",
      properties: {
        total:    { type: "integer", minimum: 0 },
        note:     { type: "string" },
        built_in: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name:    { type: "string" },
              fields:  { type: "integer", minimum: 0 },
              desc:    { type: "string" },
              methods: { type: "integer", minimum: 0 },
            },
            required: %w[name fields],
            additionalProperties: true,
          },
        },
        custom: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name:    { type: "string" },
              fields:  { type: "integer", minimum: 0 },
              desc:    { type: "string" },
              methods: { type: "integer", minimum: 0 },
            },
            required: %w[name fields],
            additionalProperties: true,
          },
        },
      },
      required: %w[total built_in custom],
    },
  },

  get_schema: {
    category: "schema",
    name: "get_schema",
    description: "Return the fields, types, indexes, permissions, and relations for a single Parse class. " \
                 "Inspect the `large_field: true` annotation on individual fields — those are known-heavy " \
                 "columns that should be projected away with `keys:` in subsequent query_class calls to " \
                 "stay under the response-size cap. " \
                 "When the response contains a top-level `agent_fields:` list, those are the only " \
                 "wire-format names accepted by query/aggregate tools for this class; storage-form " \
                 "columns (e.g. `_p_*` pointer columns) and other Parse-internal underscored fields " \
                 "are never addressable. A field's `allowed_values:` array, when present, enumerates " \
                 "the per-value documentation for an enum-shaped string column.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
      },
      required: ["class_name"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name:  { type: "string" },
        type:        { type: "string" },
        description: { type: "string" },
        usage:       { type: "string" },
        fields:      { type: "array", items: { type: "object", additionalProperties: true } },
        indexes:     { type: "object", additionalProperties: true },
        permissions: { type: "object", additionalProperties: true },
        agent_methods:     { type: "array", items: { type: "object", additionalProperties: true } },
        canonical_filter:  { type: "object", additionalProperties: true },
        agent_fields:      { type: "array", items: { type: "string" } },
        agent_join_fields: { type: "array", items: { type: "string" } },
        relations:         { type: "object", additionalProperties: true },
      },
      required: %w[class_name type fields indexes permissions],
    },
  },

  query_class: {
    category: "query",
    name: "query_class",
    description: "Fetch records from a Parse class with optional where: constraints, ordering, and pagination. " \
                 "Use this when you actually need the record content. When to use a DIFFERENT tool instead: " \
                 "count_objects for cardinality only (cheaper, no row cost); get_object for a single known " \
                 "objectId (faster, no projection needed); aggregate for groupings, statistics, or cross-class " \
                 "$lookup joins; get_sample_objects to peek at the shape of data you have not seen before. " \
                 "Default limit 100, max 1000. Pass keys: to project specific fields and avoid the 4 MiB " \
                 "response-size cap on wide-schema classes. " \
                 "When a class declares an `agent_canonical_filter` (a per-class 'valid state' predicate, " \
                 "e.g. soft-delete exclusion), it is applied by default; pass `apply_canonical_filter: false` " \
                 "to opt out (use `get_schema(class_name)` to see the filter). " \
                 "Pass `format: \"csv\"|\"markdown\"|\"table\"` to receive a formatted text payload instead of " \
                 "the row envelope — useful when the caller wants to forward the output as-is to a human. " \
                 "For more advanced formatting (column aliasing, dotted paths, custom row caps) use export_data.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        where: { type: "object" },
        limit: { type: "integer" },
        skip: { type: "integer" },
        order: { type: "string" },
        keys: { type: "array", items: { type: "string" } },
        include: { type: "array", items: { type: "string" } },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When true and the class declares an " \
                                               "agent_canonical_filter, it is merged into the where via " \
                                               "$and so the caller's constraints compose rather than override." },
        format: { type: "string", enum: %w[json csv markdown table],
                  description: "Output format. Defaults to 'json' (structured row envelope). When set to " \
                               "csv/markdown/table the response carries {format:, headers:, row_count:, output:} " \
                               "instead of the row envelope; columns are inferred from the first row." },
      },
      required: ["class_name"],
    },
    # Polymorphic envelope: the default `format: "json"` returns a row
    # envelope (class_name, result_count, pagination, results, ...);
    # `format: "csv"|"markdown"|"table"` returns a text envelope
    # (class_name, format, headers, row_count, output). Declared here as
    # a permissive superset (every key from either envelope is optional
    # except class_name) because MCP 2025-06-18 expects type:object at
    # the outputSchema root, which precludes a top-level oneOf. Clients
    # that need to disambiguate inspect `format` (absent for the json
    # envelope, present for text envelopes).
    output_schema: {
      type: "object",
      properties: {
        class_name:    { type: "string" },
        # json envelope
        result_count:  { type: "integer", minimum: 0 },
        pagination: {
          type: "object",
          properties: {
            limit:    { type: "integer", minimum: 0 },
            skip:     { type: "integer", minimum: 0 },
            has_more: { type: "boolean" },
          },
          required: %w[limit skip has_more],
        },
        truncated:                { type: "boolean" },
        truncated_note:           { type: "string" },
        truncated_include_fields: { type: "object", additionalProperties: true },
        next_call: {
          type: "object",
          properties: {
            tool:      { type: "string" },
            arguments: { type: "object", additionalProperties: true },
          },
          required: %w[tool arguments],
        },
        results: { type: "array", items: { type: "object", additionalProperties: true } },
        # csv / markdown / table envelope
        format:    { type: "string", enum: %w[csv markdown table] },
        headers:   { type: "array", items: { type: "string" } },
        row_count: { type: "integer", minimum: 0 },
        output:    { type: "string" },
      },
      required: %w[class_name],
    },
  },

  count_objects: {
    category: "query",
    name: "count_objects",
    description: "Return the count of records matching a where: constraint, WITHOUT returning the records " \
                 "themselves. Always prefer this over query_class(...).length when you only need a number — " \
                 "it is dramatically cheaper for both latency and response size. Call this before a large " \
                 "query to decide whether to paginate or narrow the where:. " \
                 "When a class declares an `agent_canonical_filter` it is applied by default; pass " \
                 "`apply_canonical_filter: false` to count the full collection including soft-deleted rows.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        where: { type: "object" },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. Set to false to count ignoring the class's canonical filter." },
      },
      required: ["class_name"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name:  { type: "string" },
        count:       { type: "integer", minimum: 0 },
        constraints: { type: "object" },
      },
      required: %w[class_name count constraints],
    },
  },

  get_object: {
    category: "query",
    name: "get_object",
    description: "Fetch a single record by its objectId. Use this instead of query_class(where: {objectId: 'x'}) " \
                 "— it is faster and projects cleanly. For multiple known ids in one call, use get_objects instead. " \
                 "When the class declares an `agent_canonical_filter`, it is applied by default; pass " \
                 "`apply_canonical_filter: false` to fetch the row regardless of the 'valid state' predicate.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        object_id: { type: "string" },
        include: { type: "array", items: { type: "string" } },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When true and the class declares an " \
                                               "agent_canonical_filter, the fetch is rewritten as a " \
                                               "find_objects with where: { objectId: id, ...filter } " \
                                               "so a filtered-out row returns 'not found'. Set to false " \
                                               "to bypass the predicate and fetch the row directly." },
      },
      required: ["class_name", "object_id"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        object_id:  { type: "string" },
        created_at: { type: %w[string null] },
        updated_at: { type: %w[string null] },
        object:     { type: "object" },
        truncated_include_fields: { type: "object" },
      },
      required: %w[class_name object_id object],
    },
  },

  get_objects: {
    category: "query",
    name: "get_objects",
    description: "Batch-fetch multiple Parse records by objectId in a single round trip. Use instead of " \
                 "issuing N separate get_object calls when you already know the ids (e.g. after resolving a " \
                 "list of pointers). Hard cap: 50 ids per call (deduplicated). For larger sets call " \
                 "query_class with `where: { 'objectId' => { '$in' => [...] } }`. " \
                 "When the class declares an `agent_canonical_filter`, it is applied by default; pass " \
                 "`apply_canonical_filter: false` to fetch the rows regardless of the 'valid state' predicate.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string", description: "Parse class name" },
        ids: { type: "array", items: { type: "string" }, description: "Array of objectId values (max 50, dedup'd)" },
        include: { type: "array", items: { type: "string" }, description: "Pointer fields to include/resolve" },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When true and the class declares an " \
                                               "agent_canonical_filter, it composes with the objectId $in " \
                                               "constraint via $and so 'invalid state' rows are filtered out " \
                                               "(they appear in the :missing array). Set to false to bypass." },
      },
      required: ["class_name", "ids"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        objects: {
          type: "object",
          additionalProperties: { type: "object" },
          description: "Map of objectId => fetched object. Empty when no ids resolved.",
        },
        missing:   { type: "array", items: { type: "string" } },
        requested: { type: "integer", minimum: 0 },
        found:     { type: "integer", minimum: 0 },
        truncated_include_fields: { type: "object" },
      },
      required: %w[class_name objects missing requested found],
    },
  },

  get_sample_objects: {
    category: "query",
    name: "get_sample_objects",
    description: "Return a small number (default 5, max 20) of the most recently created records from a class. " \
                 "Use this for schema exploration — 'what does the data in this class actually look like?'. " \
                 "Do NOT use this to retrieve specific records (use query_class with where:) or to count " \
                 "(use count_objects).",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        limit: { type: "integer" },
      },
      required: ["class_name"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name:   { type: "string" },
        sample_count: { type: "integer", minimum: 0 },
        samples:      { type: "array", items: { type: "object" } },
        note:         { type: "string" },
      },
      required: %w[class_name sample_count samples],
    },
  },

  aggregate: {
    category: "aggregate",
    name: "aggregate",
    description: "Run a MongoDB aggregation pipeline against a Parse class. Use this for operations " \
                 "query_class cannot express: grouping ($group), statistics ($sum/$avg/$min/$max), " \
                 "joins to other classes ($lookup), faceting ($facet), or any multi-stage transformation. " \
                 "Server-side JavaScript operators ($where, $function, $accumulator, $out, $merge) are " \
                 "always refused. The result is auto-bounded to 200 rows unless your pipeline ends with an " \
                 "explicit $limit or $count stage — append $count to get a single scalar instead of a row dump. " \
                 "Pointer columns are compressed by default: `_p_<field>: \"<Class>$<id>\"` rows are " \
                 "rewritten to `<field>: \"<id>\"` and the response envelope carries a " \
                 "`pointer_classes: { <field>: <Class> }` map. Pass `compact_pointers: false` to opt out " \
                 "and receive raw Parse-on-Mongo storage shapes.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        pipeline:   { type: "array", items: { type: "object" } },
        compact_pointers: { type: "boolean",
                            description: "Default true. When true, storage-form pointer columns (`_p_*`) are " \
                                         "rewritten and the envelope carries a `pointer_classes` map. Set to " \
                                         "false to receive raw Mongo shapes." },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When true and the class declares an " \
                                               "agent_canonical_filter, it is prepended as a $match stage so " \
                                               "the pipeline starts from the class's 'valid state' subset. " \
                                               "Set to false to operate on the full collection." },
      },
      required: ["class_name", "pipeline"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name:      { type: "string" },
        pipeline_stages: { type: "integer", minimum: 0 },
        result_count:    { type: "integer", minimum: 0 },
        # `route` is :mongo_direct or :parse_server but serializes
        # to a Symbol-shaped String in JSON envelopes; declare it
        # permissively as string.
        route:           { type: "string", description: "Routing tag: 'mongo_direct' or 'parse_server'." },
        # Aggregation result rows are class-shape-dependent and may
        # be the output of arbitrary $project / $group / $lookup
        # stages. Object envelopes with open property sets are the
        # honest representation.
        results: {
          type: "array",
          items: { type: "object", additionalProperties: true },
        },
        pointer_classes: {
          type: "object",
          additionalProperties: { type: "string" },
          description: "Optional. Field-name → Parse-class-name map when compact_pointers is on.",
        },
        auto_limited: { type: "boolean" },
        auto_limit:   { type: "integer", minimum: 1 },
        hint:         { type: "string" },
      },
      required: %w[class_name pipeline_stages result_count route results],
    },
  },

  explain_query: {
    category: "query",
    name: "explain_query",
    description: "Return the MongoDB execution plan for a where: query WITHOUT running the query itself. " \
                 "Use this to debug an unexpectedly slow query or before issuing a potentially expensive one " \
                 "against a large class. May refuse the plan when the global refuse_collscan flag is set and " \
                 "the query would full-scan the collection — the refusal itself tells you the where: needs a " \
                 "filter on an indexed field.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        where: { type: "object" },
      },
      required: ["class_name"],
    },
  },

  call_method: {
    category: "mutation",
    name: "call_method",
    description: "Invoke a developer-declared `agent_method` on a Parse::Object class (class-level) or " \
                 "instance (when object_id is supplied). Use this for domain actions the read tools cannot " \
                 "express — e.g. 'set_client_description', 'archive_user', 'recalculate_totals' — that the " \
                 "application has explicitly opted into by declaring the method with the agent_method DSL. " \
                 "The target class controls validation, normalization, and side effects (notifications, " \
                 "cron, cache invalidation). PREFER this over raw create_object / update_object whenever a " \
                 "purpose-built method exists. Permission is per-method (:readonly / :write / :admin); the " \
                 "tool itself is in :readonly because the per-method gate is the real boundary.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        method_name: { type: "string" },
        object_id: { type: "string" },
        arguments: { type: "object" },
      },
      required: ["class_name", "method_name"],
    },
  },

  group_by: {
    category: "aggregate",
    name: "group_by",
    description: "Group records by a field and aggregate (count/sum/avg/min/max). " \
                 "Use this instead of hand-writing a $group pipeline through aggregate — " \
                 "the tool auto-detects pointer fields, surfaces the pointer class in an " \
                 "envelope key, optionally flattens array fields with $unwind, and supports " \
                 "sort=value_desc/value_asc/key_desc/key_asc for top-K queries. " \
                 "Default operation is 'count'; for sum/avg/min/max pass value_field. " \
                 "Bounded to 200 groups by default (top-K friendly when sort is set).",
    parameters: {
      type: "object",
      properties: {
        class_name:     { type: "string" },
        field:          { type: "string", description: "Field to group by (wire-format name; pointers auto-detected)" },
        operation:      {
          type: "string",
          enum: %w[count sum avg average min max],
          description: "Aggregation to apply per group. Default: count.",
        },
        value_field:    { type: "string", description: "Required for sum/avg/min/max — the field to aggregate within each group." },
        where:          { type: "object", description: "Optional constraints applied via $match before grouping." },
        flatten_arrays: { type: "boolean", description: "When true, $unwind the field before grouping so individual array elements are counted." },
        sort:           {
          type: "string",
          enum: %w[value_desc value_asc key_desc key_asc],
          description: "Sort the result. Use value_desc for top-K. Default: server-natural order.",
        },
        limit:          { type: "integer", description: "Cap the number of groups returned. Default: 200, max: 1000." },
        dry_run:        { type: "boolean", description: "When true, return the constructed MongoDB pipeline without executing it. Use to inspect / hand-modify before running via the aggregate tool." },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When true and the class declares an " \
                                               "agent_canonical_filter, it is prepended as a $match stage " \
                                               "so the group operates only on the class's 'valid state' " \
                                               "subset. Set to false to group across the full collection." },
      },
      required: ["class_name", "field"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name:     { type: "string" },
        field:          { type: "string" },
        operation:      { type: "string" },
        group_count:    { type: "integer", minimum: 0 },
        groups: {
          type: "array",
          items: {
            type: "object",
            properties: {
              key:   {},
              value: { type: %w[number null] },
            },
            required: %w[key value],
          },
        },
        value_field:    { type: "string" },
        pointer_class:  { type: "string" },
        flatten_arrays: { type: "boolean" },
        sort:           { type: "string" },
        truncated:      { type: "boolean" },
        limit:          { type: "integer" },
      },
      required: %w[class_name field operation group_count groups limit],
    },
  },

  group_by_date: {
    category: "aggregate",
    name: "group_by_date",
    description: "Group records by a date field bucketed at a specified interval " \
                 "(year/month/week/day/hour/minute/second), with count/sum/avg/min/max. " \
                 "Use this for any time-series breakdown — it builds the correct " \
                 "$year/$month/$day/$hour/$minute/$second expressions, honors an optional " \
                 "timezone (e.g. 'America/New_York'), formats the keys as ISO date strings " \
                 "(YYYY, YYYY-MM, YYYY-MM-DD, etc.), and defaults to chronological order. " \
                 "Bounded to 200 buckets by default.",
    parameters: {
      type: "object",
      properties: {
        class_name:  { type: "string" },
        field:       { type: "string", description: "Date field to bucket on (e.g. 'createdAt', 'updatedAt', or a custom Date column)." },
        interval:    {
          type: "string",
          enum: %w[year month week day hour minute second],
          description: "Bucket size.",
        },
        operation:   {
          type: "string",
          enum: %w[count sum avg average min max],
          description: "Aggregation per bucket. Default: count.",
        },
        value_field: { type: "string", description: "Required for sum/avg/min/max — the field to aggregate within each bucket." },
        where:       { type: "object", description: "Optional constraints applied via $match before grouping." },
        timezone:    { type: "string", description: "IANA tz name (e.g. 'America/New_York') or fixed offset ('+05:00'). Default: UTC." },
        sort:        {
          type: "string",
          enum: %w[key_asc key_desc value_asc value_desc],
          description: "Sort the result. Default: key_asc (chronological).",
        },
        limit:       { type: "integer", description: "Cap the number of buckets returned. Default: 200, max: 1000." },
        dry_run:     { type: "boolean", description: "When true, return the constructed MongoDB pipeline without executing it." },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When true and the class declares an " \
                                               "agent_canonical_filter, it is prepended as a $match stage " \
                                               "so the buckets reflect only the class's 'valid state' " \
                                               "subset. Set to false to bucket the full collection." },
      },
      required: ["class_name", "field", "interval"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name:   { type: "string" },
        field:        { type: "string" },
        interval:     { type: "string" },
        operation:    { type: "string" },
        group_count:  { type: "integer", minimum: 0 },
        groups: {
          type: "array",
          items: {
            type: "object",
            properties: {
              key:   { type: %w[string null] },
              value: { type: %w[number null] },
            },
            required: %w[key value],
          },
        },
        value_field:  { type: "string" },
        timezone:     { type: "string" },
        sort:         { type: "string" },
        truncated:    { type: "boolean" },
        limit:        { type: "integer" },
      },
      required: %w[class_name field interval operation group_count groups sort limit],
    },
  },

  distinct: {
    category: "aggregate",
    name: "distinct",
    description: "Return the distinct values of a field, optionally filtered by where:. " \
                 "When the field is a pointer, the response strips the 'ClassName$' prefix " \
                 "from each value and surfaces the class once in the pointer_class envelope " \
                 "key — call get_objects with the returned ids if you need full records. " \
                 "Bounded to 1000 distinct values by default.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        field:      { type: "string", description: "Field to extract distinct values from (wire-format name; pointers auto-detected)." },
        where:      { type: "object", description: "Optional constraints applied via $match before distinct." },
        sort:       {
          type: "string",
          enum: %w[asc desc],
          description: "Sort the returned values alphanumerically. Default: server-natural order.",
        },
        limit:      { type: "integer", description: "Cap the number of distinct values returned. Default: 1000, max: 5000." },
        dry_run:    { type: "boolean", description: "When true, return the constructed MongoDB pipeline without executing it." },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When true and the class declares an " \
                                               "agent_canonical_filter, it is prepended as a $match stage " \
                                               "so values are extracted only from the class's 'valid state' " \
                                               "subset. Set to false to extract across the full collection." },
      },
      required: ["class_name", "field"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name:    { type: "string" },
        field:         { type: "string" },
        count:         { type: "integer", minimum: 0 },
        values:        { type: "array",
                         items: { type: %w[string number boolean null] } },
        pointer_class: { type: "string" },
        sort:          { type: "string" },
        truncated:     { type: "boolean" },
        limit:         { type: "integer" },
      },
      required: %w[class_name field count values limit],
    },
  },

  export_data: {
    category: "export",
    name: "export_data",
    description: "Export Parse data as CSV (default), Markdown table, or fixed-width text table. " \
                 "Use 'query' mode for simple class fetches (where/keys/order/limit) or 'aggregate' " \
                 "mode for grouped/joined queries via a MongoDB pipeline. " \
                 "Supports column aliasing via columns: — pass a string to use the field as-is, " \
                 "or {field: 'Header Name'} to rename. Dotted paths (e.g. 'subject.name') extract " \
                 "nested values when used with include:. " \
                 "IMPORTANT: hard-capped at 1000 rows by default. For large exports, first call " \
                 "count_objects to size the result, then either narrow with where:/filter pipeline " \
                 "stages or call export_data multiple times with explicit limit: and skip:. The " \
                 "tool returns truncated:true when the cap fires.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        # query mode
        where:    { type: "object" },
        keys:     { type: "array", items: { type: "string" } },
        include:  { type: "array", items: { type: "string" } },
        order:    { type: "string" },
        limit:    { type: "integer" },
        skip:     { type: "integer" },
        # aggregate mode (mutually exclusive with where/keys/order/limit)
        pipeline: { type: "array", items: { type: "object" } },
        # output control
        columns:  {
          type: "array",
          # Each entry is either a string (used as both path and header) or a
          # single-entry { "<field>" => "<Header>" } object for renaming.
          # OpenAI rejects array properties without an `items` schema
          # (`invalid_function_parameters`: "array schema missing items.").
          items: {
            oneOf: [
              { type: "string" },
              { type: "object", additionalProperties: { type: "string" } },
            ],
          },
          description: "Column spec. Each entry is either a string (field name, used as header) " \
                       "or an object {field => header} to rename. Dotted paths supported.",
        },
        format:   {
          type: "string",
          enum: %w[csv markdown table],
          description: "Output format. Defaults to 'csv'.",
        },
        row_cap: {
          type: "integer",
          description: "Maximum rows in the formatted output. Defaults to 1000. " \
                       "When the underlying query/pipeline yields more, the result is truncated " \
                       "AND the response carries truncated:true plus available_rows:N.",
        },
      },
      required: ["class_name"],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        # `format` is one of csv|markdown|table; same shape as the
        # input enum.
        format:  { type: "string", enum: %w[csv markdown table] },
        headers: { type: "array", items: { type: "string" } },
        row_count: { type: "integer", minimum: 0 },
        # The serialized output is the formatted CSV / Markdown /
        # text-table string itself — clients render it as-is.
        output:  { type: "string" },
        truncated:      { type: "boolean" },
        available_rows: { type: "integer", minimum: 0 },
        row_cap:        { type: "integer", minimum: 1 },
        hint:           { type: "string" },
      },
      required: %w[class_name format headers row_count output],
    },
  },

  atlas_text_search: {
    category: "query",
    name: "atlas_text_search",
    description: "Run a MongoDB Atlas Search $search aggregation against a Parse class — full-text search " \
                 "with relevance scoring across one or more indexed fields. Use this instead of query_class " \
                 "when the user input is natural language, typo-prone, or when you need a relevance " \
                 "ranking that simple where: constraints cannot produce. Per-row ACL is enforced by an " \
                 "automatic _rperm $match injected after $search whenever the agent carries a session_token, " \
                 "acl_user, or acl_role scope; master-key agents skip the filter (intentional). " \
                 "Default limit 10, max 20. Results carry a score:; pass highlight_field: to also receive " \
                 "highlight snippets. Note: Atlas Search runs out-of-process from mongod and has " \
                 "sub-second indexing lag — do not use for read-your-own-writes.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        query:      { type: "string", description: "Search query text. Non-empty." },
        fields:     {
          type: "array",
          items: { type: "string" },
          description: "Optional. Restrict search to these fields. When omitted, all indexed fields are " \
                       "searched. Subject to the class's agent_fields allowlist when one is declared.",
        },
        limit:      {
          type: "integer",
          description: "Optional. Max results, default 10, hard cap 20.",
        },
        highlight_field: {
          type: "string",
          description: "Optional. Field to return highlight snippets for. Subject to agent_fields allowlist.",
        },
        filter:     {
          type: "object",
          description: "Optional. Additional MongoDB filter applied after the $search stage. Same security " \
                       "validation as aggregate's pipeline: no $where / $function / $accumulator.",
        },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When true and the class declares an " \
                                               "agent_canonical_filter, it is AND-merged into the " \
                                               "post-$search $match (alongside any caller filter:) " \
                                               "so search results come from the 'valid state' subset only." },
      },
      required: %w[class_name query],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        count:      { type: "integer", minimum: 0 },
        # Each row is a Parse object projected through the class's
        # agent_fields allowlist, with an Atlas-supplied `score`
        # numeric and an optional `highlights` array when the
        # caller passes highlight_field:. The row shape is class-
        # dependent so additionalProperties is open.
        results: {
          type: "array",
          items: {
            type: "object",
            properties: {
              score: { type: "number" },
              highlights: {
                type: "array",
                items: {
                  type: "object",
                  properties: {
                    path: { type: "string" },
                    texts: {
                      type: "array",
                      items: {
                        type: "object",
                        properties: {
                          value: { type: "string" },
                          type:  { type: "string", description: "'hit' or 'text' per Atlas spec." },
                        },
                        required: %w[value],
                      },
                    },
                  },
                  required: %w[path],
                },
              },
            },
            additionalProperties: true,
          },
        },
      },
      required: %w[class_name count results],
    },
  },

  atlas_autocomplete: {
    category: "query",
    name: "atlas_autocomplete",
    description: "Atlas Search autocomplete operator — search-as-you-type prefix matching against a single " \
                 "indexed field. Use when resolving partial user input to known entity names (e.g., " \
                 "song title, artist name) for disambiguation. Requires the autocomplete analyzer to be " \
                 "configured on the named field in the search index. Per-row ACL is enforced by an " \
                 "automatic _rperm $match whenever the agent carries a session_token, acl_user, or " \
                 "acl_role scope; master-key agents skip the filter (intentional). " \
                 "Default limit 10, max 20. Returns a list of distinct suggested field values plus the " \
                 "full matching Parse objects.",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        query:      { type: "string", description: "Prefix to autocomplete against. Non-empty." },
        field:      {
          type: "string",
          description: "Field name configured for autocomplete in the search index. Must be in " \
                       "agent_fields allowlist when one is declared.",
        },
        limit:      {
          type: "integer",
          description: "Optional. Max suggestions, default 10, hard cap 20.",
        },
        fuzzy:      {
          type: "boolean",
          description: "Optional. Enable single-edit fuzzy matching. Default false.",
        },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When true and the class declares an " \
                                               "agent_canonical_filter, it is applied as a post-$search " \
                                               "$match so autocomplete suggestions exclude 'invalid state' " \
                                               "rows that the rest of the read-tool surface hides." },
      },
      required: %w[class_name query field],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        field:      { type: "string" },
        # `suggestions` is the list of distinct field values that
        # matched the autocomplete query (deduped, ordered by Atlas
        # ranking). Strings only — autocomplete operates on text.
        suggestions: { type: "array", items: { type: "string" } },
        count:       { type: "integer", minimum: 0 },
        # Full matching Parse objects, projected through the class
        # agent_fields allowlist.
        results: {
          type: "array",
          items: { type: "object", additionalProperties: true },
        },
      },
      required: %w[class_name field suggestions count results],
    },
  },

  atlas_faceted_search: {
    category: "query",
    name: "atlas_faceted_search",
    description: "Atlas Search faceted query — returns bucket counts per facet alongside a list of " \
                 "matching documents. Use for UI-style faceted browsers or summary breakdowns of an " \
                 "indexed corpus. NOTE: $searchMeta cannot enforce per-row ACL at the bucket-count " \
                 "level; this tool refuses session-scoped calls. The agent must be constructed with " \
                 "master_atlas: true to use it. Default limit 10, max 20 for the result list. " \
                 "NOTE: this tool ALSO refuses when the class declares an `agent_canonical_filter` " \
                 "or when the agent has a per-class `filters:` entry for the class — bucket counts " \
                 "would leak the filtered rows. Use atlas_text_search instead (which applies both " \
                 "filters via $match), or pass `apply_canonical_filter: false` to acknowledge the " \
                 "leak risk for the canonical declaration (the per-agent filter cannot be opted out).",
    parameters: {
      type: "object",
      properties: {
        class_name: { type: "string" },
        query:      { type: "string", description: "Optional. Search query text; pass empty for match-all." },
        facets:     {
          type: "object",
          description: "Facet definitions keyed by facet name. Each value is " \
                       "{ type: 'string'|'number'|'date', path: <field>, boundaries?: [...] }. " \
                       "Paths must be in agent_fields allowlist when one is declared.",
        },
        limit:      {
          type: "integer",
          description: "Optional. Max documents in the result list (NOT bucket counts), default 10, max 20.",
        },
        apply_canonical_filter: { type: "boolean",
                                  description: "Default true. When the class declares an " \
                                               "agent_canonical_filter, this tool refuses by default. " \
                                               "Pass false to acknowledge that $searchMeta bucket counts " \
                                               "WILL include rows the canonical filter normally hides." },
      },
      required: %w[class_name facets],
    },
    output_schema: {
      type: "object",
      properties: {
        class_name:  { type: "string" },
        # $searchMeta lower-bound count. May be approximate for very
        # large corpora — Atlas documents this; downstream clients
        # should treat it as informative, not a precise total.
        total_count: { type: "integer", minimum: 0 },
        # Facets is a Map<facet_name, { buckets: [{_id, count}] }>.
        # Bucket _id is heterogeneous (String for string facets,
        # Number/Date for numeric/date facets), so additionalProperties:true
        # on the bucket entry keeps the contract honest without
        # bloating the schema with per-type variants.
        facets: {
          type: "object",
          additionalProperties: {
            type: "object",
            properties: {
              buckets: {
                type: "array",
                items: { type: "object", additionalProperties: true },
              },
            },
            required: %w[buckets],
          },
        },
        count:   { type: "integer", minimum: 0 },
        results: {
          type: "array",
          items: { type: "object", additionalProperties: true },
        },
      },
      required: %w[class_name total_count facets count results],
    },
  },
}.freeze
BUILTIN_CATEGORIES =

Human-readable one-liners for the built-in tool categories. Used by the list_tools discovery tool to summarize what an agent can do with each category. Application-registered tools use a custom category: value via Tools.register(category: "..."); the default for un-categorized registrations is "custom".

{
  "schema"    => "Class introspection — discover available classes, fields, indexes, and permissions.",
  "query"     => "Read-only data access — fetch records, counts, samples, and execution plans.",
  "aggregate" => "MongoDB aggregation pipelines for grouping, statistics, and joins.",
  "mutation"  => "Domain-action methods declared via agent_method.",
  "export"    => "Bulk data export in CSV, Markdown, or fixed-width text.",
  "custom"    => "Application-registered tools not assigned to a built-in category.",
}.freeze
CLASS_NAME_RE =

Parse class-name regex. Parse Server requires class names to start with a letter or underscore (system classes like _User, _Role, _Session, _Installation) and contain only ASCII alphanumeric plus underscore. Cap at 128 chars to match the dispatcher's resource-URI validator. Same shape as MCPDispatcher::IDENTIFIER_RE plus support for a leading underscore to allow system classes.

/\A[A-Za-z_][A-Za-z0-9_]{0,127}\z/.freeze
OBJECT_ID_RE =

Parse objectId: 10-char alphanumeric per the JS/iOS SDKs' offline-mode format and Parse Server's own server-generated shape. Accept length 1-32 to tolerate custom-id schemes some apps use.

/\A[A-Za-z0-9_-]{1,32}\z/.freeze
METHOD_NAME_RE =

agent_method names. Same shape as Ruby method names with a length cap.

/\A[A-Za-z_][A-Za-z0-9_]{0,127}[!?=]?\z/.freeze
CALL_METHOD_DENIED_KEYS =

Keys that are NEVER permitted as +arguments+ to +call_method+, regardless of the method's declared +permitted_keys+. These reference Parse-internal columns (auth/credential state, ACL, identifiers managed by Parse Server) and have no legitimate use case being set by an LLM through a wrapper method that splats its arguments with +**+.

%i[
  _hashed_password _password_history
  authData auth_data _auth_data
  sessionToken session_token _session_token
  ACL acl _rperm _wperm
  _perishable_token _email_verify_token
  objectId id
  createdAt created_at updatedAt updated_at
  className __type
].freeze
AGENT_METHOD_RESERVED_ARG_KEYS =

Framework-reserved keys that are always permitted in +arguments+ regardless of the method's +permitted_keys+ list. +dry_run+ is gated separately by +supports_dry_run+ on the method definition. Reserved kwargs the agent dispatcher may inject into an +agent_method+ body. +dry_run+ is forwarded when the method declared +supports_dry_run: true+. +agent+ is injected whenever the method's signature declares the keyword (so the author can read +agent.acl_scope_kwargs+ / +agent.acl_scope+ and apply the agent's identity to internal queries the method runs). Neither key counts against +permitted_keys:+.

%i[dry_run agent].freeze
ALLOWLIST_PREVIEW_CAP =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Build a structured refusal message for an allowlist violation. The message includes:

  • the stage context (e.g., "field", "sort field", "match field")
  • the offending name and its first path segment
  • the actual allowlist for the class (capped at 20 names so the message stays compact on wide schemas)
  • a one-shot rewrite hint when the violation looks like a Parse-on-Mongo storage-form reference ($_p_foo). The mongo storage column is the implementation detail; the bare pointer field name ($foo) is what the agent should reference.
20
POINTER_STORAGE_VALUE_RE =

Parse-on-Mongo pointer column shape: a string value paired with a _p_<field> column whose value matches <ClassName>$<objectId>. Raw aggregate results expose this storage form unchanged. The standard walk_and_redact matches hash shapes with className, so the string form bypasses redaction unless we look for it explicitly.

/\A([A-Za-z_][A-Za-z0-9_]*)\$(.+)\z/.freeze
QUERY_CLASS_FORMATS =

Query objects from a Parse class

Valid formats for the +format:+ kwarg on query_class. "json" is the default and returns the structured row envelope; the others delegate to the export_data formatters so the conversational query path can produce a CSV/Markdown/text-table dump without round-tripping through a separate tool.

Returns:

  • (Hash)

    query results, or a refusal hash if COLLSCAN detected

Raises:

%w[json csv markdown table].freeze
AGGREGATE_DEFAULT_LIMIT =

Run an aggregation pipeline

Default safety cap for aggregation result rows when the caller's pipeline doesn't already terminate in a $limit or $count stage. aggregate injects an auto-$limit at this size so a $group over a high-cardinality field can't fire back tens of thousands of rows to the LLM. Conservatively sized for chat-context safety.

Returns:

  • (Hash)

    aggregation results, or a refusal hash if COLLSCAN detected

Raises:

200
AGGREGATE_DEFAULT_MONGO_DIRECT =

Default routing for #aggregate: when true, the agent tool sends the assembled pipeline through MongoDB.aggregate (direct MongoDB) and applies the direct-MongoDB field-reference rewriter so $author reaches $_p_author. When false, the pipeline goes to the Parse Server REST aggregate endpoint and field translation is whatever the server provides. The toggle is plumbed per-call via the mongo_direct: kwarg; the default lives here so a deployment without MongoDB configured still has a single switch to flip back.

true
GROUP_DEFAULT_LIMIT =

Per-tool result caps. Distinct often spans more values than a grouped count (think tags, statuses, customer ids), so it gets a larger ceiling while group_by/group_by_date stay aligned with AGGREGATE_DEFAULT_LIMIT for context-safety.

200
GROUP_MAX_LIMIT =
1000
DISTINCT_DEFAULT_LIMIT =
1000
DISTINCT_MAX_LIMIT =
5000
GROUP_OPERATIONS =

Supported aggregation operations for group_by / group_by_date. Maps the LLM-facing name to the MongoDB accumulator operator.

{
  "count"   => "$sum",   # value_field is ignored; accumulator is { $sum: 1 }
  "sum"     => "$sum",
  "avg"     => "$avg",
  "average" => "$avg",
  "min"     => "$min",
  "max"     => "$max",
}.freeze
GROUP_DATE_INTERVALS =
%w[year month week day hour minute second].freeze
DEFAULT_EXPORT_ROW_CAP =

Export Parse data as CSV, Markdown table, or fixed-width text table.

Two modes:

  • Query mode (default): pass where:, keys:, include:, order:, limit:. Underlying call is the same as query_class so every access-control gate (agent_hidden, agent_fields allowlist, include-path resolution, post-fetch redaction) applies.
  • Aggregate mode: pass pipeline: (mutually exclusive with the query-mode args). Underlying call is the same as aggregate so the pipeline access policy walker ($lookup into hidden classes, field- level allowlist on $project/$addFields etc.) applies.

Column control:

  • columns: is an ordered array of column specs. Each spec is either a String (field name, used as the header) or a Hash {field => header} to alias. Dotted paths ("subject.name") extract nested values from include-resolved pointer fields.
  • When columns: is nil, headers are inferred from the first row's keys (excluding Parse-internal fields like __type).

Output:

  • format: "csv" (default) — RFC 4180 CSV via Ruby's stdlib csv.
  • format: "markdown" — GFM-style pipe table with --- separator row.
  • format: "table" — fixed-width ASCII table with +---+ borders.

Default cap on rows in the formatted output. Sized so that even a wide-schema CSV (10-15 columns) stays under ~80 KB / ~20k tokens — large enough for a sensible export, small enough to not dominate an LLM context window. Override via the row_cap: parameter up to MAX_EXPORT_ROW_CAP.

Returns:

  • (Hash)

    { class_name:, format:, headers:, row_count:, output: "..." }

1_000
MAX_EXPORT_ROW_CAP =

Hard ceiling on row_cap regardless of caller override. Past this point the LLM should be calling a different surface (operator-run rake mcp:tool[export_data,...], application-level DB export, etc). Even at this ceiling the dispatcher's MAX_TOOL_RESPONSE_BYTES (4 MiB) may still trim oversized output.

10_000
EXPORT_DENIED_COLUMN_PREFIXES =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Field paths that may never appear in an export, regardless of any per-class allowlist. Closes NEW-TOOLS-8: a caller that paired columns: ["_session_token"] with keys: [...] would extract internal Parse-Server fields directly through the export pipeline, bypassing the skip-list in infer_export_columns_from which only filters fields that were NOT explicitly requested.

%w[
  _hashed_password _session_token _perishable_token _email_verify_token
  _email_verify_token_expires_at _password_history bcryptPassword
  authData _rperm _wperm ACL _account_lockout_expires_at
].freeze
EXPORT_COLUMN_PATH_RE =

Identifier pattern shared with validate_keys! / validate_include!. Enforces letter-prefixed dotted paths so the wire format passes through Parse Server's projector cleanly.

/\A[A-Za-z][A-Za-z0-9_.]{0,127}\z/.freeze

Class Method Summary collapse

Instance Method Summary collapse

Class Method Details

.all_tool_namesArray<Symbol>

Returns all tool names: builtins + registered.

Returns:



1300
1301
1302
1303
1304
# File 'lib/parse/agent/tools.rb', line 1300

def all_tool_names
  builtin = TOOL_DEFINITIONS.keys
  registered = REGISTRY_MUTEX.synchronize { @registry.keys }
  (builtin + registered).uniq
end

.append_sort_limit!(pipeline, sort_choice:, cap:, default_sort: nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Append wire-side $sort and $limit stages so the database — not Ruby — does the truncation. Without this, a group_by over a high-cardinality field returns every group over the wire before Ruby truncates, blowing up both bandwidth and tool-response budgets.

We append $limit at cap + 1 so the handler can still detect truncation server-side (received cap+1 rows => more existed). Sort vocabulary mirrors the LLM-facing surface: value_desc/value_asc — sort by the $group accumulator output key_desc/key_asc — sort by the $group _id default_sort applies when the caller didn't specify one (used by group_by_date for chronological "key_asc").



4288
4289
4290
4291
4292
4293
4294
4295
4296
4297
# File 'lib/parse/agent/tools.rb', line 4288

def append_sort_limit!(pipeline, sort_choice:, cap:, default_sort: nil)
  effective_sort = sort_choice || default_sort
  if effective_sort
    direction = effective_sort.end_with?("_desc") ? -1 : 1
    key       = effective_sort.start_with?("value") ? "value" : "_id"
    pipeline << { "$sort" => { key => direction } }
  end
  pipeline << { "$limit" => cap + 1 }
  pipeline
end

.apply_canonical_filter_to_pipeline(pipeline, class_name, agent: nil) ⇒ Object

Prepend the class's canonical filter as a $match stage. Inserts AFTER any leading tenant-scope $match so tenant isolation stays at index 0 for auditability. TRACK-AGENT-7 split — the per-agent half now lives in apply_per_agent_filter_to_pipeline.



1793
1794
1795
1796
# File 'lib/parse/agent/tools.rb', line 1793

def apply_canonical_filter_to_pipeline(pipeline, class_name, agent: nil)
  canonical = Parse::Agent::MetadataRegistry.canonical_filter(class_name)
  compose_filter_into_pipeline(pipeline, canonical)
end

.apply_canonical_filter_to_where(where, class_name, agent: nil) ⇒ Object

Merge the class's canonical "valid state" filter (declared via agent_canonical_filter) into the caller's where expression. This is per-CLASS metadata — the LLM-facing kwarg apply_canonical_filter: (default true) controls whether this helper runs.

IMPORTANT: TRACK-AGENT-7 split — historically this helper ALSO applied the per-agent filter, and a passed apply_canonical_filter: false would silently drop both. The per-agent half now lives in apply_per_agent_filter_to_where which is called UNCONDITIONALLY by every tool. Do not re-collapse them.



1744
1745
1746
1747
# File 'lib/parse/agent/tools.rb', line 1744

def apply_canonical_filter_to_where(where, class_name, agent: nil)
  canonical = Parse::Agent::MetadataRegistry.canonical_filter(class_name)
  compose_filter_into_where(where, canonical, class_name, helper_name: "apply_canonical_filter_to_where")
end

.apply_include_projection(class_name, keys, include_arr) ⇒ Hash

Auto-projection for keys: + include:. When the caller passed a keys: projection AND named a bare pointer field in both keys: and include:, expand keys to dotted-path projections of the joined class so Parse Server returns only the projected subfields of the included record instead of the entire row.

The expansion is suppressed when the caller passed any <pointer>.* dotted path of their own — that's the explicit "I named exactly what I want" signal. It's also suppressed when no keys: was passed at all (caller chose full-row mode) and when the include is multi-hop (author.workspace); auto-projection is one-hop only.

Pointer-to-joined-class resolution uses the parent class's references reflection (same source walk_pointer_path! uses), accepting both snake_case Ruby names and camelCase wire names so callers can pass either form in include:.

Parameters:

  • class_name (String)

    the parent Parse class name

  • keys (Array<String>, nil)

    caller-supplied keys (post-validation)

  • include_arr (Array<String>, nil)

    caller-supplied include (post-validation)

Returns:

  • (Hash)

    Array or nil, truncated: Hash effective_keys is the rewritten keys array (or the input verbatim if no expansion fired). truncated is a hash keyed by pointer wire-name with metadata {dropped:, source:} per expanded pointer — used to populate the response envelope's truncated_include_fields so the LLM can see which joins were auto-projected and re-ask via dotted paths if it needs more.



2486
2487
2488
2489
2490
2491
2492
2493
2494
2495
2496
2497
2498
2499
2500
2501
2502
2503
2504
2505
2506
2507
2508
2509
2510
2511
2512
2513
2514
2515
2516
2517
2518
2519
2520
2521
2522
2523
2524
2525
2526
2527
2528
2529
2530
2531
2532
2533
2534
2535
2536
2537
2538
2539
2540
2541
2542
2543
2544
2545
2546
2547
2548
2549
2550
2551
2552
# File 'lib/parse/agent/tools.rb', line 2486

def apply_include_projection(class_name, keys, include_arr)
  result = { effective_keys: keys, truncated: {} }
  return result if keys.nil? || keys.empty?
  return result if include_arr.nil? || include_arr.empty?

  parent_klass = begin
      Parse::Model.find_class(class_name.to_s)
    rescue StandardError
      nil
    end
  return result unless parent_klass.respond_to?(:references)

  keys_str = keys.map(&:to_s)
  # Index dotted paths per top-level segment so we can detect
  # caller-explicit per-pointer projection ("user.iconImage" suppresses
  # auto-expansion for `user`).
  caller_dotted = keys_str.each_with_object(Hash.new { |h, k| h[k] = [] }) do |k, h|
    next unless k.include?(".")
    root, = k.split(".", 2)
    h[root] << k
  end

  # Only consider one-hop includes; multi-hop (`author.workspace`) leaves
  # the author projection alone and lets the deeper hop materialize
  # fully — keeps the auto-expansion bounded and avoids walking the
  # full relation graph at query time.
  single_hop_includes = include_arr.map(&:to_s).reject { |p| p.include?(".") }
  return result if single_hop_includes.empty?

  appended = []
  truncated = {}

  single_hop_includes.each do |pointer_field|
    ptr_str = pointer_field.to_s
    # Caller named the pointer in `keys:` — otherwise the auto-projection
    # has nothing to attach to. (`include: ["user"]` without `keys:
    # ["user"]` is caller error; Parse Server wouldn't return the
    # pointer column at all in that case.)
    next unless keys_str.include?(ptr_str)
    # Caller passed `<pointer>.something` — explicit projection mode,
    # leave their dotted paths alone, do not auto-expand.
    next if caller_dotted[ptr_str].any?

    target_class = resolve_pointer_target(parent_klass, ptr_str)
    next unless target_class

    projection = Parse::Agent::MetadataRegistry.join_projection_fields(target_class)
    next unless projection

    # Build dotted-path projections; dedupe to avoid wire bloat if the
    # caller already named some of these (e.g. they passed `user.email`
    # earlier but we just rejected the auto-expansion — defensive
    # belt-and-suspenders since the rejection branch above also short-
    # circuits).
    new_paths = projection[:project].map { |field| "#{ptr_str}.#{field}" }
    appended.concat(new_paths)
    truncated[ptr_str] = {
      dropped: projection[:dropped],
      source: projection[:source],
    }
  end

  return result if appended.empty?

  effective = (keys_str | appended)
  { effective_keys: effective, truncated: truncated }
end

.apply_per_agent_filter_to_pipeline(pipeline, class_name, agent: nil) ⇒ Object

Prepend the per-agent per-class filter (declared via Parse::Agent.new(filters: { Class => {...} })) as a $match stage. UNCONDITIONAL — no LLM kwarg can disable it. Inserts AFTER any leading tenant-scope $match so tenant isolation stays at index 0 for auditability.



1782
1783
1784
1785
# File 'lib/parse/agent/tools.rb', line 1782

def apply_per_agent_filter_to_pipeline(pipeline, class_name, agent: nil)
  per_agent = agent && agent.respond_to?(:filter_for) ? agent.filter_for(class_name) : nil
  compose_filter_into_pipeline(pipeline, per_agent)
end

.apply_per_agent_filter_to_where(where, class_name, agent: nil) ⇒ Object

Merge the per-agent per-class filter (declared via Parse::Agent.new(filters: { Class => {...} })) into the caller's where expression. This is the OPERATOR's per-agent scoping and is UNCONDITIONAL — no LLM-controllable kwarg can disable it. The caller's constraints compose with the per-agent filter via a top-level $and.

Returns the where unchanged when no per-agent filter applies.



1726
1727
1728
1729
# File 'lib/parse/agent/tools.rb', line 1726

def apply_per_agent_filter_to_where(where, class_name, agent: nil)
  per_agent = agent && agent.respond_to?(:filter_for) ? agent.filter_for(class_name) : nil
  compose_filter_into_where(where, per_agent, class_name, helper_name: "apply_per_agent_filter_to_where")
end

.apply_tenant_scope_to_pipeline(pipeline, scope) ⇒ Array<Hash>

Prepend a $match tenant scope stage at index 0 of an aggregation pipeline. Always prepends unconditionally — pipeline-level interaction with existing $match stages is safe because any later stage with the wrong tenant produces empty results, not cross-tenant leak.

Parameters:

  • pipeline (Array<Hash>)
  • scope (Hash, nil)

    { field: Symbol, value: Object }

Returns:



1681
1682
1683
1684
1685
1686
1687
1688
1689
1690
# File 'lib/parse/agent/tools.rb', line 1681

def apply_tenant_scope_to_pipeline(pipeline, scope)
  return pipeline unless scope
  # MongoDB aggregation pipelines use camelCase field names (same wire format
  # as Parse Server's stored field names). Camelize the scope field so the
  # prepended $match stage is structurally equivalent to what the LLM would
  # write when querying by the field directly.
  field     = scope[:field].to_s
  wire_key  = field.gsub(/_([a-z])/) { Regexp.last_match(1).upcase }
  [{ "$match" => { wire_key => scope[:value] } }] + pipeline
end

.apply_tenant_scope_to_where(where, scope, class_name) ⇒ Hash?

Merge tenant scope into a caller-supplied where: hash (or nil).

Three outcomes:

  1. Scope field not present in caller where → inject { field => value }.
  2. Scope field present with matching value (string or symbol key) → pass through.
  3. Scope field present with any other value → refuse with AccessDenied (caller is attempting to spoof the tenant filter).

Returns the effective merged where hash (or nil if no scope and no where).

Parameters:

  • where (Hash, nil)

    caller-supplied constraints

  • scope (Hash, nil)

    { field: Symbol, value: Object } from resolve_tenant_scope!

  • class_name (String)

    for the AccessDenied message

Returns:

Raises:



1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
# File 'lib/parse/agent/tools.rb', line 1635

def apply_tenant_scope_to_where(where, scope, class_name)
  return where unless scope
  field     = scope[:field]
  value     = scope[:value]
  field_str = field.to_s
  field_sym = field.to_sym
  # Also check the camelCase wire form (e.g. org_id -> orgId) because an LLM
  # may pass the field using the Parse Server wire-format key rather than the
  # Ruby snake_case name.  Without this check a caller could pass
  # where: {"orgId" => "evil"} which would be silently treated as case-1
  # (absent) and the injected snake_case key would allow both keys to reach
  # ConstraintTranslator, potentially letting the camelCase value win.
  camel_str = field_str.gsub(/_([a-z])/) { Regexp.last_match(1).upcase }
  camel_sym = camel_str.to_sym

  where_h = (where || {})
  # Collect candidate values from all four key forms (snake str/sym, camel str/sym).
  candidate_keys   = [field_str, field_sym, camel_str, camel_sym]
  present_keys     = candidate_keys.select { |k| where_h.key?(k) }
  caller_value     = present_keys.any? ? where_h[present_keys.first] : nil
  field_present    = present_keys.any?

  if !field_present
    # Case 1: field absent in any form — inject using snake_case string key.
    (where_h.dup).tap { |h| h[field_str] = value }
  elsif caller_value == value
    # Case 2: field present with the correct tenant value — pass through.
    where_h
  else
    # Case 3: field present with different value (or operator hash) — refuse.
    raise Parse::Agent::AccessDenied.new(
      class_name,
      "Tenant scope conflict: cannot override '#{field_str}' constraint for class '#{class_name}'",
    )
  end
end

.assert_class_accessible!(class_name, agent: nil, op: nil) ⇒ Object

Refuse access to a class marked agent_hidden, AND validate that the supplied class_name has a well-formed identifier shape. Raised at every tool entry that accepts a class_name argument so an LLM cannot bypass the catalog filter by naming a hidden class directly OR smuggle malformed identifiers ("_User?foo=bar", "Foo; DROP TABLE") through to Parse Server's URL path. The exception is caught by Parse::Agent#execute and translated to a sanitized error_response.

Parameters:

  • agent (Parse::Agent, nil) (defaults to: nil)

    when present, the agent's auth context is consulted to honor agent_hidden(except: :master_key) declarations. A class declared with agent_hidden(except: :master_key) is reachable by master-key agents (no session_token binding) but refused for session-bound ones. Passing nil falls back to the strict "every hidden class is denied" behavior, used at sites where no agent is in scope (e.g. registry introspection). Callers must propagate agent: to preserve the except-scope; the field-level INTERNAL_FIELDS_DENYLIST floor applies regardless of this gate.



1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
# File 'lib/parse/agent/tools.rb', line 1510

def assert_class_accessible!(class_name, agent: nil, op: nil)
  return if class_name.nil? || class_name.to_s.empty?

  # NEW-TOOLS-9: identifier-format check at the boundary.
  unless CLASS_NAME_RE.match?(class_name.to_s)
    raise Parse::Agent::ValidationError,
          "class_name #{class_name.inspect} is not a valid identifier. " \
          "Must start with a letter or underscore and contain only " \
          "letters, digits, and underscores (max 128 chars)."
  end

  if Parse::Agent::MetadataRegistry.hidden?(class_name)
    # `agent_hidden(except: :master_key)` — permit only when this agent
    # actually runs under master key. A session-bound agent is refused
    # regardless of operator intent; this is the "user-facing MCP never
    # sees Session, dev-MCP can" axis. The gate keys on
    # auth_context[:using_master_key], NOT on session_token emptiness —
    # an acl_user / acl_role agent also has an empty session_token but
    # is NOT master-key; using the old empty-string test would have
    # silently elevated those scoped agents past the gate.
    except = Parse::Agent::MetadataRegistry.hidden_exception_for(class_name)
    using_master_key = agent && agent.respond_to?(:auth_context) &&
                       agent.auth_context[:using_master_key] == true
    unless except == :master_key && using_master_key
      raise Parse::Agent::AccessDenied.new(class_name)
    end
  end

  # Per-agent `classes:` allowlist — deny wins, allowlist is the ceiling
  # not the floor. The global hidden gate above has already been
  # consulted (including the master-key-except bypass). This check
  # additionally enforces the operator's per-agent narrowing — a class
  # the global registry would permit is still refused if outside the
  # agent's `only:` set or inside its `except:` set. The carved-out
  # `kind: :class_filter` lets SOC tooling distinguish operator
  # narrowing from policy-level denials.
  if agent && !agent.class_filter_permits?(class_name)
    raise Parse::Agent::AccessDenied.new(
      class_name,
      "Class '#{class_name}' is outside this agent's classes: allowlist",
      kind: :class_filter
    )
  end

  # Class-Level Permissions (CLP) gate. When the agent declares a
  # scope (session_token / acl_user / acl_role) and the tool
  # specifies which CLP operation it represents, refuse at the
  # boundary if the class's CLP doesn't grant that op to any
  # entry in the agent's claim set. Master-key posture
  # (`acl_permission_strings` is nil) bypasses — same contract
  # as Parse::ACLScope. Lookup failures fail-open (Parse Server
  # still enforces CLP on its own REST surface).
  if op && agent && agent.respond_to?(:acl_permission_strings)
    perms = agent.acl_permission_strings
    unless Parse::CLPScope.permits?(class_name.to_s, op, perms)
      raise Parse::Agent::AccessDenied.new(
        class_name,
        "Class '#{class_name}' CLP refuses #{op} for the agent's scope.",
        kind: :clp_denied,
      )
    end
  end
end

.assert_fields_in_allowlist!(class_name, fields) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Verify each referenced field is within agent_fields (or the always-keep set) when an allowlist is declared on the class.



4041
4042
4043
4044
4045
4046
4047
4048
4049
4050
4051
4052
4053
4054
# File 'lib/parse/agent/tools.rb', line 4041

def assert_fields_in_allowlist!(class_name, fields)
  allowlist = MetadataRegistry.field_allowlist(class_name)
  return if allowlist.nil? || allowlist.empty?
  permitted = allowlist.map(&:to_s) | MetadataRegistry::ALWAYS_KEEP_FIELDS
  Array(fields).each do |raw|
    root = raw.to_s.sub(/\A_p_/, "").split(".").first
    next if root.nil? || root.empty?
    unless permitted.include?(root)
      raise Parse::Agent::AccessDenied.new(
        build_allowlist_refusal("field", raw.to_s, root, permitted),
      )
    end
  end
end

.assert_include_paths_accessible!(class_name, include_paths, agent: nil) ⇒ Object

Resolve each dotted include: path through belongs_to / has_one reflections starting from class_name and refuse any path whose terminal target is a hidden class. Best-effort: paths that don't resolve through declared references (e.g., free-form server-side names) fall through; the post-fetch redact_hidden_classes! walker provides defense-in-depth for those.



2443
2444
2445
2446
2447
2448
2449
2450
2451
2452
2453
2454
2455
2456
# File 'lib/parse/agent/tools.rb', line 2443

def assert_include_paths_accessible!(class_name, include_paths, agent: nil)
  return if include_paths.nil? || include_paths.empty?
  return unless class_name
  klass = begin
      Parse::Model.find_class(class_name.to_s)
    rescue StandardError
      nil
    end
  return unless klass.respond_to?(:references)

  include_paths.each do |path|
    walk_pointer_path!(klass, path.to_s.split("."), agent: agent)
  end
end

.assert_method_name!(method_name) ⇒ Object

NEW-TOOLS-9: validate a method_name argument format.



1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
# File 'lib/parse/agent/tools.rb', line 1591

def assert_method_name!(method_name)
  return if method_name.nil?
  s = method_name.to_s
  return if s.empty?
  unless METHOD_NAME_RE.match?(s)
    raise Parse::Agent::ValidationError,
          "method_name #{method_name.inspect} is not a valid identifier. " \
          "Must start with a letter or underscore and contain only " \
          "letters, digits, and underscores (max 128 chars, optional !/?/= suffix)."
  end
end

.assert_object_id!(object_id) ⇒ Object

NEW-TOOLS-9: validate an object_id argument format. objectIds in Parse are usually 10-char alphanumeric; some apps use custom-id schemes so we accept up to 32 chars with the broader URL-safe character set.



1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
# File 'lib/parse/agent/tools.rb', line 1578

def assert_object_id!(object_id)
  return if object_id.nil?
  s = object_id.to_s
  return if s.empty?  # downstream tools enforce required-ness separately
  unless OBJECT_ID_RE.match?(s)
    raise Parse::Agent::ValidationError,
          "object_id #{object_id.inspect} is not a valid identifier. " \
          "Must be 1-32 characters of letters, digits, hyphens, or underscores."
  end
end

.assert_output_key_not_internal!(field_name, stage) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Refuse output keys mirroring internal Parse Server columns. Sources the superset denylist from PipelineSecurity (which already covers ConstraintTranslator's where-key denylist plus the extra _tombstone/sessionToken/session_token shapes).



2095
2096
2097
2098
2099
2100
2101
2102
2103
2104
2105
2106
# File 'lib/parse/agent/tools.rb', line 2095

def assert_output_key_not_internal!(field_name, stage)
  fname = field_name.to_s
  if Parse::PipelineSecurity::INTERNAL_FIELDS_DENYLIST.include?(fname) ||
     Parse::PipelineSecurity::INTERNAL_FIELDS_PREFIX_DENYLIST.any? { |p| fname.start_with?(p) }
    raise Parse::Agent::AccessDenied.new(
      nil,
      "#{stage} output key '#{fname}' mirrors an internal Parse Server column",
      kind: :denied_field_key,
      denied_field: fname,
    )
  end
end

.assert_record_in_tenant_scope!(record, scope, class_name) ⇒ Object

Verify that a fetched record's scope field matches the bound scope value. A missing field is treated as a mismatch — not a pass-through.

Parameters:

  • record (Hash)

    a raw result hash from Parse

  • scope (Hash, nil)

    { field: Symbol, value: Object }

  • class_name (String)

    for the AccessDenied message

Raises:



1821
1822
1823
1824
1825
1826
1827
1828
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
# File 'lib/parse/agent/tools.rb', line 1821

def assert_record_in_tenant_scope!(record, scope, class_name)
  return unless scope
  field     = scope[:field].to_s
  value     = scope[:value]
  # Parse Server returns camelCase field names on the wire (e.g. orgId for
  # the Ruby field org_id). Check both forms so the gate works regardless
  # of whether the record came from a real server or a test fake.
  camel_field = field.gsub(/_([a-z])/) { Regexp.last_match(1).upcase }
  rec_value   = if record.is_a?(Hash)
      record.key?(field) ? record[field] : record[camel_field]
    end
  unless rec_value == value
    raise Parse::Agent::AccessDenied.new(
      class_name,
      "Object does not belong to the agent's tenant scope for class '#{class_name}'",
    )
  end
end

.assert_where_fields_in_allowlist!(class_name, where) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Walk the where: hash and refuse any top-level key (or nested $and/$or/$nor child key) outside the agent_fields allowlist. Closes the same oracle hole that the aggregate pipeline walker closes for $match — see check_match_keys_for_restricted_fields!.



4029
4030
4031
4032
4033
4034
4035
# File 'lib/parse/agent/tools.rb', line 4029

def assert_where_fields_in_allowlist!(class_name, where)
  return unless where.is_a?(Hash) && !where.empty?
  allowlist = MetadataRegistry.field_allowlist(class_name)
  return if allowlist.nil? || allowlist.empty?
  permitted = allowlist.map(&:to_s) | MetadataRegistry::ALWAYS_KEEP_FIELDS
  check_match_keys_for_restricted_fields!(where, permitted)
end

.atlas_autocomplete(agent, class_name:, query:, field:, limit: nil, fuzzy: nil, apply_canonical_filter: true, **_kwargs) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



5431
5432
5433
5434
5435
5436
5437
5438
5439
5440
5441
5442
5443
5444
5445
5446
5447
5448
5449
5450
5451
5452
5453
5454
5455
5456
5457
5458
5459
5460
5461
# File 'lib/parse/agent/tools.rb', line 5431

def atlas_autocomplete(agent, class_name:, query:, field:, limit: nil, fuzzy: nil,
                       apply_canonical_filter: true, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  unless query.is_a?(String) && !query.strip.empty?
    raise Parse::Agent::ValidationError, "query must be a non-empty string"
  end
  unless field.is_a?(String) || field.is_a?(Symbol)
    raise Parse::Agent::ValidationError, "field must be a String or Symbol"
  end
  assert_atlas_field_allowed!(class_name, field, kind: :field)
  limit = clamp_atlas_limit(limit)
  auth = atlas_auth_options!(agent, tool: :atlas_autocomplete)

  # TRACK-AGENT-6 / TRACK-AGENT-7 fix: weave per-agent (UNCONDITIONAL)
  # and canonical (LLM-controllable) filters into the existing
  # `filter:` channel — Parse::AtlasSearch.autocomplete emits it as
  # a $match stage AFTER the ACL $match (see atlas_search.rb:385-388).
  effective_filter = compose_atlas_filter(
    nil, class_name, agent: agent,
    apply_canonical_filter: apply_canonical_filter,
  )

  opts = { limit: limit }.merge(auth)
  opts[:fuzzy] = fuzzy if fuzzy
  opts[:filter] = effective_filter if effective_filter

  with_timeout(:atlas_autocomplete) do
    result = invoke_atlas_search(:autocomplete, class_name, query, opts, field: field.to_s)
    format_atlas_autocomplete_results(class_name, field.to_s, result)
  end
end

.atlas_faceted_search(agent, class_name:, facets:, query: "", limit: nil, apply_canonical_filter: true, **_kwargs) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



5464
5465
5466
5467
5468
5469
5470
5471
5472
5473
5474
5475
5476
5477
5478
5479
5480
5481
5482
5483
5484
5485
5486
5487
5488
5489
5490
5491
5492
5493
5494
5495
5496
5497
5498
5499
5500
5501
5502
5503
5504
5505
5506
5507
5508
5509
5510
5511
5512
5513
5514
5515
5516
5517
5518
5519
5520
5521
5522
5523
5524
# File 'lib/parse/agent/tools.rb', line 5464

def atlas_faceted_search(agent, class_name:, facets:, query: "", limit: nil,
                         apply_canonical_filter: true, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  # Faceted Atlas Search cannot ACL-filter $searchMeta bucket
  # counts (see Parse::AtlasSearch::FacetedSearchNotACLSafe), so
  # this tool requires the explicit master_atlas: true opt-in
  # even when the agent has a session token. A session-bound
  # agent that legitimately needs facets should fall back to
  # multiple atlas_text_search calls with explicit filters.
  unless agent.respond_to?(:master_atlas?) && agent.master_atlas?
    raise Parse::Agent::ValidationError,
          "Tool 'atlas_faceted_search' requires the agent to be constructed " \
          "with master_atlas: true. $searchMeta bucket counts cannot enforce " \
          "per-row ACL; this tool refuses session-scoped calls so per-user " \
          "bucket leakage is impossible."
  end

  # TRACK-AGENT-6 / TRACK-AGENT-7 fix: $searchMeta bucket counts
  # cannot be filtered through a post-search $match because the
  # documents are not in the output stream — the SAME structural
  # constraint that makes session-token ACL unsafe also makes the
  # per-agent / canonical filter unsafe for buckets. Fail-closed:
  # refuse the call entirely when either filter is declared on
  # the class AND the LLM has not opted out of the canonical
  # filter. Bucket counts would otherwise leak hidden / archived
  # rows that every other read tool excludes.
  per_agent = agent && agent.respond_to?(:filter_for) ? agent.filter_for(class_name) : nil
  canonical = Parse::Agent::MetadataRegistry.canonical_filter(class_name)
  active_filters = []
  active_filters << "per-agent filter (filters: kwarg)" if per_agent && !per_agent.empty?
  active_filters << "canonical filter (agent_canonical_filter)" if apply_canonical_filter && canonical && !canonical.empty?
  unless active_filters.empty?
    raise Parse::Agent::AccessDenied.new(
      class_name,
      "atlas_faceted_search cannot enforce #{active_filters.join(' / ')} on " \
      "$searchMeta bucket counts (the matched documents are not in the output " \
      "stream). Use atlas_text_search (which applies these filters via $match) " \
      "or pass apply_canonical_filter: false (and arrange your agent without a " \
      "per-class filter for #{class_name}) to acknowledge the bucket-count leak risk.",
      kind: :atlas_facet_filter_unsafe,
    )
  end

  unless facets.is_a?(Hash) && !facets.empty?
    raise Parse::Agent::ValidationError, "facets must be a non-empty Hash"
  end
  normalize_atlas_facet_paths!(class_name, facets)
  limit = clamp_atlas_limit(limit)

  with_timeout(:atlas_faceted_search) do
    # faceted_search refuses session_token: outright (see
    # Parse::AtlasSearch::FacetedSearchNotACLSafe); pass master:
    # true unconditionally here, since the agent-level gate
    # above already enforced master_atlas?.
    result = Parse::AtlasSearch.faceted_search(
      class_name, query.to_s, facets,
      limit: limit, master: true,
    )
    format_atlas_faceted_results(class_name, result)
  end
end

.atlas_text_search(agent, class_name:, query:, fields: nil, limit: nil, highlight_field: nil, filter: nil, apply_canonical_filter: true, **_kwargs) ⇒ Object

The Tools module declares private at the top of its helper section (above), so module-level methods defined past that point are private instance methods by default. Each Atlas Search entry point is re-exposed as a module function below so the dispatcher (Tools.atlas_text_search(...) etc.) and the test suite can call it directly.



5393
5394
5395
5396
5397
5398
5399
5400
5401
5402
5403
5404
5405
5406
5407
5408
5409
5410
5411
5412
5413
5414
5415
5416
5417
5418
5419
5420
5421
5422
5423
5424
5425
5426
5427
5428
# File 'lib/parse/agent/tools.rb', line 5393

def atlas_text_search(agent, class_name:, query:, fields: nil, limit: nil,
                             highlight_field: nil, filter: nil,
                             apply_canonical_filter: true, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  unless query.is_a?(String) && !query.strip.empty?
    raise Parse::Agent::ValidationError, "query must be a non-empty string"
  end
  if highlight_field
    assert_atlas_field_allowed!(class_name, highlight_field, kind: :highlight_field)
  end
  limit = clamp_atlas_limit(limit)
  auth = atlas_auth_options!(agent, tool: :atlas_text_search)
  fields_norm = normalize_atlas_fields_with_allowlist!(class_name, fields)

  # TRACK-AGENT-6 / TRACK-AGENT-7 fix: per-agent filter is
  # UNCONDITIONAL; canonical filter is LLM-controllable via
  # apply_canonical_filter:. Both compose into the existing
  # `filter:` channel that Parse::AtlasSearch.search emits as
  # a post-$search $match stage (AFTER the ACL $match). The
  # caller's filter, when supplied, AND-merges so neither
  # half can shadow the other.
  effective_filter = compose_atlas_filter(
    filter, class_name, agent: agent,
    apply_canonical_filter: apply_canonical_filter,
  )

  opts = { limit: limit }.merge(auth)
  opts[:fields] = fields_norm if fields_norm
  opts[:filter] = effective_filter if effective_filter
  opts[:highlight_field] = highlight_field if highlight_field

  with_timeout(:atlas_text_search) do
    result = invoke_atlas_search(:search, class_name, query, opts)
    format_atlas_text_search_results(class_name, result, highlight_field: highlight_field)
  end
end

.build_allowlist_refusal(context, fname, root, permitted_fields) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Build the prose refusal message AND the structured fields the exception carries (kind, denied_field, allowed_fields, suggested_rewrite). The structured fields land in the response envelope under details: so MCP consumers can branch on the specific subcode without parsing prose.



2391
2392
2393
2394
2395
2396
2397
2398
2399
2400
2401
2402
2403
2404
2405
2406
2407
2408
2409
2410
2411
2412
2413
2414
2415
2416
2417
2418
# File 'lib/parse/agent/tools.rb', line 2391

def build_allowlist_refusal(context, fname, root, permitted_fields)
  msg = +"#{context} '#{fname}' (#{root.inspect}) outside agent_fields allowlist"
  if permitted_fields.is_a?(Array) && permitted_fields.any?
    preview = permitted_fields.first(ALLOWLIST_PREVIEW_CAP)
    suffix  = permitted_fields.size > ALLOWLIST_PREVIEW_CAP ?
      " (+#{permitted_fields.size - ALLOWLIST_PREVIEW_CAP} more)" : ""
    msg << ". Allowed: #{preview.join(", ")}#{suffix}"
  end
  kind = :field_denied
  suggested = nil
  if root.is_a?(String) && root.start_with?("_p_")
    kind = :storage_form_field_ref
    bare = root.sub(/\A_p_/, "")
    if !bare.empty? && permitted_fields.is_a?(Array) && permitted_fields.include?(bare)
      suggested = "$#{bare}"
      msg << ". Hint: '#{root}' is the Parse-on-Mongo storage column for the '#{bare}' pointer field — reference '#{bare}' directly (e.g. '#{suggested}')"
    else
      msg << ". Hint: '#{root}' is the Parse-on-Mongo storage column form; reference the bare pointer field name without the '_p_' prefix"
    end
  end
  {
    message:           msg,
    kind:              kind,
    denied_field:      root.is_a?(String) ? root : root.to_s,
    allowed_fields:    permitted_fields.is_a?(Array) ? permitted_fields.first(ALLOWLIST_PREVIEW_CAP).map(&:to_s) : nil,
    suggested_rewrite: suggested,
  }
end

.build_date_group_expression(field_name, interval, timezone) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Build the MongoDB _id expression for a date-grouped aggregation. Mirrors GroupByDate#build_date_group_expression in query.rb, including the timezone-aware operator form Mongo expects.



4118
4119
4120
4121
4122
4123
4124
4125
4126
4127
4128
4129
4130
4131
4132
4133
4134
4135
4136
4137
4138
4139
4140
4141
4142
4143
4144
4145
4146
4147
# File 'lib/parse/agent/tools.rb', line 4118

def build_date_group_expression(field_name, interval, timezone)
  op = ->(name) {
    if timezone
      { name => { "date" => "$#{field_name}", "timezone" => timezone } }
    else
      { name => "$#{field_name}" }
    end
  }
  case interval
  when :year
    op.call("$year")
  when :month
    { "year" => op.call("$year"), "month" => op.call("$month") }
  when :week
    { "year" => op.call("$year"), "week" => op.call("$week") }
  when :day
    { "year" => op.call("$year"), "month" => op.call("$month"), "day" => op.call("$dayOfMonth") }
  when :hour
    { "year" => op.call("$year"), "month" => op.call("$month"),
      "day"  => op.call("$dayOfMonth"), "hour" => op.call("$hour") }
  when :minute
    { "year"   => op.call("$year"),  "month"  => op.call("$month"),
      "day"    => op.call("$dayOfMonth"), "hour" => op.call("$hour"),
      "minute" => op.call("$minute") }
  when :second
    { "year"   => op.call("$year"),  "month"  => op.call("$month"),
      "day"    => op.call("$dayOfMonth"), "hour" => op.call("$hour"),
      "minute" => op.call("$minute"), "second" => op.call("$second") }
  end
end

.build_group_pipeline(where:, group_field:, flatten_arrays:, accumulator_op:, value_field:, operation:, group_expression: nil, agent: nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Build the wire-format aggregation pipeline. group_expression is used when a custom _id expression is required (group_by_date); otherwise the _id is "$". When accumulator_op is nil we emit a bare $group with only _id (distinct).



4086
4087
4088
4089
4090
4091
4092
4093
4094
4095
4096
4097
4098
4099
4100
4101
4102
4103
4104
4105
4106
4107
4108
4109
4110
4111
# File 'lib/parse/agent/tools.rb', line 4086

def build_group_pipeline(where:, group_field:, flatten_arrays:,
                         accumulator_op:, value_field:, operation:,
                         group_expression: nil, agent: nil)
  pipeline = []
  if where.is_a?(Hash) && !where.empty?
    pipeline << { "$match" => ConstraintTranslator.translate(where, agent) }
  end
  if flatten_arrays && group_field
    pipeline << { "$unwind" => "$#{group_field}" }
  end

  group_id = group_expression || "$#{group_field}"
  group_stage = { "_id" => group_id }
  if accumulator_op
    # Substitute the value field into the placeholder accumulator.
    # operation == "count" comes through with $sum:1 already.
    if operation == "count"
      group_stage["value"] = accumulator_op
    else
      op_key = accumulator_op.keys.first
      group_stage["value"] = { op_key => "$#{value_field}" }
    end
  end
  pipeline << { "$group" => group_stage }
  pipeline
end

.category_for(name) ⇒ String?

Resolve the category for a tool. Registered tools always carry a category (defaulting to "custom" via register); built-in tools have a category baked into TOOL_DEFINITIONS. Returns nil only when the named tool is unknown.

Parameters:

Returns:



1263
1264
1265
1266
1267
1268
# File 'lib/parse/agent/tools.rb', line 1263

def category_for(name)
  sym = name.to_sym
  entry = REGISTRY_MUTEX.synchronize { @registry[sym] }
  return entry[:definition][:category] if entry
  TOOL_DEFINITIONS.dig(sym, :category)
end

.check_expression_for_restricted_fields!(expr, permitted_fields) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Recursively scan a Mongo-style expression for $field references outside the allowlist. Reused by $replaceRoot/$group enforcement.



2354
2355
2356
2357
2358
2359
2360
2361
2362
2363
2364
2365
2366
2367
2368
2369
# File 'lib/parse/agent/tools.rb', line 2354

def check_expression_for_restricted_fields!(expr, permitted_fields)
  case expr
  when String
    if expr.start_with?("$") && !expr.start_with?("$$")
      ref = expr.sub(/\A\$/, "").split(".").first
      return if ref.empty? || ref.start_with?("$")
      unless permitted_fields.include?(ref)
        raise_allowlist_refusal!("field reference", expr, ref, permitted_fields)
      end
    end
  when Array
    expr.each { |e| check_expression_for_restricted_fields!(e, permitted_fields) }
  when Hash
    expr.each_value { |v| check_expression_for_restricted_fields!(v, permitted_fields) }
  end
end

.check_match_keys_for_restricted_fields!(node, permitted_fields) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Walk a $match hash refusing field keys outside the allowlist. Logical operators ($and/$or/$nor/$not) recurse. $expr expressions use the field-reference walker (which understands +$fieldName+ strings) and are also blocked at the +PipelineSecurity+ layer for forensic operators inside +$expr+.



2326
2327
2328
2329
2330
2331
2332
2333
2334
2335
2336
2337
2338
2339
2340
2341
2342
2343
2344
2345
2346
2347
2348
# File 'lib/parse/agent/tools.rb', line 2326

def check_match_keys_for_restricted_fields!(node, permitted_fields)
  return unless node.is_a?(Hash)
  node.each do |k, v|
    ks = k.to_s
    if %w[$and $or $nor].include?(ks)
      Array(v).each { |sub| check_match_keys_for_restricted_fields!(sub, permitted_fields) }
    elsif ks == "$not"
      check_match_keys_for_restricted_fields!(v, permitted_fields)
    elsif ks == "$expr"
      check_expression_for_restricted_fields!(v, permitted_fields)
    elsif ks.start_with?("$")
      # Top-level $ operator without a field — pass; depth limit
      # and denylist already cover dangerous shapes elsewhere.
      next
    else
      root = ks.split(".").first
      next if root.nil? || root.empty?
      unless permitted_fields.include?(root)
        raise_allowlist_refusal!("match field", ks, root, permitted_fields)
      end
    end
  end
end

.clamp_group_limit(limit, default:, max:) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4014
4015
4016
4017
4018
4019
4020
4021
# File 'lib/parse/agent/tools.rb', line 4014

def clamp_group_limit(limit, default:, max:)
  return default if limit.nil?
  n = Integer(limit) rescue (raise Parse::Agent::ValidationError, "limit must be an integer (got #{limit.inspect})")
  if n < 1 || n > max
    raise Parse::Agent::ValidationError, "limit must be between 1 and #{max}"
  end
  n
end

.client_safe?(name) ⇒ Boolean

Whether a tool is safe to dispatch from a client-mode agent (one whose client has no master_key). Returns true for:

  • Built-in read tools listed in CLIENT_SAFE_READ_TOOLS — these route through session-token REST endpoints that Parse Server natively authorizes (ACL + CLP + protectedFields).
  • Built-in mutation tools listed in CLIENT_SAFE_MUTATION_TOOLS — same REST-native authorization. The caller is responsible for additionally checking the per-agent +allow_mutations+ gate; this predicate reports REST-safety only.
  • Custom tools registered with +client_safe: true+ — the registering author has declared the handler safe for client-mode dispatch.

Anything else (call_method, aggregate, atlas_*, schema tools, custom tools registered without the flag) returns false.

Parameters:

Returns:

  • (Boolean)


1289
1290
1291
1292
1293
1294
1295
# File 'lib/parse/agent/tools.rb', line 1289

def client_safe?(name)
  sym = name.to_sym
  return true if Parse::Agent::CLIENT_SAFE_READ_TOOLS.include?(sym)
  return true if Parse::Agent::CLIENT_SAFE_MUTATION_TOOLS.include?(sym)
  entry = REGISTRY_MUTEX.synchronize { @registry[sym] }
  entry ? entry[:client_safe] == true : false
end

.compact_pointers!(data) ⇒ Hash

Compact Parse-on-Mongo storage-form pointer columns.

Raw aggregate results expose pointer fields as _p_<field>: "<ClassName>$<objectId>". For result sets with many rows that share a few distinct pointer columns, the repeated _p_ column-name prefix and the per-value <ClassName>$ prefix together account for substantial wasted bytes — e.g., 130 rows of _p_author: "_User$xxxxx" repeats the _User$ prefix 130 times for a value the schema already encodes.

This pass:

  1. Walks the result set once to identify every _p_<field> column and the set of class names observed in its values.
  2. For any column whose class name is INVARIANT across all rows (the common case, since pointer columns are typed at schema level), strips the _p_ prefix from the column name and the <ClassName>$ prefix from each value.
  3. Returns the { <field> => <class_name>, ... } map so the caller can attach it to the response envelope as pointer_classes:. The map preserves the type information that was stripped from individual values.

Columns with mixed class names (anomaly — possible via custom $project or bad data) are LEFT UNCOMPRESSED to avoid losing disambiguating data. Columns that already have a non-prefixed collision (e.g., row contains BOTH _p_author AND author) are also left uncompressed — renaming would shadow the existing value.

MUTATES data in place. The hidden-class string scrub in walk_and_redact runs BEFORE this pass, so any _p_* columns referencing a hidden class have already been replaced with a redacted placeholder hash and are skipped by this walker.

Parameters:

Returns:

  • (Hash)

    { "<field>" => "<className>" } for every column that was compressed. Empty hash when nothing qualified.



2746
2747
2748
2749
2750
2751
2752
2753
2754
2755
2756
2757
2758
2759
2760
2761
2762
2763
2764
2765
2766
# File 'lib/parse/agent/tools.rb', line 2746

def compact_pointers!(data)
  # Pass 1: scan. For every key that starts with `_p_`, observe
  # the className portion of each string value across the entire
  # result set. Also note any naming collisions per nesting scope.
  observations = {}  # { "field" => Set[className] | :mixed | :collision }
  scan_for_pointer_columns(data, observations)

  # Decide which columns are safe to compress: exactly one
  # observed className, no collision detected.
  compressible = observations.each_with_object({}) do |(field, observed), acc|
    next if observed == :collision
    next unless observed.is_a?(Hash) && observed.size == 1
    acc[field] = observed.keys.first
  end
  return {} if compressible.empty?

  # Pass 2: rewrite. Walk again and replace `_p_<field>` keys with
  # `<field>` and strip the `<class>$` prefix from each value.
  rewrite_pointer_columns!(data, compressible)
  compressible
end

.compose_atlas_filter(caller_filter, class_name, agent:, apply_canonical_filter: true) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Compose an Atlas Search filter hash that combines the caller's filter: (if any) with the per-agent and (optionally) canonical filters declared for the class. Returns nil when no filters apply (so the underlying invoke_atlas_search drops the filter: opt entirely instead of emitting an empty $match).

TRACK-AGENT-6 / TRACK-AGENT-7 — the per-agent filter (operator scoping) is UNCONDITIONAL; the canonical filter (per-class) is gated on apply_canonical_filter:. Caller- supplied filter: is preserved; all three are AND-merged so neither half can shadow the other.



5719
5720
5721
5722
5723
5724
5725
5726
5727
5728
5729
5730
5731
5732
5733
5734
5735
# File 'lib/parse/agent/tools.rb', line 5719

def compose_atlas_filter(caller_filter, class_name, agent:, apply_canonical_filter: true)
  per_agent = agent && agent.respond_to?(:filter_for) ? agent.filter_for(class_name) : nil
  canonical = apply_canonical_filter ?
                Parse::Agent::MetadataRegistry.canonical_filter(class_name) :
                nil

  parts = []
  parts << per_agent.dup if per_agent && !per_agent.empty?
  parts << canonical.dup if canonical && !canonical.empty?
  parts << caller_filter if caller_filter.is_a?(Hash) && !caller_filter.empty?

  case parts.size
  when 0 then nil
  when 1 then parts.first
  else        { "$and" => parts }
  end
end

.compose_filter_into_pipeline(pipeline, extra) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Shared composition primitive for the per-agent/canonical pipeline-prepend helpers.



1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
# File 'lib/parse/agent/tools.rb', line 1802

def compose_filter_into_pipeline(pipeline, extra)
  return pipeline if extra.nil? || (extra.respond_to?(:empty?) && extra.empty?)

  match_stage = { "$match" => extra.dup }
  if pipeline.first.is_a?(Hash) && pipeline.first.key?("$match")
    [pipeline.first, match_stage] + pipeline.drop(1)
  else
    [match_stage] + pipeline
  end
end

.compose_filter_into_where(where, extra, class_name, helper_name:) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Shared composition primitive for the per-agent/canonical where-merge helpers. Extracted so the wire-format shape (nil/empty → return where unchanged, single extra unwrapped, multiple extras under $and) stays identical regardless of which helper assembled the filters.



1756
1757
1758
1759
1760
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
# File 'lib/parse/agent/tools.rb', line 1756

def compose_filter_into_where(where, extra, class_name, helper_name:)
  return where if extra.nil? || (extra.respond_to?(:empty?) && extra.empty?)

  if where.is_a?(Hash) && !where.empty?
    { "$and" => [extra.dup, where] }
  elsif where.nil? || (where.is_a?(Hash) && where.empty?)
    extra.dup
  else
    # Silently dropping the filter when `where` is not a Hash
    # would allow the "valid state" / per-agent restriction to
    # be bypassed by passing a non-Hash value. Raise so callers
    # discover the contract violation immediately rather than
    # receiving silently unfiltered results.
    raise ArgumentError,
          "#{helper_name}: where must be a Hash or nil/empty " \
          "when a filter is declared for #{class_name}, " \
          "got #{where.class}"
  end
end

.compute_source_allowlist_for(class_name) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Materialize the source-class allowlist into the form the walker expects (source fields ∪ ALWAYS_KEEP_FIELDS), or nil if the class has no agent_fields declared (in which case the walker short-circuits and applies no field-level enforcement).



1886
1887
1888
1889
1890
# File 'lib/parse/agent/tools.rb', line 1886

def compute_source_allowlist_for(class_name)
  allowlist = Parse::Agent::MetadataRegistry.field_allowlist(class_name)
  return nil unless allowlist && allowlist.any?
  allowlist.map(&:to_s) | Parse::Agent::MetadataRegistry::ALWAYS_KEEP_FIELDS
end

.dry_run_envelope(class_name:, pipeline:, params:) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Build the dry-run response envelope. Surfaces the exact pipeline the handler would have executed plus the resolved parameters so a caller can hand the pipeline to aggregate (or mutate and re-issue) without re-deriving the field-mangling rules.



4305
4306
4307
4308
4309
4310
4311
4312
4313
4314
4315
4316
# File 'lib/parse/agent/tools.rb', line 4305

def dry_run_envelope(class_name:, pipeline:, params:)
  cleaned = params.reject { |_, v| v.nil? || (v.respond_to?(:empty?) && v.empty?) }
  {
    dry_run: true,
    class_name: class_name,
    pipeline: pipeline,
    parameters: cleaned,
    hint: "dry_run mode — the pipeline above was constructed but NOT executed. " \
          "Re-issue this call with dry_run: false to run it, or pass the pipeline " \
          "to the aggregate tool (modified as needed) for full pipeline control.",
  }
end

.effective_permitted_set(source_permitted, available, source_addressable) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Effective permitted set at the entry of a single stage. nil means "no enforcement" (source class has no agent_fields). When the source allowlist no longer applies (a schema-replacing stage flipped source_addressable false), the effective set is just the synthetic-field accumulator.



1932
1933
1934
1935
1936
1937
1938
# File 'lib/parse/agent/tools.rb', line 1932

def effective_permitted_set(source_permitted, available, source_addressable)
  return nil if source_permitted.nil?
  set = []
  set |= source_permitted if source_addressable
  set |= available
  set
end

.enforce_pipeline_access_policy!(class_name, pipeline, agent: nil) ⇒ Object

Walk an aggregation pipeline and enforce two boundaries that assert_class_accessible! (top-level class only) and PipelineValidator (operator denylist) cannot enforce:

  1. Refuse cross-class reads into a hidden class via the from: or coll: field of $lookup, $graphLookup, $unionWith. Sub-pipelines inside those stages are walked recursively.
  2. When the top-level class has an agent_fields allowlist, refuse projection-shape stages ($project, $addFields, $set, $unset, $replaceRoot, $replaceWith) that name or reference fields outside the allowlist. $facet sub-pipelines are walked carrying the same allowlist.

Forward-pass field tracking: each stage's allowlist check uses the effective set (source ∪ available_so_far), where available_so_far accumulates fields introduced by upstream stages ($group._id and accumulator keys, $addFields outputs, $lookup.as, etc.). Schema-replacing stages ($project, $group, $bucket, $replaceRoot, $replaceWith, $facet) drop the source set so downstream stages can ONLY reference newly-introduced fields. This unblocks the canonical "group → filter → sort → limit" pattern that previously failed because synthetic accumulator outputs were treated as source-class references against the allowlist.

Raises Parse::Agent::AccessDenied on any breach. The Agent#execute rescue chain translates that to error_code: :access_denied.



1868
1869
1870
1871
1872
1873
1874
1875
1876
1877
1878
# File 'lib/parse/agent/tools.rb', line 1868

def enforce_pipeline_access_policy!(class_name, pipeline, agent: nil)
  return unless pipeline.is_a?(Array)
  source_permitted = compute_source_allowlist_for(class_name)
  walk_pipeline_with_state!(
    pipeline,
    source_permitted: source_permitted,
    available: [],
    source_addressable: true,
    agent: agent,
  )
end

.ensure_aggregate_terminal_limit(pipeline) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns [pipeline_to_run, auto_limited?]. When the caller's last stage is $limit or $count, the pipeline is returned unchanged. The check walks forward to find the last stage, ignoring $sort/$project/ $addFields/$unset trailers because those don't bound cardinality.



3653
3654
3655
3656
3657
3658
3659
3660
# File 'lib/parse/agent/tools.rb', line 3653

def ensure_aggregate_terminal_limit(pipeline)
  return [pipeline, false] unless pipeline.is_a?(Array) && pipeline.any?
  last = pipeline.last
  return [pipeline, false] unless last.is_a?(Hash)
  op = last.keys.first.to_s
  return [pipeline, false] if op == "$limit" || op == "$count"
  [pipeline + [{ "$limit" => AGGREGATE_DEFAULT_LIMIT }], true]
end

.execute_count_via_direct(agent, class_name, where: nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Count variant of #execute_find_via_direct. Routes through Parse::MongoDB.aggregate with a terminal $count stage. Returns an Integer (the count), mirroring what Parse Server REST find_objects with count: 1 would surface as response.count.



5651
5652
5653
5654
5655
5656
5657
5658
5659
5660
5661
5662
5663
# File 'lib/parse/agent/tools.rb', line 5651

def execute_count_via_direct(agent, class_name, where: nil)
  pipeline = []
  if where.is_a?(Hash) && !where.empty?
    translated_match = Parse::Query.new(class_name).send(
      :convert_stage_for_direct_mongodb, { "$match" => where },
    )
    pipeline << translated_match
  end
  pipeline << { "$count" => "count" }
  raw_rows = Parse::MongoDB.aggregate(class_name, pipeline, **agent.acl_scope_kwargs)
  return 0 if raw_rows.empty?
  raw_rows.first["count"] || 0
end

.execute_find_via_direct(agent, class_name, where: nil, limit: nil, skip: 0, order: nil, keys: nil, include: nil) ⇒ Array<Hash>

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Build the auth kwargs hash forwarded to MongoDB.aggregate from the #aggregate tool's mongo_direct branch. Mirrors #atlas_auth_options! but with a friendlier default for the aggregate path: when the agent has no session_token, fall back to master: true rather than refusing the call. The agent's existing class/field/tenant/canonical-filter gates already form the security boundary for the aggregate tool, so a session-less agent should run with master-equivalent semantics (preserving pre-4.4.0 behavior). Atlas Search refuses session-less calls because it was a new attack surface; aggregate has been around longer and its enforcement model is class-based, not row-based.

Session-tokened agents get a real upgrade here: pre-4.4.0, Parse::MongoDB.aggregate ignored session tokens and ran with admin Mongo credentials, so row-ACL was unenforced on the mongo-direct path. With this helper, a session-tokened agent's mongo-direct aggregate gets the same ACLScope _rperm enforcement the REST route gets via Parse Server.

LLM-supplied kwargs are NOT honored here — the calling tool signature swallows unknown kwargs into **_kwargs and the call site at line 2562 does not splat them. The posture comes entirely from agent instance state. Auto-route a REST find_objects-style call through Parse::MongoDB.aggregate when the agent's scope is acl_user / acl_role (no REST "act as role" surface exists). For session_token / master-key agents the caller keeps the existing REST path — Parse Server's REST find_objects honors session_token natively. Returns an Array of Parse-format Hashes shaped like response.results, so callers can plug it in as a drop-in replacement.

The pipeline composition reuses Parse::Query's #build_direct_mongodb_pipeline for sort / include $lookup / keys $project / skip / limit, then prepends the caller's raw where: as a translated $match stage (field references and pointer values rewritten via #convert_stage_for_direct_mongodb). ACLScope's _rperm injection runs inside Parse::MongoDB.aggregate.

Parameters:

  • agent (Parse::Agent)
  • class_name (String)
  • where (Hash, nil) (defaults to: nil)

    wire-format constraints (already passed through ConstraintTranslator#translate — i.e. MongoDB-shaped keys like $gt / $or, but Parse-shaped pointer values).

  • limit (Integer, nil) (defaults to: nil)
  • skip (Integer, nil) (defaults to: 0)
  • order (String, nil) (defaults to: nil)

    Parse-style "field" or "-field" form.

  • keys (Array<String>, nil) (defaults to: nil)

    projection allowlist.

  • include (Array<String>, nil) (defaults to: nil)

    pointer paths to $lookup.

Returns:



5607
5608
5609
5610
5611
5612
5613
5614
5615
5616
5617
5618
5619
5620
5621
5622
5623
5624
5625
5626
5627
5628
5629
5630
5631
5632
5633
5634
5635
5636
5637
5638
5639
5640
5641
5642
5643
# File 'lib/parse/agent/tools.rb', line 5607

def execute_find_via_direct(agent, class_name, where: nil, limit: nil,
                            skip: 0, order: nil, keys: nil, include: nil)
  q = Parse::Query.new(class_name)
  q.limit(limit) if limit && limit > 0
  q.skip(skip) if skip && skip > 0
  # `order:` arrives as a single Parse-style string ("createdAt" or
  # "-createdAt") from query_class; Parse::Query#order accepts it
  # via the Order coercion path.
  if order
    if order.is_a?(String)
      # split comma-separated forms — query_class accepts a single
      # string today but Parse Server's order parameter accepts
      # multiple comma-separated keys.
      order.split(",").each { |o| q.order(o.strip) unless o.strip.empty? }
    else
      q.order(order)
    end
  end
  q.keys(*keys) if keys && keys.any?
  q.includes(*include) if include && include.any?

  pipeline = q.send(:build_direct_mongodb_pipeline)

  # Prepend the caller's where: as a $match stage. The translator
  # rewrites pointer-shaped values (`{__type: Pointer, ...}` →
  # `"ClassName$objectId"`), pointer field references (`author` →
  # `_p_author`), and date / file shapes for direct-MongoDB
  # storage form. Without this, the wire-format where would miss
  # rows that the REST find_objects path matches.
  if where.is_a?(Hash) && !where.empty?
    translated_match = q.send(:convert_stage_for_direct_mongodb, { "$match" => where })
    pipeline = [translated_match] + pipeline
  end

  raw_rows = Parse::MongoDB.aggregate(class_name, pipeline, **agent.acl_scope_kwargs)
  raw_rows.map { |raw| Parse::MongoDB.convert_aggregation_document(raw) }
end

.export_via_aggregate(agent, class_name:, pipeline:, scope: nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4547
4548
4549
4550
4551
4552
4553
4554
4555
4556
4557
4558
4559
4560
4561
4562
4563
4564
4565
4566
4567
4568
4569
4570
4571
4572
4573
4574
4575
4576
4577
4578
4579
4580
4581
4582
4583
4584
4585
4586
# File 'lib/parse/agent/tools.rb', line 4547

def export_via_aggregate(agent, class_name:, pipeline:, scope: nil)
  PipelineValidator.validate!(pipeline)
  enforce_pipeline_access_policy!(class_name, pipeline, agent: agent)
  # Prepend tenant scope $match before per-agent + canonical filter and auto-limit.
  scoped_pipeline = apply_tenant_scope_to_pipeline(pipeline, scope)
  # TRACK-AGENT-7 split: per-agent filter is UNCONDITIONAL.
  scoped_pipeline = apply_per_agent_filter_to_pipeline(scoped_pipeline, class_name, agent: agent)
  # H3: apply canonical filter so pipeline-mode exports exclude the same
  # rows that query_class (and the conversational aggregate tool) exclude.
  scoped_pipeline = apply_canonical_filter_to_pipeline(scoped_pipeline, class_name, agent: agent)
  # Same auto-limit injection as the conversational aggregate tool —
  # an unbounded pipeline against a high-cardinality $group would
  # blow past export_data's row_cap and still hand the server-side
  # cost to Parse. The row_cap clips the result for display; this
  # clips the underlying query too.
  effective_pipeline, _auto_limited = ensure_aggregate_terminal_limit(scoped_pipeline)

  # Route to mongo-direct under acl_user/acl_role scope; otherwise
  # the existing REST path handles session_token / master-key.
  use_direct = agent.respond_to?(:acl_scope_requires_direct?) &&
               agent.acl_scope_requires_direct? &&
               defined?(Parse::MongoDB) && Parse::MongoDB.enabled?

  rows = nil
  with_timeout(:export_data) do
    if use_direct
      translated = Parse::Query.new(class_name).send(
        :translate_pipeline_for_direct_mongodb, effective_pipeline,
      )
      raw_rows = Parse::MongoDB.aggregate(class_name, translated, **agent.acl_scope_kwargs)
      rows = raw_rows.map { |raw| Parse::MongoDB.convert_aggregation_document(raw) }
    else
      response = agent.client.aggregate_pipeline(class_name, effective_pipeline, **agent.request_opts)
      raise "Export aggregation failed: #{response.error}" unless response.success?
      rows = response.results
    end
  end

  redact_hidden_classes!(rows, agent: agent)
end

.export_via_query(agent, class_name:, where:, keys:, include:, order:, limit:, skip: nil, scope: nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4476
4477
4478
4479
4480
4481
4482
4483
4484
4485
4486
4487
4488
4489
4490
4491
4492
4493
4494
4495
4496
4497
4498
4499
4500
4501
4502
4503
4504
4505
4506
4507
4508
4509
4510
4511
4512
4513
4514
4515
4516
4517
4518
4519
4520
4521
4522
4523
4524
4525
4526
4527
4528
4529
4530
4531
4532
4533
4534
4535
4536
4537
4538
4539
4540
4541
4542
4543
# File 'lib/parse/agent/tools.rb', line 4476

def export_via_query(agent, class_name:, where:, keys:, include:, order:, limit:, skip: nil, scope: nil)
  # Reuse query_class's gates by routing through it directly.
  # query_class returns a ResultFormatter-wrapped hash; we want the raw rows.
  query = {}
  query[:limit] = [limit || Agent::DEFAULT_LIMIT, Agent::MAX_LIMIT].min
  query[:skip]  = skip if skip && skip > 0
  query[:order] = order if order

  # NEW-TOOLS-5: validate keys: against identifier regex before
  # the allowlist intersection. This refuses leading-underscore
  # names (_hashed_password, _session_token, _rperm, _wperm, etc.)
  # at the boundary, so they cannot leak through even on classes
  # WITHOUT an agent_fields allowlist.
  validated_keys = validate_keys!(keys)
  allowlist = MetadataRegistry.field_allowlist(class_name)
  caller_keys = validated_keys&.any? ? validated_keys : nil
  effective_keys =
    if allowlist && allowlist.any?
      permitted = allowlist.map(&:to_s) | MetadataRegistry::ALWAYS_KEEP_FIELDS
      caller_keys ? (caller_keys & permitted) : allowlist.map(&:to_s)
    else
      caller_keys
    end

  include = validate_include!(include)
  assert_include_paths_accessible!(class_name, include, agent: agent)
  # keys-on-include auto-projection. Export rows are the bulkiest
  # surface (full-row CSV / Markdown / JSON dumps), so trimming
  # included pointers to their agent_join_fields here pays back the
  # most in token cost. The truncation map is not surfaced on the
  # export envelope (callers consuming CSV/Markdown don't carry
  # structured metadata) but the per-row payload shape is identical
  # to query_class.
  include_projection = apply_include_projection(class_name, effective_keys, include)
  effective_keys = include_projection[:effective_keys]
  query[:keys] = effective_keys.join(",") if effective_keys&.any?
  query[:include] = include.join(",") if include&.any?

  effective_where = apply_tenant_scope_to_where(where, scope, class_name)
  # TRACK-AGENT-7 split: per-agent filter is UNCONDITIONAL.
  effective_where = apply_per_agent_filter_to_where(effective_where, class_name, agent: agent)
  # H3: apply the class's canonical filter so exported rows are the same
  # "valid state" subset that query_class returns. Exporting soft-deleted
  # rows when query_class hides them would be a silent data-integrity gap.
  effective_where = apply_canonical_filter_to_where(effective_where, class_name, agent: agent)
  translated_where = nil
  if effective_where && !effective_where.empty?
    translated_where = ConstraintTranslator.translate(effective_where, agent)
    query[:where] = translated_where.to_json
  end

  rows = nil
  with_timeout(:export_data) do
    if agent.respond_to?(:acl_scope_requires_direct?) && agent.acl_scope_requires_direct?
      rows = execute_find_via_direct(
        agent, class_name,
        where: translated_where, limit: query[:limit], skip: query[:skip] || 0,
        order: order, keys: effective_keys, include: include,
      )
    else
      response = agent.client.find_objects(class_name, query, **agent.request_opts)
      raise "Export query failed: #{response.error}" unless response.success?
      rows = response.results
    end
  end

  redact_hidden_classes!(rows, agent: agent)
end

.extract_export_value(row, path) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Extract a value via a dotted path. Each segment indexes into a Hash under either its string or symbol form. Returns nil if any segment misses or hits a non-Hash.



4700
4701
4702
4703
4704
4705
4706
4707
4708
4709
# File 'lib/parse/agent/tools.rb', line 4700

def extract_export_value(row, path)
  path.to_s.split(".").reduce(row) do |acc, seg|
    break nil if acc.nil?
    if acc.is_a?(Hash)
      acc[seg] || acc[seg.to_sym]
    else
      nil
    end
  end
end

.extract_pointer_class!(pairs) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Inspect the keys of a [[key, value], ...] pair list. When every non-nil key is a string of the form "$" with the SAME Class, strip the prefix and return [Class, rewritten_pairs]. Otherwise return [nil, pairs] unchanged.



4242
4243
4244
4245
4246
4247
4248
4249
4250
4251
# File 'lib/parse/agent/tools.rb', line 4242

def extract_pointer_class!(pairs)
  classes = pairs.map { |k, _| k.is_a?(String) && k.match(/\A([A-Z_]\w*)\$(\w+)\z/) ? Regexp.last_match(1) : nil }
  return [nil, pairs] if classes.compact.empty?
  return [nil, pairs] if classes.compact.uniq.size != 1
  # Allow some keys to be nil (missing values) — they pass through.
  return [nil, pairs] unless pairs.all? { |k, _| k.nil? || (k.is_a?(String) && k.include?("$")) }
  cls = classes.compact.first
  rewritten = pairs.map { |k, v| [k.is_a?(String) ? k.sub(/\A#{cls}\$/, "") : k, v] }
  [cls, rewritten]
end

.format_date_key(key, interval) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Format a date _id back into an ISO-style string for the LLM. nil / unrecognized values are surfaced as the literal "null".



4153
4154
4155
4156
4157
4158
4159
4160
4161
4162
4163
4164
4165
4166
4167
4168
4169
4170
4171
# File 'lib/parse/agent/tools.rb', line 4153

def format_date_key(key, interval)
  return "null" if key.nil?
  if interval == :year
    return key.to_s
  end
  return "null" unless key.is_a?(Hash)
  y, mo, d = key["year"], key["month"], key["day"]
  h, mi, s = key["hour"], key["minute"], key["second"]
  wk       = key["week"]
  case interval
  when :month  then (y.nil? || mo.nil?) ? "null" : sprintf("%04d-%02d", y, mo)
  when :week   then (y.nil? || wk.nil?) ? "null" : sprintf("%04d-W%02d", y, wk)
  when :day    then (y.nil? || mo.nil? || d.nil?) ? "null" : sprintf("%04d-%02d-%02d", y, mo, d)
  when :hour   then (y.nil? || mo.nil? || d.nil? || h.nil?) ? "null" : sprintf("%04d-%02d-%02d %02d:00", y, mo, d, h)
  when :minute then (y.nil? || mo.nil? || d.nil? || h.nil? || mi.nil?) ? "null" : sprintf("%04d-%02d-%02d %02d:%02d", y, mo, d, h, mi)
  when :second then (y.nil? || mo.nil? || d.nil? || h.nil? || mi.nil? || s.nil?) ? "null" : sprintf("%04d-%02d-%02d %02d:%02d:%02d", y, mo, d, h, mi, s)
  else "null"
  end
end

.format_export_csv(headers, rows) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4726
4727
4728
4729
4730
4731
4732
# File 'lib/parse/agent/tools.rb', line 4726

def format_export_csv(headers, rows)
  require "csv"
  CSV.generate do |csv|
    csv << headers
    rows.each { |r| csv << r }
  end
end

.format_export_markdown(headers, rows) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4736
4737
4738
4739
4740
4741
4742
4743
# File 'lib/parse/agent/tools.rb', line 4736

def format_export_markdown(headers, rows)
  return "" if headers.empty?
  lines = []
  lines << "| #{headers.join(" | ")} |"
  lines << "| #{headers.map { "---" }.join(" | ")} |"
  rows.each { |r| lines << "| #{r.map { |c| c.to_s.gsub(/\r?\n/, " ").gsub(/([\\|])/, '\\\\\1') }.join(" | ")} |" }
  lines.join("\n")
end

.format_export_text_table(headers, rows) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4747
4748
4749
4750
4751
4752
4753
4754
4755
4756
4757
# File 'lib/parse/agent/tools.rb', line 4747

def format_export_text_table(headers, rows)
  return "" if headers.empty?
  widths = headers.each_with_index.map do |h, i|
    [h.to_s.length, *rows.map { |r| r[i].to_s.length }].max
  end
  sep = "+-" + widths.map { |w| "-" * w }.join("-+-") + "-+"
  fmt = ->(cells) {
    "| " + cells.each_with_index.map { |c, i| c.to_s.ljust(widths[i]) }.join(" | ") + " |"
  }
  ([sep, fmt.call(headers), sep] + rows.map(&fmt) + [sep]).join("\n")
end

.format_query_results_as(format, class_name, results) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Format query results as CSV / Markdown / text-table using the same helpers export_data uses. Columns are inferred from the first row (skipping Parse-internal envelope keys). Returns the standard text-export envelope shape.



3051
3052
3053
3054
3055
3056
3057
3058
3059
3060
3061
3062
3063
3064
3065
3066
3067
3068
3069
# File 'lib/parse/agent/tools.rb', line 3051

def format_query_results_as(format, class_name, results)
  col_specs = results.any? ? infer_export_columns_from(results.first) : []
  headers   = col_specs.map { |s| s[:header] }
  rows      = results.map do |obj|
    col_specs.map { |s| stringify_export_value(extract_export_value(obj, s[:path])) }
  end
  output = case format
    when "csv"      then format_export_csv(headers, rows)
    when "markdown" then format_export_markdown(headers, rows)
    when "table"    then format_export_text_table(headers, rows)
    end
  {
    class_name: class_name,
    format:     format,
    headers:    headers,
    row_count:  rows.size,
    output:     output,
  }
end

.infer_export_columns_from(sample_row) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4686
4687
4688
4689
4690
4691
4692
4693
# File 'lib/parse/agent/tools.rb', line 4686

def infer_export_columns_from(sample_row)
  return [{ path: "objectId", header: "objectId" }] unless sample_row.is_a?(Hash)
  # Skip Parse-internal envelope keys; surface real data columns.
  skip = %w[__type className ACL _rperm _wperm _hashed_password _session_token]
  sample_row.keys.reject { |k| skip.include?(k.to_s) }.map do |k|
    { path: k.to_s, header: k.to_s }
  end
end

.invoke(agent, name, **kwargs) ⇒ Object

Dispatch a tool call. Registered tools take precedence over builtins only when both share a name; otherwise each path is exclusive.

Parameters:

  • agent (Parse::Agent)

    the agent instance

  • name (Symbol, String)

    tool name

  • kwargs (Hash)

    keyword arguments forwarded to handler or builtin



1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
# File 'lib/parse/agent/tools.rb', line 1204

def invoke(agent, name, **kwargs)
  sym = name.to_sym
  entry = REGISTRY_MUTEX.synchronize { @registry[sym] }

  if entry
    entry[:handler].call(agent, **kwargs)
  else
    Tools.send(sym, agent, **kwargs)
  end
end

.keys_excluding_operators(hash) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Top-level keys of a stage operand, dropping operator-prefixed entries (which name aggregation expressions, not output fields).



2042
2043
2044
# File 'lib/parse/agent/tools.rb', line 2042

def keys_excluding_operators(hash)
  hash.keys.map(&:to_s).reject { |k| k.empty? || k.start_with?("$") }
end

.normalize_export_columns(columns, sample_row) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Resolve a columns: parameter into an ordered list of { path: "dotted.field.path", header: "Display Name" } entries. When columns is nil, infer from the first row.



4611
4612
4613
4614
4615
4616
4617
4618
4619
4620
4621
4622
4623
4624
4625
4626
4627
4628
4629
4630
4631
4632
4633
4634
4635
4636
4637
4638
4639
# File 'lib/parse/agent/tools.rb', line 4611

def normalize_export_columns(columns, sample_row)
  return infer_export_columns_from(sample_row) if columns.nil? || (columns.respond_to?(:empty?) && columns.empty?)
  unless columns.is_a?(Array)
    raise Parse::Agent::ValidationError, "columns must be an Array (got #{columns.class})"
  end

  columns.map do |spec|
    case spec
    when String
      validate_export_column_path!(spec)
      { path: spec, header: spec }
    when Hash
      unless spec.size == 1
        raise Parse::Agent::ValidationError,
              "column hash must have exactly one field => header entry (got #{spec.inspect})"
      end
      path, header = spec.first
      path_str = path.to_s
      validate_export_column_path!(path_str)
      { path: path_str, header: header.to_s }
    when Symbol
      validate_export_column_path!(spec.to_s)
      { path: spec.to_s, header: spec.to_s }
    else
      raise Parse::Agent::ValidationError,
            "column spec must be String, Symbol, or single-entry Hash (got #{spec.class})"
    end
  end
end

.normalize_group_key(key) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Normalize a group key for the wire envelope: nil → "null", everything else is left as-is (Parse Server returns strings, numbers, booleans, and dates verbatim).



4362
4363
4364
# File 'lib/parse/agent/tools.rb', line 4362

def normalize_group_key(key)
  key.nil? ? "null" : key
end

.normalize_group_sort(sort) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4002
4003
4004
4005
4006
4007
4008
4009
4010
# File 'lib/parse/agent/tools.rb', line 4002

def normalize_group_sort(sort)
  return nil if sort.nil? || sort.to_s.empty?
  allowed = %w[value_desc value_asc key_desc key_asc]
  unless allowed.include?(sort.to_s)
    raise Parse::Agent::ValidationError,
          "sort #{sort.inspect} is invalid. Must be one of #{allowed.join(', ')}."
  end
  sort.to_s
end

.notify_subscribersObject

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Invoke every registered subscriber outside the registry mutex so callback work cannot deadlock against a concurrent register or reset_registry! call. Snapshot under the mutex; iterate over the snapshot without it.



1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
# File 'lib/parse/agent/tools.rb', line 1327

def notify_subscribers
  snapshot = REGISTRY_MUTEX.synchronize { @subscribers.dup }
  snapshot.each do |callback|
    begin
      callback.call
    rescue StandardError => e
      warn "[Parse::Agent::Tools] subscriber raised: #{e.class}: #{e.message}"
    end
  end
end

.output_schema_for(name) ⇒ Hash?

Resolve the MCP outputSchema for a tool, if any. Checks registered tools first (override path), then falls through to the built-in TOOL_DEFINITIONS table. The dispatcher uses this to decide whether to mirror the result data as structuredContent in the tools/call response envelope.

Parameters:

Returns:

  • (Hash, nil)

    JSON Schema Hash, or nil if not declared.



1249
1250
1251
1252
1253
1254
# File 'lib/parse/agent/tools.rb', line 1249

def output_schema_for(name)
  sym = name.to_sym
  entry = REGISTRY_MUTEX.synchronize { @registry[sym] }
  return entry[:output_schema] if entry && entry[:output_schema]
  TOOL_DEFINITIONS.dig(sym, :output_schema)
end

.permission_for(name) ⇒ Symbol

Resolve the permission level for a tool (builtin or registered).

Parameters:

Returns:

  • (Symbol)

    :readonly, :write, :admin, or :unknown



1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
# File 'lib/parse/agent/tools.rb', line 1219

def permission_for(name)
  sym = name.to_sym
  entry = REGISTRY_MUTEX.synchronize { @registry[sym] }
  return entry[:permission] if entry

  Parse::Agent::PERMISSION_LEVELS.each do |level, tools|
    return level if tools.include?(sym)
  end
  :unknown
end

.project_introduced_roots(value) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Top-level field names introduced by a non-exclusion $project, root-normalized. Skips exclusion entries (x: 0) so an {name: 1, _id: 0} spec doesn't accidentally register _id as the only available downstream field.



2084
2085
2086
2087
# File 'lib/parse/agent/tools.rb', line 2084

def project_introduced_roots(value)
  included = value.reject { |_, v| projection_is_exclusion?(v) }
  root_keys_excluding_operators(included)
end

.project_is_exclusion_only?(value) ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

True when every value in the $project hash is an exclusion (0 or false). Such a projection KEEPS every non-named source field, so the forward-pass walker must not treat it as schema-replacing.

Returns:

  • (Boolean)


2074
2075
2076
# File 'lib/parse/agent/tools.rb', line 2074

def project_is_exclusion_only?(value)
  value.any? && value.values.all? { |v| projection_is_exclusion?(v) }
end

.project_object_to_allowlist(class_name, object_hash) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Project a formatted-object Hash through the class's agent_fields allowlist when one is declared. The allowlist union with ALWAYS_KEEP_FIELDS keeps the standard envelope (objectId, createdAt, updatedAt) on every projection. When no allowlist is declared, return the input untouched.



5349
5350
5351
5352
5353
5354
5355
5356
5357
5358
5359
5360
5361
# File 'lib/parse/agent/tools.rb', line 5349

def project_object_to_allowlist(class_name, object_hash)
  return object_hash unless object_hash.is_a?(Hash)
  allowlist = Parse::Agent::MetadataRegistry.field_allowlist(class_name)
  return object_hash unless allowlist && allowlist.any?
  # Preserve the className tag and any __type metadata so downstream
  # redactors / formatters can still walk the structure.
  preserved = %w[className __type __redacted]
  allowed = allowlist.map(&:to_s) | preserved
  object_hash.each_with_object({}) do |(k, v), acc|
    ks = k.to_s
    acc[k] = v if allowed.include?(ks)
  end
end

.projection_is_exclusion?(expr) ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns:

  • (Boolean)


2064
2065
2066
# File 'lib/parse/agent/tools.rb', line 2064

def projection_is_exclusion?(expr)
  expr == 0 || expr == false
end

.projection_is_inclusion?(expr) ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns:

  • (Boolean)


2058
2059
2060
# File 'lib/parse/agent/tools.rb', line 2058

def projection_is_inclusion?(expr)
  expr == 1 || expr == true
end

.raise_allowlist_refusal!(context, fname, root, permitted_fields) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Helper: raise AccessDenied with the structured payload from build_allowlist_refusal. Used by every pipeline-walker refusal site so they all emit the same shape.



2425
2426
2427
2428
2429
2430
2431
2432
2433
2434
# File 'lib/parse/agent/tools.rb', line 2425

def raise_allowlist_refusal!(context, fname, root, permitted_fields)
  info = build_allowlist_refusal(context, fname, root, permitted_fields)
  raise Parse::Agent::AccessDenied.new(
    nil, info[:message],
    kind:              info[:kind],
    denied_field:      info[:denied_field],
    allowed_fields:    info[:allowed_fields],
    suggested_rewrite: info[:suggested_rewrite],
  )
end

.redact_hidden_classes!(data, agent: nil) ⇒ Object

Post-fetch defense-in-depth: walk the result data and replace any nested object whose className matches a hidden class with a redacted placeholder. Catches paths that bypass assert_include_paths_accessible! (free-form include: names, server-side $lookup output, anything we can't resolve via belongs_to reflection at request time).



2617
2618
2619
2620
2621
2622
2623
2624
2625
2626
2627
2628
2629
2630
2631
2632
2633
2634
2635
# File 'lib/parse/agent/tools.rb', line 2617

def redact_hidden_classes!(data, agent: nil)
  hidden = Parse::Agent::MetadataRegistry.hidden_class_names
  # Always walk for internal-field stripping even when the hidden-class
  # set is empty — `walk_and_redact` does double duty as the per-process
  # floor that drops `INTERNAL_FIELDS_DENYLIST` keys (sessionToken,
  # _hashed_password, etc.) from every nested document in the response.
  # The denylist must apply regardless of class-visibility state: a
  # deliberate `agent_unhidden` on `_Session` exposes the class to the
  # agent surface but the bearer token still gets stripped here.
  #
  # When an `agent:` is propagated, the walker ALSO redacts nested
  # objects whose className is outside the agent's per-agent allowlist.
  # This is the defense-in-depth complement to walk_pointer_path! and
  # walk_pipeline_stage!: those gates refuse the join up-front, but a
  # server-side $lookup we couldn't statically resolve (free-form
  # `from:` value, raw pipeline path) can still produce off-allowlist
  # rows. The walker scrubs them post-fetch.
  walk_and_redact(data, hidden, agent: agent)
end

.redact_hidden_pointer_groups!(pointer_class, pairs, agent: nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

H2: After extract_pointer_class! resolves [cls, pairs], check whether +cls+ is an agent_hidden class. If it is, replace every key (objectId) with nil so the hidden class's row identifiers are not surfaced, and return nil as the pointer_class so the class name is also suppressed from the envelope.

Returns [pointer_class, pairs] — unchanged when not hidden, redacted when the class is hidden.



4263
4264
4265
4266
4267
4268
4269
4270
4271
# File 'lib/parse/agent/tools.rb', line 4263

def redact_hidden_pointer_groups!(pointer_class, pairs, agent: nil)
  return [pointer_class, pairs] if pointer_class.nil?
  return [pointer_class, pairs] unless Parse::Agent::MetadataRegistry.hidden?(pointer_class) ||
                                       (agent && !agent.class_filter_permits?(pointer_class))

  # Redact: suppress class name and zero out all objectId keys.
  redacted_pairs = pairs.map { |_k, v| [nil, v] }
  [nil, redacted_pairs]
end

.register(name:, description:, parameters:, permission:, handler:, timeout: DEFAULT_TIMEOUT, output_schema: nil, category: "custom", client_safe: false) ⇒ Object

Note:

SECURITY: Registered tool handlers run as trusted code inside the gem's process. Specifically, handlers:

  • Receive the bare Parse::Agent (can read session_token, can call internal methods, can mutate class-level metadata like agent_allow_collscan).
  • Bypass the COLLSCAN preflight check enforced by built-in tools when they query Parse directly (via .results_direct, Parse::MongoDB, or Parse::Object#query). Implement your own indexing discipline.
  • Bypass the agent_fields allowlist enforced by built-in tools when they return raw Parse::Object instances. Project fields manually in the handler.
  • Receive a Timeout.timeout budget derived from TOOL_TIMEOUTS (or the custom :timeout kwarg), but note that Parse Server's REST surface does not accept maxTimeMS — the only timeout is the Ruby-level one.

Treat the handler list as part of your application's trust boundary: register at boot from code you control; never accept registrations from configuration files at runtime.

Register a custom tool. Thread-safe. Idempotent on name (replaces).

Parameters:

  • name (Symbol)

    unique tool name (required)

  • description (String)

    human-readable description (required)

  • parameters (Hash)

    JSON Schema object definition (required)

  • permission (Symbol)

    :readonly, :write, or :admin (required)

  • timeout (Integer) (defaults to: DEFAULT_TIMEOUT)

    seconds before ToolTimeoutError (default: 30)

  • handler (Proc)

    lambda(agent, **args) -> Hash (required)

  • client_safe (Boolean) (defaults to: false)

    when +true+, the tool is dispatchable from a client-mode agent (one whose client has no master_key). Default +false+ — registered tools are master-key-only unless the author explicitly declares them safe for session-token contexts. The handler is responsible for routing through +agent.client+ with +agent.session_token+ rather than touching the master key directly.

Raises:

  • (ArgumentError)

    when required kwargs are missing or permission is invalid



1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
# File 'lib/parse/agent/tools.rb', line 1099

def register(name:, description:, parameters:, permission:, handler:,
             timeout: DEFAULT_TIMEOUT, output_schema: nil, category: "custom",
             client_safe: false)
  unless %i[readonly write admin].include?(permission)
    raise ArgumentError, "permission must be :readonly, :write, or :admin (got #{permission.inspect})"
  end
  raise ArgumentError, "handler must be a callable (Proc/lambda)" unless handler.respond_to?(:call)
  raise ArgumentError, "name is required" if name.nil?
  raise ArgumentError, "description is required" if description.nil? || description.to_s.empty?
  raise ArgumentError, "parameters is required" if parameters.nil?
  if output_schema && !output_schema.is_a?(Hash)
    raise ArgumentError, "output_schema must be a Hash (JSON Schema), got #{output_schema.class}"
  end
  category_str = category.to_s
  raise ArgumentError, "category must be a non-empty string" if category_str.empty?

  sym = name.to_sym
  # NEW-TOOLS-6: refuse names that collide with a builtin tool. The
  # dispatcher checks the per-process registry FIRST and only falls
  # through to the builtin when no entry is present, so a silently-
  # accepted registration named :query_class would entirely replace
  # the gated builtin — skipping assert_class_accessible!, the
  # COLLSCAN preflight, validate_keys!, the field allowlist, etc.
  # Refusing the registration at the boundary keeps the trust
  # boundary on the side where the gates already live.
  if TOOL_DEFINITIONS.key?(sym)
    raise ArgumentError,
          "tool name #{sym.inspect} collides with a built-in tool. " \
          "Built-in names: #{TOOL_DEFINITIONS.keys.sort.join(", ")}. " \
          "Pick a non-colliding name; built-ins enforce security gates " \
          "(class accessibility, field allowlist, COLLSCAN preflight) " \
          "that a custom registration would otherwise bypass."
  end
  definition = {
    category: category_str,
    name: sym.to_s,
    description: description.to_s,
    parameters: parameters,
  }
  # When the caller declares an output_schema we attach it to
  # the definition so it surfaces in the MCP `tools/list`
  # response as `outputSchema`. Per MCP 2025-06-18, a tool
  # carrying `outputSchema` SHOULD also emit `structuredContent`
  # on the result — the dispatcher does this automatically by
  # mirroring the handler's data Hash.
  definition[:output_schema] = output_schema if output_schema

  REGISTRY_MUTEX.synchronize do
    @registry[sym] = {
      definition:    definition,
      permission:    permission,
      timeout:       timeout.to_i,
      handler:       handler,
      output_schema: output_schema,
      client_safe:   client_safe == true,
    }
  end
  notify_subscribers
  nil
end

.registered_tools_for(permission) ⇒ Array<Symbol>

Returns registered tool names that are accessible at the given permission level.

Parameters:

  • permission (Symbol)

    :readonly, :write, or :admin

Returns:



1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
# File 'lib/parse/agent/tools.rb', line 1310

def registered_tools_for(permission)
  hierarchy = { readonly: 0, write: 1, admin: 2 }
  agent_level = hierarchy[permission] || 0
  REGISTRY_MUTEX.synchronize do
    @registry.select { |_name, entry|
      required = hierarchy[entry[:permission]] || 0
      agent_level >= required
    }.keys
  end
end

.reset_registry!Object

Clear all custom registrations, restoring builtins-only state. Intended for test suites.



1162
1163
1164
1165
1166
# File 'lib/parse/agent/tools.rb', line 1162

def reset_registry!
  REGISTRY_MUTEX.synchronize { @registry.clear }
  notify_subscribers
  nil
end

.reset_subscribers!Object

Remove all subscribers. Intended for test suites; do not call in application code because it will silently disable list-changed notifications for every connected stream.



1193
1194
1195
1196
# File 'lib/parse/agent/tools.rb', line 1193

def reset_subscribers!
  REGISTRY_MUTEX.synchronize { @subscribers.clear }
  nil
end

.resolve_aggregation_field(class_name, field, force_no_pointer: false) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Resolve a wire-format field name to its MongoDB aggregation form. Pointer fields are auto-prefixed with _p_ when the local Parse model class declares the field as a :pointer. createdAt/updatedAt are passed through verbatim — Parse Server's /aggregate endpoint translates those.



4063
4064
4065
4066
4067
4068
4069
4070
4071
4072
4073
4074
4075
4076
4077
4078
# File 'lib/parse/agent/tools.rb', line 4063

def resolve_aggregation_field(class_name, field, force_no_pointer: false)
  s = field.to_s
  return s if s.start_with?("_p_")
  return s if %w[createdAt updatedAt _created_at _updated_at].include?(s)
  klass = (Parse::Model.find_class(class_name) rescue nil) ||
          (Parse::Model.const_get(class_name) rescue nil)
  return s unless klass && klass.respond_to?(:fields) && klass.respond_to?(:field_map)
  # Resolve the Ruby symbol to the Parse wire name (e.g., :author_id → :authorId).
  wire = (klass.field_map[s.to_sym] || s).to_s
  return wire if force_no_pointer
  if klass.fields[s.to_sym] == :pointer || klass.fields[s] == :pointer
    "_p_#{wire}"
  else
    wire
  end
end

.resolve_group_operation!(operation, value_field) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns [op_key, accumulator_expression]. op_key is the LLM-facing name normalized ("average" → "avg"). value_field is required for sum/avg/min/max — its absence is a validation error.



3981
3982
3983
3984
3985
3986
3987
3988
3989
3990
3991
3992
3993
3994
3995
3996
3997
3998
# File 'lib/parse/agent/tools.rb', line 3981

def resolve_group_operation!(operation, value_field)
  op_raw = (operation || "count").to_s
  unless GROUP_OPERATIONS.key?(op_raw)
    raise Parse::Agent::ValidationError,
          "operation #{operation.inspect} is invalid. Must be one of " \
          "#{GROUP_OPERATIONS.keys.uniq.join(', ')}."
  end
  op_key = op_raw == "average" ? "avg" : op_raw
  if op_key == "count"
    [op_key, { "$sum" => 1 }]
  else
    if value_field.nil? || value_field.to_s.strip.empty?
      raise Parse::Agent::ValidationError,
            "operation '#{op_key}' requires value_field"
    end
    [op_key, { GROUP_OPERATIONS[op_raw] => "$__VALUE__" }] # placeholder; substituted by build_group_pipeline
  end
end

.resolve_pointer_target(parent_klass, pointer_field) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Resolve a one-hop pointer field name on parent_klass to its target Parse class name. Accepts both snake_case Ruby names and camelCase wire names (mirrors walk_pointer_path!).



2559
2560
2561
2562
2563
2564
2565
2566
# File 'lib/parse/agent/tools.rb', line 2559

def resolve_pointer_target(parent_klass, pointer_field)
  refs = parent_klass.references
  return nil unless refs
  seg_str   = pointer_field.to_s
  camel_str = seg_str.include?("_") ? seg_str.gsub(/_([a-z])/) { Regexp.last_match(1).upcase } : seg_str
  target = refs[seg_str.to_sym] || refs[seg_str] || refs[camel_str.to_sym] || refs[camel_str]
  target&.to_s
end

.resolve_tenant_scope!(agent, class_name) ⇒ Hash?

Resolve the effective tenant scope for a class and agent.

Returns nil when no agent_tenant_scope is declared (back-compat) or the bypass condition is satisfied. Returns { field: Symbol, value: Object } when a scope should be enforced. Raises Parse::Agent::AccessDenied when the agent has no tenant binding (from: returns nil) on a scoped class.

Parameters:

  • agent (Parse::Agent)

    the agent instance

  • class_name (String)

    the Parse class name

Returns:

Raises:



1615
1616
1617
# File 'lib/parse/agent/tools.rb', line 1615

def resolve_tenant_scope!(agent, class_name)
  Parse::Agent::MetadataRegistry.resolve_tenant_scope(class_name, agent)
end

.rewrite_pointer_columns!(obj, compressible) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Mutates obj in place: rewrites every _p_<field> key whose bare form is in compressible to the bare key, stripping the <ClassName>$ prefix from the value.



2821
2822
2823
2824
2825
2826
2827
2828
2829
2830
2831
2832
2833
2834
2835
2836
2837
2838
2839
2840
2841
2842
2843
2844
2845
2846
2847
2848
2849
# File 'lib/parse/agent/tools.rb', line 2821

def rewrite_pointer_columns!(obj, compressible)
  case obj
  when Hash
    rename_pairs = []
    obj.each do |k, v|
      ks = k.to_s
      if ks.start_with?("_p_") && v.is_a?(String)
        field = ks.sub(/\A_p_/, "")
        if compressible.key?(field) &&
           (m = POINTER_STORAGE_VALUE_RE.match(v)) &&
           m[1] == compressible[field]
          rename_pairs << [k, field, m[2]]
          next
        end
      end
      rewrite_pointer_columns!(v, compressible)
    end
    rename_pairs.each do |old_k, new_k, new_v|
      obj.delete(old_k)
      # Match the new key's type to the original to keep the
      # caller's expectations consistent (Symbol-keyed hashes
      # stay Symbol-keyed; String-keyed stay String-keyed).
      new_key = old_k.is_a?(Symbol) ? new_k.to_sym : new_k
      obj[new_key] = new_v
    end
  when Array
    obj.each { |v| rewrite_pointer_columns!(v, compressible) }
  end
end

.root_keys_excluding_operators(hash) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Like #keys_excluding_operators but returns the ROOT segment of each dotted key ("user.name" -> "user"), deduplicated. Used by stages whose introduced names are addressed by the downstream walker via split(".").first lookup.



2052
2053
2054
# File 'lib/parse/agent/tools.rb', line 2052

def root_keys_excluding_operators(hash)
  keys_excluding_operators(hash).map { |k| k.split(".").first }.uniq
end

.run_aggregation_for_group_tool!(agent, class_name:, pipeline:, tool:, apply_canonical_filter: true) ⇒ Hash

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Run the group/distinct pipeline through the same security gates as aggregate (class access, tenant scope, COLLSCAN preflight, timeout, hidden-class redaction) but skip the LLM-pipeline validator (we built the pipeline) and the projection walker (our pipeline references derived names like _id/value that are not class fields).

Returns:

  • (Hash)

    one of:

    • { rows: [...] } on success
    • { refused: true, reason: ..., ... } when COLLSCAN refused


4185
4186
4187
4188
4189
4190
4191
4192
4193
4194
4195
4196
4197
4198
4199
4200
4201
4202
4203
4204
4205
4206
4207
4208
4209
4210
4211
4212
4213
4214
4215
4216
4217
4218
4219
4220
4221
4222
4223
4224
4225
4226
4227
4228
4229
4230
4231
4232
4233
4234
# File 'lib/parse/agent/tools.rb', line 4185

def run_aggregation_for_group_tool!(agent, class_name:, pipeline:, tool:,
                                    apply_canonical_filter: true)
  scope = resolve_tenant_scope!(agent, class_name)
  scoped = apply_tenant_scope_to_pipeline(pipeline, scope)

  # TRACK-AGENT-7 split: per-agent filter is UNCONDITIONAL; canonical
  # filter is LLM-controllable via apply_canonical_filter:.
  scoped = apply_per_agent_filter_to_pipeline(scoped, class_name, agent: agent)

  # H3: apply the class's canonical filter (e.g. soft-delete exclusion)
  # so group_by / group_by_date / distinct exclude the same rows that
  # query_class and count_objects exclude by default. This matches the
  # behavior of the generic `aggregate` tool (see line ~2218).
  if apply_canonical_filter
    scoped = apply_canonical_filter_to_pipeline(scoped, class_name, agent: agent)
  end

  if Parse::Agent.refuse_collscan? &&
     !MetadataRegistry.allow_collscan?(class_name) &&
     (match_stage = scoped.first&.dig("$match"))&.any?
    refusal = collscan_preflight(agent, class_name, match_stage)
    return refusal if refusal
  end

  # Parse Server's REST aggregate endpoint does NOT enforce per-row
  # ACL — only the SDK's mongo-direct path applies the _rperm match
  # injection via Parse::ACLScope. So we must route through
  # mongo-direct for ANY scoped agent (session_token, acl_user,
  # acl_role), not just acl_user/acl_role. Master-key agents keep
  # the REST path because they've already opted out of ACL.
  use_direct = agent.respond_to?(:acl_scope?) && agent.acl_scope? &&
               defined?(Parse::MongoDB) && Parse::MongoDB.enabled?

  with_timeout(tool) do
    rows =
      if use_direct
        translated = Parse::Query.new(class_name).send(
          :translate_pipeline_for_direct_mongodb, scoped,
        )
        raw_rows = Parse::MongoDB.aggregate(class_name, translated, **agent.acl_scope_kwargs)
        raw_rows.map { |raw| Parse::MongoDB.convert_aggregation_document(raw) }
      else
        response = agent.client.aggregate_pipeline(class_name, scoped, **agent.request_opts)
        raise "#{tool} aggregation failed: #{response.error}" unless response.success?
        response.results
      end
    rows = redact_hidden_classes!(rows, agent: agent)
    { rows: rows }
  end
end

.scan_for_pointer_columns(obj, acc) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Walk obj collecting className observations into acc keyed by the bare field name (the _p_ prefix stripped). When a key has both _p_foo and foo shapes in the same hash, marks it as a collision so the rewrite pass skips it.



2774
2775
2776
2777
2778
2779
2780
2781
2782
2783
2784
2785
2786
2787
2788
2789
2790
2791
2792
2793
2794
2795
2796
2797
2798
2799
2800
2801
2802
2803
2804
2805
2806
2807
2808
2809
2810
2811
2812
2813
2814
# File 'lib/parse/agent/tools.rb', line 2774

def scan_for_pointer_columns(obj, acc)
  case obj
  when Hash
    # Identify in-hash collisions: both _p_foo and foo present.
    bare_keys = {}
    obj.each_key do |k|
      ks = k.to_s
      if ks.start_with?("_p_")
        bare = ks.sub(/\A_p_/, "")
        bare_keys[bare] ||= []
        bare_keys[bare] << :prefixed
      else
        bare_keys[ks] ||= []
        bare_keys[ks] << :bare
      end
    end
    obj.each do |k, v|
      ks = k.to_s
      if ks.start_with?("_p_") && v.is_a?(String) &&
         (m = POINTER_STORAGE_VALUE_RE.match(v))
        field = ks.sub(/\A_p_/, "")
        if bare_keys[field]&.include?(:bare)
          acc[field] = :collision
        else
          cur = acc[field]
          if cur == :collision
            # already poisoned
          elsif cur.is_a?(Hash)
            cur[m[1]] = true
          else
            acc[field] = { m[1] => true }
          end
        end
      else
        scan_for_pointer_columns(v, acc)
      end
    end
  when Array
    obj.each { |v| scan_for_pointer_columns(v, acc) }
  end
end

.sort_groups(pairs, sort_choice) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Sort a [[key, value], ...] list per the sort_choice.



4321
4322
4323
4324
4325
4326
4327
4328
4329
4330
# File 'lib/parse/agent/tools.rb', line 4321

def sort_groups(pairs, sort_choice)
  return pairs if sort_choice.nil?
  case sort_choice
  when "value_desc" then pairs.sort_by { |_, v| -sort_key_numeric(v) }
  when "value_asc"  then pairs.sort_by { |_, v|  sort_key_numeric(v) }
  when "key_desc"   then pairs.sort_by { |k, _| sort_key_for(k) }.reverse
  when "key_asc"    then pairs.sort_by { |k, _| sort_key_for(k) }
  else pairs
  end
end

.sort_key_for(value) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Stable comparable representation for mixed-type keys. nil sorts last (regardless of direction; callers reverse for desc).



4336
4337
4338
4339
4340
4341
4342
4343
# File 'lib/parse/agent/tools.rb', line 4336

def sort_key_for(value)
  case value
  when nil      then [1, ""]
  when Numeric  then [0, value]
  when String   then [0, value]
  else               [0, value.to_s]
  end
end

.sort_key_numeric(value) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4347
4348
4349
4350
4351
4352
4353
4354
4355
# File 'lib/parse/agent/tools.rb', line 4347

def sort_key_numeric(value)
  case value
  when Numeric then value
  when nil     then 0
  else
    n = Float(value) rescue 0
    n
  end
end

.stage_field_delta(stage) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Compute [introduced_fields, replaces_schema] for a single pipeline stage. The forward-pass walker uses this to evolve the available-fields state across stages.

replaces_schema: true means downstream stages can ONLY see the introduced fields (the source-class schema is gone); false means the introduced fields are added on top of whatever was already addressable.



1950
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
2030
2031
2032
2033
2034
2035
2036
# File 'lib/parse/agent/tools.rb', line 1950

def stage_field_delta(stage)
  return [[], false] unless stage.is_a?(Hash) && !stage.empty?
  op, value = stage.first
  case op.to_s
  when "$project"
    return [[], false] unless value.is_a?(Hash)
    # Exclusion-only form (`{x: 0, y: 0}`, or `{_id: 0}` alone)
    # keeps every other source field — schema is NOT replaced.
    # Anything else (any inclusion or compute) is schema-replacing
    # to the named keys. `_id` exclusion paired with other
    # inclusions stays inclusion-mode (the canonical
    # `{name: 1, _id: 0}` shape).
    if project_is_exclusion_only?(value)
      [[], false]
    else
      [project_introduced_roots(value), true]
    end
  when "$group"
    return [["_id"], true] unless value.is_a?(Hash)
    [(["_id"] | keys_excluding_operators(value)).uniq, true]
  when "$bucket", "$bucketAuto"
    return [["_id"], true] unless value.is_a?(Hash)
    keys = ["_id"]
    output = value["output"] || value[:output]
    if output.is_a?(Hash)
      keys |= keys_excluding_operators(output)
    else
      # Both $bucket and $bucketAuto default to outputting a
      # `count` field when no explicit `output` is supplied.
      keys << "count"
    end
    [keys, true]
  when "$replaceRoot", "$replaceWith"
    # The new shape comes from an arbitrary expression. We can't
    # statically know its top-level keys without resolving
    # `$user`-style dereferences. Mark schema-replaced with no
    # known introduced fields — downstream stages can only
    # reference ALWAYS_KEEP_FIELDS via the source check if it's
    # still alive, which it isn't post-replace. Practical effect:
    # field references after $replaceRoot in an allowlisted class
    # will fail. Acceptable; replaceRoot is rarely followed by
    # further restricted reads.
    [[], true]
  when "$addFields", "$set"
    # Root-normalize the introduced names: `$addFields { "user.x": ... }`
    # should register `user` (the root the downstream walker
    # looks up via split(".").first), not the literal dotted path.
    # Mirrors the $project root-normalization.
    return [[], false] unless value.is_a?(Hash)
    [root_keys_excluding_operators(value), false]
  when "$lookup", "$graphLookup", "$unionWith"
    # $unionWith concatenates rather than embedding, so it has no
    # `as`; the safe-default `[]` from a nil as_name is correct
    # for that variant. $lookup and $graphLookup add the `as`
    # field as a new array on each existing doc.
    as_name = value.is_a?(Hash) ? (value["as"] || value[:as]) : nil
    [as_name ? [as_name.to_s] : [], false]
  when "$facet"
    [value.is_a?(Hash) ? value.keys.map(&:to_s) : [], true]
  when "$sortByCount"
    [%w[_id count], true]
  when "$count"
    # `$count: ""` is rejected by MongoDB server-side but we
    # don't want the empty string registered as an available
    # field — downstream `$match { "": ... }` would then short-
    # circuit on the empty-root guard rather than failing the
    # allowlist check cleanly. Treat as a no-op.
    [value.is_a?(String) && !value.empty? ? [value] : [], true]
  when "$unwind"
    # $unwind's `includeArrayIndex: "idx"` adds a new top-level
    # field. The unwound path itself is preserved (as scalars),
    # so this is schema-extending.
    if value.is_a?(Hash) && (idx = value["includeArrayIndex"] || value[:includeArrayIndex])
      [[idx.to_s], false]
    else
      [[], false]
    end
  when "$setWindowFields", "$fill"
    # Both stages declare new fields via an `output:` map
    # (`{output: { newField: {...} }}`). Root-normalize same as
    # $project and $addFields.
    out = value.is_a?(Hash) ? (value["output"] || value[:output]) : nil
    out.is_a?(Hash) ? [root_keys_excluding_operators(out), false] : [[], false]
  else
    [[], false]
  end
end

.stringify_export_value(value) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



4713
4714
4715
4716
4717
4718
4719
4720
4721
4722
# File 'lib/parse/agent/tools.rb', line 4713

def stringify_export_value(value)
  case value
  when nil           then ""
  when String        then value
  when Hash, Array   then value.to_json
  when Time, DateTime then value.iso8601
  when Date          then value.to_s
  else                    value.to_s
  end
end

.subscribe { ... } ⇒ Proc

Subscribe to registry-changed events. The block is invoked with no arguments after every register or reset_registry! call. Returns a Proc that, when called, deregisters the subscriber. Use to drive MCP notifications/tools/list_changed broadcasts from the transport layer.

The subscriber callback runs on the thread that triggered the mutation. Callbacks must be fast and non-blocking; long work belongs in a thread or queue that the callback posts to. Exceptions raised by a subscriber are caught and logged via Kernel#warn — one bad subscriber cannot break the registry or prevent other subscribers from firing.

Yields:

  • no arguments

Returns:

  • (Proc)

    call with no arguments to deregister.

Raises:

  • (ArgumentError)


1183
1184
1185
1186
1187
1188
# File 'lib/parse/agent/tools.rb', line 1183

def subscribe(&block)
  raise ArgumentError, "block required" unless block

  REGISTRY_MUTEX.synchronize { @subscribers << block }
  -> { REGISTRY_MUTEX.synchronize { @subscribers.delete(block) } }
end

.timeout_for(name) ⇒ Integer

Resolve the timeout for a tool (registered overlay wins over builtin table).

Parameters:

Returns:

  • (Integer)

    seconds



1234
1235
1236
1237
1238
1239
# File 'lib/parse/agent/tools.rb', line 1234

def timeout_for(name)
  sym = name.to_sym
  entry = REGISTRY_MUTEX.synchronize { @registry[sym] }
  return entry[:timeout] if entry
  TOOL_TIMEOUTS[sym] || DEFAULT_TIMEOUT
end

.validate_export_column_path!(path) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

NEW-TOOLS-8: validate an export_data column path against the identifier regex and the sensitive-prefix denylist. Refuses internal Parse-Server fields (_hashed_password, _session_token, authData.*, ACL/permission columns, etc.) at the parameter boundary, regardless of whether the export class declares an agent_fields allowlist.



4649
4650
4651
4652
4653
4654
4655
4656
4657
4658
4659
4660
4661
4662
4663
4664
4665
4666
4667
4668
4669
4670
4671
4672
4673
4674
4675
4676
4677
4678
4679
4680
4681
4682
# File 'lib/parse/agent/tools.rb', line 4649

def validate_export_column_path!(path)
  s = path.to_s
  unless EXPORT_COLUMN_PATH_RE.match?(s)
    raise Parse::Agent::ValidationError,
          "column path #{s.inspect} is invalid. Each path must start with a " \
          "letter and contain only letters, digits, underscores, and dots " \
          "(max 128 chars). Underscore-prefixed names (e.g. _hashed_password, " \
          "_session_token, _rperm) are not permitted."
  end
  # Per-segment underscore check: refuse "authData._provider" /
  # "x._hashed_password" where the root passes but a sub-segment
  # is an internal field.
  if s.include?(".")
    s.split(".").each do |segment|
      next if segment.empty?
      next if /\A[A-Za-z]/.match?(segment)
      raise Parse::Agent::ValidationError,
            "column path #{s.inspect} has an underscore-prefixed " \
            "segment (#{segment.inspect}). Each dotted path segment " \
            "must start with a letter."
    end
  end
  # Root-level sensitive-prefix denylist. Catches `_hashed_password`
  # in addition to obvious top-level entries like `authData`.
  root = s.split(".").first
  if EXPORT_DENIED_COLUMN_PREFIXES.include?(root)
    raise Parse::Agent::ValidationError,
          "column path #{s.inspect} references a denied field root " \
          "(#{root.inspect}). Internal Parse-Server columns " \
          "(_hashed_password, _session_token, authData, ACL, _rperm, _wperm, " \
          "etc.) cannot be exported through the agent surface."
  end
  s
end

.validate_group_date_interval!(interval) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



3950
3951
3952
3953
3954
3955
3956
3957
3958
# File 'lib/parse/agent/tools.rb', line 3950

def validate_group_date_interval!(interval)
  sym = interval.to_s.downcase.to_sym
  unless GROUP_DATE_INTERVALS.include?(sym.to_s)
    raise Parse::Agent::ValidationError,
          "interval #{interval.inspect} is invalid. Must be one of " \
          "#{GROUP_DATE_INTERVALS.join(', ')}."
  end
  sym
end

.validate_group_field!(field, name:) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Validate a field-name parameter against the identifier shape.



3933
3934
3935
3936
3937
3938
3939
3940
3941
3942
3943
3944
3945
3946
# File 'lib/parse/agent/tools.rb', line 3933

def validate_group_field!(field, name:)
  if field.nil? || field.to_s.strip.empty?
    raise Parse::Agent::ValidationError, "#{name} is required"
  end
  s = field.to_s
  # Allow the optional _p_ prefix (some LLMs may pass storage form)
  check = s.start_with?("_p_") ? s.sub(/\A_p_/, "") : s
  unless /\A[A-Za-z][A-Za-z0-9_]{0,127}\z/.match?(check)
    raise Parse::Agent::ValidationError,
          "#{name} #{field.inspect} is not a valid identifier. " \
          "Must start with a letter and contain only letters, digits, and underscores."
  end
  s
end

.validate_keys!(keys) ⇒ Array<String>?

Validate a keys: projection array before forwarding to Parse Server. Same identifier rules as validate_include! — entries must start with a letter and use only [A-Za-z0-9_.], max 128 chars. This explicitly refuses leading-underscore names like _hashed_password, _session_token, _email_verify_token, _perishable_token, _rperm, _wperm, and authData.* subkey-paths starting with _.

Closes NEW-TOOLS-5: in master-key deployments (default agent auth), Parse Server returns these internal fields when explicitly requested via keys:. The agent_fields allowlist intersection catches them on classes that declare an allowlist, but a class WITHOUT an allowlist would forward _hashed_password through. Refuse at the parameter-validation boundary so the field never reaches Parse Server in the projection list.

Parameters:

Returns:

  • (Array<String>, nil)

    the original array (stringified) when valid

Raises:



5216
5217
5218
5219
5220
5221
5222
5223
5224
5225
5226
5227
5228
5229
5230
5231
5232
5233
5234
5235
5236
5237
5238
5239
5240
5241
5242
5243
5244
5245
5246
5247
5248
5249
5250
5251
5252
5253
5254
5255
5256
5257
# File 'lib/parse/agent/tools.rb', line 5216

def validate_keys!(keys)
  return nil if keys.nil?

  unless keys.is_a?(Array)
    raise Parse::Agent::ValidationError,
          "keys must be an Array of field-name strings (got: #{keys.class})"
  end

  if keys.size > MAX_KEYS_FIELDS
    raise Parse::Agent::ValidationError,
          "keys array exceeds the #{MAX_KEYS_FIELDS}-field limit " \
          "(#{keys.size} entries provided). Omit keys: to fetch all " \
          "fields, or narrow the projection."
  end

  pattern = /\A[A-Za-z][A-Za-z0-9_.]{0,127}\z/
  keys.map do |entry|
    s = entry.to_s
    unless s.match?(pattern)
      raise Parse::Agent::ValidationError,
            "keys entry #{entry.inspect} is invalid. Each entry must start with a " \
            "letter and contain only letters, digits, underscores, and dots " \
            "(max 128 chars). Underscore-prefixed names (e.g. _hashed_password, " \
            "_session_token, _rperm) are not permitted."
    end
    # Additional per-segment check for dotted paths: refuse subkeys
    # that begin with an underscore (e.g. "authData._provider").
    # Parse's wire format has the path as path1.path2.path3 — every
    # segment must independently start with a letter.
    if s.include?(".")
      s.split(".").each do |segment|
        next if segment.empty?
        next if /\A[A-Za-z]/.match?(segment)
        raise Parse::Agent::ValidationError,
              "keys entry #{entry.inspect} has an underscore-prefixed " \
              "segment (#{segment.inspect}). Each dotted path segment must " \
              "start with a letter."
      end
    end
    s
  end
end

.validate_timezone!(tz) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Accept IANA tz names and fixed offsets. Tolerant: empty / nil returns nil (UTC). Rejects anything with characters outside the tz alphabet so the value can never smuggle a pipeline operator.



3965
3966
3967
3968
3969
3970
3971
3972
3973
3974
# File 'lib/parse/agent/tools.rb', line 3965

def validate_timezone!(tz)
  return nil if tz.nil? || tz.to_s.strip.empty?
  s = tz.to_s.strip
  unless /\A[A-Za-z_][A-Za-z_0-9+\-\/]{0,63}\z|\A[+\-]\d{2}:?\d{2}\z/.match?(s)
    raise Parse::Agent::ValidationError,
          "timezone #{tz.inspect} is invalid. Use an IANA name (e.g. 'America/New_York') " \
          "or a fixed offset (e.g. '+05:00')."
  end
  s
end

.walk_and_redact(obj, hidden, agent: nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



2647
2648
2649
2650
2651
2652
2653
2654
2655
2656
2657
2658
2659
2660
2661
2662
2663
2664
2665
2666
2667
2668
2669
2670
2671
2672
2673
2674
2675
2676
2677
2678
2679
2680
2681
2682
2683
2684
2685
2686
2687
2688
2689
2690
2691
2692
2693
2694
2695
2696
2697
2698
2699
2700
2701
2702
2703
2704
2705
2706
# File 'lib/parse/agent/tools.rb', line 2647

def walk_and_redact(obj, hidden, agent: nil)
  case obj
  when Hash
    cn = obj["className"] || obj[:className]
    if cn
      cn_str = cn.to_s
      if hidden.include?(cn_str) ||
         (agent && !agent.class_filter_permits?(cn_str))
        if obj["className"]
          return { "className" => cn_str, "__redacted" => true }
        else
          return { className: cn_str, __redacted: true }
        end
      end
    end
    obj.each_with_object({}) do |(k, v), acc|
      # Process-level field floor: drop INTERNAL_FIELDS_DENYLIST keys
      # (sessionToken, _hashed_password, _session_token, _auth_data*,
      # _rperm/_wperm, etc.) from every hash node, regardless of class.
      # This is the credential-stripping layer that must hold even when
      # an operator has deliberately unhidden the containing class.
      # Refused at this depth-walking pass so nested includes / $lookup
      # results / pointer subdocuments are all covered uniformly.
      k_str = k.to_s
      next if Parse::PipelineSecurity::INTERNAL_FIELDS_DENYLIST.include?(k_str)
      next if Parse::PipelineSecurity::INTERNAL_FIELDS_PREFIX_DENYLIST.any? { |prefix| k_str.start_with?(prefix) }

      # Pointer-storage string values (`"<ClassName>$<objectId>"`) that
      # reference a hidden class are scrubbed by replacing the value
      # with a redacted placeholder regardless of the key name.
      #
      # The original guard only fired when the key started with `_p_`,
      # which is the Parse-on-Mongo storage column convention. A
      # pipeline that re-projects or groups such a column under an
      # arbitrary output key (e.g. `$project { "leak" => "$_p_secret" }`
      # or `$group { "_id" => "$_p_secret" }`) produces rows like
      # `{"leak" => "HiddenClass$abc123"}` where the key is NOT `_p_*`.
      # The fix: check EVERY string value against the regex and scrub
      # whenever the extracted class name is in the hidden set. Visible
      # class pointer strings are unaffected because `hidden.include?`
      # only fires for registered-hidden classes.
      if v.is_a?(String) &&
         (m = POINTER_STORAGE_VALUE_RE.match(v)) &&
         (hidden.include?(m[1]) ||
          (agent && !agent.class_filter_permits?(m[1])))
        acc[k] = if k.is_a?(String)
            { "className" => m[1], "__redacted" => true }
          else
            { className: m[1], __redacted: true }
          end
      else
        acc[k] = walk_and_redact(v, hidden, agent: agent)
      end
    end
  when Array
    obj.map { |v| walk_and_redact(v, hidden, agent: agent) }
  else
    obj
  end
end

.walk_pipeline_stage!(stage, permitted_fields:, agent: nil, source_permitted: nil, available: [], source_addressable: true) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Per-stage policy check. permitted_fields is the effective set at this stage's position (source ∪ available so far, or just available after a schema-replacing upstream stage). The source_permitted / available / source_addressable kwargs are forwarded from #walk_pipeline_with_state! so that sub-pipeline recursion ($facet branches, $lookup.pipeline) can spawn their own forward-passes with the right starting state. Defaults preserve backward compatibility for direct callers that pass only permitted_fields: (the existing single-stage test surface).



2120
2121
2122
2123
2124
2125
2126
2127
2128
2129
2130
2131
2132
2133
2134
2135
2136
2137
2138
2139
2140
2141
2142
2143
2144
2145
2146
2147
2148
2149
2150
2151
2152
2153
2154
2155
2156
2157
2158
2159
2160
2161
2162
2163
2164
2165
2166
2167
2168
2169
2170
2171
2172
2173
2174
2175
2176
2177
2178
2179
2180
2181
2182
2183
2184
2185
2186
2187
2188
2189
2190
2191
2192
2193
2194
2195
2196
2197
2198
2199
2200
2201
2202
2203
2204
2205
2206
2207
2208
2209
2210
2211
2212
2213
2214
2215
2216
2217
2218
2219
2220
2221
2222
2223
2224
2225
2226
2227
2228
2229
2230
2231
2232
2233
2234
2235
2236
2237
2238
2239
2240
2241
2242
2243
2244
2245
2246
2247
2248
2249
2250
2251
2252
2253
2254
2255
2256
2257
2258
2259
2260
2261
2262
2263
2264
2265
2266
2267
2268
2269
2270
2271
2272
2273
2274
2275
2276
2277
2278
2279
2280
2281
2282
2283
2284
2285
2286
2287
2288
2289
2290
2291
2292
2293
2294
2295
2296
2297
2298
2299
2300
2301
2302
2303
2304
2305
2306
2307
2308
2309
2310
2311
2312
2313
2314
2315
2316
2317
# File 'lib/parse/agent/tools.rb', line 2120

def walk_pipeline_stage!(stage, permitted_fields:, agent: nil,
                         source_permitted: nil, available: [],
                         source_addressable: true)
  return unless stage.is_a?(Hash)
  stage.each do |op, value|
    case op.to_s
    when "$lookup", "$graphLookup", "$unionWith"
      target = value.is_a?(Hash) ?
        (value["from"] || value[:from] || value["coll"] || value[:coll]) : nil
      target_str = nil
      if target
        target_str = target.to_s
        # Hard structural denylist: underscore-prefixed collections
        # outside the four SDK system classes (_User/_Role/_Installation/
        # _Session) are Parse Server administrative state and never
        # reachable from an Agent pipeline, regardless of per-Agent
        # MetadataRegistry policy. Catches `from: "_Hooks"`,
        # `from: "_SCHEMA"`, `from: "_GraphQLConfig"`, etc.
        if target_str.start_with?("_") &&
           !Parse::PipelineSecurity::ALLOWED_UNDERSCORE_COLLECTIONS.include?(target_str)
          raise Parse::Agent::AccessDenied.new(target_str)
        end
        if Parse::Agent::MetadataRegistry.hidden?(target_str)
          raise Parse::Agent::AccessDenied.new(target_str)
        end
        # Per-agent class filter — refuse $lookup.from / $unionWith.coll /
        # $graphLookup.from targets outside the agent's allowlist. Without
        # this gate, an agent with `classes: { only: [Post] }` could write
        # `$lookup: { from: "_User" }` from inside an allowed Post pipeline
        # and pull cross-class data the operator deliberately excluded.
        if agent && !agent.class_filter_permits?(target_str)
          raise Parse::Agent::AccessDenied.new(
            target_str,
            "Pipeline target '#{target_str}' is outside this agent's classes: allowlist",
            kind: :class_filter
          )
        end
        # CLP gate for joined classes. A $lookup / $graphLookup /
        # $unionWith into a class whose CLP refuses `find` for the
        # agent's scope must be refused — the join would otherwise
        # surface rows the agent can't read independently. Master-
        # key bypasses (acl_permission_strings nil).
        if agent && agent.respond_to?(:acl_permission_strings)
          perms = agent.acl_permission_strings
          unless Parse::CLPScope.permits?(target_str, :find, perms)
            raise Parse::Agent::AccessDenied.new(
              target_str,
              "Pipeline target '#{target_str}' CLP refuses find for the agent's scope.",
              kind: :clp_denied,
            )
          end
        end
      end
      # NEW-TOOLS-4: re-derive permitted_fields for the JOINED class
      # so its agent_fields allowlist applies to sub-pipeline stages.
      # The previous behavior passed `permitted_fields: nil`, which
      # meant a class with a tight allowlist (e.g. `Patient`
      # declaring `agent_fields :name, :id`) silently leaked
      # `Patient.ssn` through `$lookup.pipeline: [{ $project: { ssn: 1 } }]`.
      # MetadataRegistry.field_allowlist already merges in
      # ALWAYS_KEEP_FIELDS (objectId / createdAt / updatedAt), so a
      # join can carry the standard envelope without further work.
      sub = value.is_a?(Hash) ? (value["pipeline"] || value[:pipeline]) : nil
      if sub
        # The lookup sub-pipeline runs against the FOREIGN class's
        # documents — fresh forward-pass state, foreign allowlist.
        # ALWAYS_KEEP_FIELDS gets merged into the foreign source
        # set via compute_source_allowlist_for, so a join can
        # carry the standard envelope without further work.
        sub_source = target_str ? compute_source_allowlist_for(target_str) : nil
        walk_pipeline_with_state!(
          Array(sub),
          source_permitted: sub_source,
          available: [],
          source_addressable: true,
          agent: agent,
        )
      end
    when "$facet"
      next unless value.is_a?(Hash)
      # Each $facet branch processes the SAME input docs that
      # arrived at the $facet stage — so each branch starts with
      # the pre-facet state and evolves independently. Without
      # the state pass, a branch that did $group → $match on the
      # synthetic accumulator would have failed the way the
      # top-level pipeline used to.
      value.each_value do |sub_pipeline|
        walk_pipeline_with_state!(
          Array(sub_pipeline),
          source_permitted: source_permitted,
          available: available.dup,
          source_addressable: source_addressable,
          agent: agent,
        )
      end
    when "$project"
      # Inclusion keys (`name: 1`) reference source fields →
      # check the key against the effective allowlist. Exclusion
      # keys (`x: 0`) are NOT reads — skip the check (the field
      # is being dropped on output, not accessed). Compute/rename
      # (`x: <expr>`) introduces a new key; walk the expression
      # for source references AND screen the key against the
      # internal-column denylist so a downstream `$match { x: ... }`
      # cannot be lured into reading a column whose name mirrors
      # an internal Parse Server field.
      next unless permitted_fields && value.is_a?(Hash)
      value.each do |field, expr|
        fname = field.to_s
        next if fname.empty? || fname.start_with?("$")
        if projection_is_inclusion?(expr)
          root = fname.split(".").first
          unless permitted_fields.include?(root)
            raise_allowlist_refusal!("field", fname, root, permitted_fields)
          end
        elsif projection_is_exclusion?(expr)
          next
        else
          assert_output_key_not_internal!(fname, "$project")
          check_expression_for_restricted_fields!(expr, permitted_fields)
        end
      end
    when "$addFields", "$set"
      # Output keys are new names — don't gate against the source
      # allowlist (they're not references). The expression VALUES
      # may reference source fields; walk those. Defense-in-depth:
      # refuse output names mirroring internal Parse Server columns
      # so a downstream stage cannot surface a misleading key.
      next unless permitted_fields && value.is_a?(Hash)
      value.each do |field, expr|
        assert_output_key_not_internal!(field.to_s, op.to_s)
        check_expression_for_restricted_fields!(expr, permitted_fields)
      end
    when "$unset"
      # Server-side unset cannot leak data, but the named field is
      # still part of the projection contract. Allow.
      next
    when "$replaceRoot", "$replaceWith"
      next unless permitted_fields && value.is_a?(Hash)
      source = value["newRoot"] || value[:newRoot] || value
      check_expression_for_restricted_fields!(source, permitted_fields)
    when "$group"
      # $group with a custom _id formula can read any field. When an
      # allowlist is in force we treat referenced fields the same as
      # projection inputs.
      next unless permitted_fields
      check_expression_for_restricted_fields!(value, permitted_fields)
    when "$match"
      # $match expressions can reference any field. Without the
      # field-name check, an attacker can run a boolean oracle on
      # +_hashed_password+ or any other +agent_hidden+ /
      # non-allowlisted column by bisecting via +$regex+ and
      # reading the row count delta. Apply the same field-name
      # restriction as projection stages.
      next unless permitted_fields && value.is_a?(Hash)
      check_match_keys_for_restricted_fields!(value, permitted_fields)
    when "$sort"
      next unless permitted_fields && value.is_a?(Hash)
      value.each_key do |field|
        fname = field.to_s
        next if fname.empty? || fname.start_with?("$")
        root = fname.split(".").first
        unless permitted_fields.include?(root)
          raise_allowlist_refusal!("sort field", fname, root, permitted_fields)
        end
      end
    when "$sortByCount"
      # $sortByCount's value is an expression (typically a
      # $-prefixed field reference like "$status", or a computed
      # expression Hash). Route it through the expression walker
      # so the grouping target is allowlist-checked — the prior
      # `value.is_a?(Hash)` guard silently skipped string values,
      # letting `$sortByCount: "$ssn"` bypass enforcement entirely.
      next unless permitted_fields
      check_expression_for_restricted_fields!(value, permitted_fields)
    when "$bucket", "$bucketAuto"
      next unless permitted_fields && value.is_a?(Hash)
      check_expression_for_restricted_fields!(value["groupBy"] || value[:groupBy], permitted_fields)
      output = value["output"] || value[:output]
      check_expression_for_restricted_fields!(output, permitted_fields) if output
    when "$unwind"
      # $unwind operates on a path. When an allowlist is in force,
      # the path's root segment must be permitted.
      next unless permitted_fields
      path = value.is_a?(Hash) ? (value["path"] || value[:path]) : value
      if path.is_a?(String) && path.start_with?("$")
        ref = path.sub(/\A\$/, "").split(".").first
        if ref && !ref.empty? && !permitted_fields.include?(ref)
          raise_allowlist_refusal!("unwind path", path, ref, permitted_fields)
        end
      end
    when "$redact"
      # $redact lets the pipeline reference arbitrary fields via
      # $$KEEP/$$PRUNE expressions. Defense-in-depth: walk it.
      next unless permitted_fields
      check_expression_for_restricted_fields!(value, permitted_fields)
    end
  end
end

.walk_pipeline_with_state!(pipeline, source_permitted:, available:, source_addressable:, agent:) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Forward-pass walker. Maintains two pieces of state across stages:

  • available — fields introduced by upstream stages (Array)
  • source_addressable — whether the source-class allowlist still applies. Flipped false by schema-replacing stages ($project, $group, $bucket*, $replaceRoot, $replaceWith, $facet, $sortByCount, $count). Each stage is walked with the effective permitted set computed at its position; afterward, the field-delta for that stage updates the state for downstream stages.


1903
1904
1905
1906
1907
1908
1909
1910
1911
1912
1913
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
# File 'lib/parse/agent/tools.rb', line 1903

def walk_pipeline_with_state!(pipeline, source_permitted:, available:, source_addressable:, agent:)
  return unless pipeline.is_a?(Array)
  pipeline.each do |stage|
    effective = effective_permitted_set(source_permitted, available, source_addressable)
    walk_pipeline_stage!(
      stage,
      permitted_fields: effective,
      agent: agent,
      source_permitted: source_permitted,
      available: available,
      source_addressable: source_addressable,
    )
    introduced, replaces = stage_field_delta(stage)
    if replaces
      available = introduced.dup
      source_addressable = false
    else
      available = (available | introduced)
    end
  end
end

.walk_pointer_path!(klass, segments, agent: nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



2570
2571
2572
2573
2574
2575
2576
2577
2578
2579
2580
2581
2582
2583
2584
2585
2586
2587
2588
2589
2590
2591
2592
2593
2594
2595
2596
2597
2598
2599
2600
2601
2602
2603
2604
2605
2606
2607
2608
# File 'lib/parse/agent/tools.rb', line 2570

def walk_pointer_path!(klass, segments, agent: nil)
  current = klass
  segments.each do |seg|
    refs = current.respond_to?(:references) ? current.references : nil
    return unless refs
    # `references` is keyed by the Parse field name (camelCase). Accept
    # both forms: snake_case as Ruby methods are usually named, and
    # camelCase as it appears on the wire and in the schema.
    seg_str   = seg.to_s
    camel_str = seg_str.include?("_") ? seg_str.gsub(/_([a-z])/) { Regexp.last_match(1).upcase } : seg_str
    target = refs[seg_str.to_sym] || refs[seg_str] || refs[camel_str.to_sym] || refs[camel_str]
    return unless target
    # target is the Parse class name as a String
    target_name = target.to_s
    if Parse::Agent::MetadataRegistry.hidden?(target_name)
      raise Parse::Agent::AccessDenied.new(target_name)
    end
    # Per-agent allowlist: refuse pointer-include resolution that
    # crosses into a class outside the agent's `classes:` set. Without
    # this check, an agent with `classes: { only: [Post] }` could
    # follow `include: ["author.session"]` and pull `_User` + `_Session`
    # rows into the response — defeating the narrowing the operator
    # declared. Strict-hidden master-key-except classes still apply
    # the registry gate above; this is the per-agent overlay.
    if agent && !agent.class_filter_permits?(target_name)
      raise Parse::Agent::AccessDenied.new(
        target_name,
        "Pointer-include target '#{target_name}' is outside this agent's classes: allowlist",
        kind: :class_filter
      )
    end
    current = begin
        Parse::Model.find_class(target_name)
      rescue StandardError
        nil
      end
    return unless current
  end
end

Instance Method Details

#aggregate(agent, class_name:, pipeline:, rewrite_lookups: nil, compact_pointers: true, apply_canonical_filter: true, mongo_direct: AGGREGATE_DEFAULT_MONGO_DIRECT, **_kwargs) ⇒ Object



3483
3484
3485
3486
3487
3488
3489
3490
3491
3492
3493
3494
3495
3496
3497
3498
3499
3500
3501
3502
3503
3504
3505
3506
3507
3508
3509
3510
3511
3512
3513
3514
3515
3516
3517
3518
3519
3520
3521
3522
3523
3524
3525
3526
3527
3528
3529
3530
3531
3532
3533
3534
3535
3536
3537
3538
3539
3540
3541
3542
3543
3544
3545
3546
3547
3548
3549
3550
3551
3552
3553
3554
3555
3556
3557
3558
3559
3560
3561
3562
3563
3564
3565
3566
3567
3568
3569
3570
3571
3572
3573
3574
3575
3576
3577
3578
3579
3580
3581
3582
3583
3584
3585
3586
3587
3588
3589
3590
3591
3592
3593
3594
3595
3596
3597
3598
3599
3600
3601
3602
3603
3604
3605
3606
3607
3608
3609
3610
3611
3612
3613
3614
3615
3616
3617
3618
3619
3620
3621
3622
3623
3624
3625
3626
3627
3628
3629
3630
3631
3632
3633
3634
3635
3636
3637
3638
3639
3640
3641
3642
3643
3644
3645
3646
# File 'lib/parse/agent/tools.rb', line 3483

def aggregate(agent, class_name:, pipeline:, rewrite_lookups: nil, compact_pointers: true,
              apply_canonical_filter: true, mongo_direct: AGGREGATE_DEFAULT_MONGO_DIRECT,
              **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  # SECURITY: Validate pipeline BEFORE execution.
  # This blocks dangerous stages like $out, $merge, $function.
  PipelineValidator.validate!(pipeline)
  # And refuse any cross-class reference into a hidden class via
  # $lookup / $graphLookup / $unionWith.from, plus enforce the
  # class's agent_fields allowlist on projection-style stages
  # ($project, $addFields, $set, $unset, $replaceRoot). Without this
  # the top-level assert_class_accessible! check is bypassable.
  enforce_pipeline_access_policy!(class_name, pipeline, agent: agent)

  # Auto-rewrite LLM-style $lookup stages into Parse-on-Mongo column
  # form AFTER access policy has run on the LLM's original (logical)
  # names. MetadataRegistry.hidden? canonicalizes `User` -> `_User`
  # so the access check is correct regardless, but running the
  # rewrite after keeps the audit-log shape identical to the LLM's
  # submission. Uses fallback: :preserve, so stages without a
  # parse_reference-equipped target pass through unchanged.
  pipeline = Parse::LookupRewriter.auto_rewrite(
    pipeline, class_name: class_name, enabled: rewrite_lookups,
  )

  # Tenant scope enforcement: prepend a $match stage at index 0.
  # Done after pipeline validation so the injected stage doesn't
  # interfere with the validator's denylist walk.
  scope              = resolve_tenant_scope!(agent, class_name)
  scoped_pipeline    = apply_tenant_scope_to_pipeline(pipeline, scope)

  # Per-agent filter (declared via Parse::Agent.new(filters: ...)) is
  # UNCONDITIONAL — TRACK-AGENT-7 split. No LLM kwarg can drop it.
  scoped_pipeline = apply_per_agent_filter_to_pipeline(scoped_pipeline, class_name, agent: agent)

  # Canonical filter (per-class, declared via agent_canonical_filter).
  # Prepended as a $match stage so the pipeline starts from the
  # "valid state" subset of the class — closes the silently-suspect
  # counts gap where an LLM dropping to aggregate would otherwise
  # include soft-deleted / hidden rows that query_class excludes.
  # LLM-controllable via apply_canonical_filter:.
  if apply_canonical_filter
    scoped_pipeline = apply_canonical_filter_to_pipeline(scoped_pipeline, class_name, agent: agent)
  end

  # Auto-inject a terminal $limit so the LLM cannot accidentally pull
  # a high-cardinality group result into context. If the caller's
  # pipeline already ends with $limit / $count, we trust their bound.
  effective_pipeline, auto_limited = ensure_aggregate_terminal_limit(scoped_pipeline)

  # COLLSCAN pre-flight check (Feature 3):
  # Extract a leading $match stage as the implicit "where" for aggregations.
  # If the pipeline doesn't begin with $match, skip pre-flight — the caller
  # is doing a deliberate scan-then-reduce and refusing would be hostile.
  if Parse::Agent.refuse_collscan? &&
     !MetadataRegistry.allow_collscan?(class_name) &&
     (match_stage = effective_pipeline.first&.dig("$match"))&.any?

    refusal = collscan_preflight(agent, class_name, match_stage)
    return refusal if refusal
  end

  # Route either through direct MongoDB (default) or the Parse Server
  # REST aggregate endpoint. The direct path additionally runs the
  # pipeline through the SDK's field-reference rewriter, which is
  # what closes the `$cond` / `$expr` / `$switch` rewrite holes that
  # the server route would otherwise inherit from Parse Server.
  use_mongo_direct = mongo_direct &&
                     defined?(Parse::MongoDB) &&
                     Parse::MongoDB.enabled?

  # Parse Server's REST aggregate endpoint does NOT enforce per-row
  # ACL — only the SDK's mongo-direct path applies the `_rperm`
  # `$match` injection via Parse::ACLScope. For a scoped agent
  # (session_token / acl_user / acl_role), the caller's
  # `mongo_direct: false` would silently bypass the agent's
  # declared scope; auto-promote to mongo-direct so the ACLScope
  # enforcement runs. Master-key agents keep their REST path
  # (no ACL enforcement was claimed in the first place).
  if !use_mongo_direct && agent.respond_to?(:acl_scope?) && agent.acl_scope? &&
     defined?(Parse::MongoDB) && Parse::MongoDB.enabled?
    use_mongo_direct = true
  end

  with_timeout(:aggregate) do
    results = if use_mongo_direct
        translated = Parse::Query.new(class_name).send(
          :translate_pipeline_for_direct_mongodb, effective_pipeline,
        )
        # Forward the agent's auth posture into Parse::MongoDB.aggregate
        # so its ACLScope layer runs in the right mode. Without this,
        # ACLScope's public-fallback would emit a SECURITY banner on
        # every agent call AND inject an `_rperm`-only `$match` —
        # masking rows that the agent's own class/field/tenant/canonical
        # gates already authorize the agent to see. LLM-supplied auth
        # kwargs are NOT forwarded here (the tool signature swallows
        # unknown kwargs into `**_kwargs` which never propagate); the
        # posture comes entirely from the agent instance.
        raw_rows = Parse::MongoDB.aggregate(
          class_name, translated, **mongo_direct_auth_kwargs(agent),
        )
        # Parse Server REST aggregate returns docs with BSON-native
        # types already converted to Parse/JSON-friendly shapes. The
        # direct driver returns raw BSON, so run each row through the
        # same converter so the downstream code (redact, compact)
        # sees identical shapes regardless of route.
        raw_rows.map { |raw| Parse::MongoDB.convert_aggregation_document(raw) }
      else
        response = agent.client.aggregate_pipeline(
          class_name, effective_pipeline, **agent.request_opts,
        )
        unless response.success?
          raise "Aggregation failed: #{response.error}"
        end
        response.results
      end

    # Defense-in-depth: scrub nested objects whose className is hidden.
    # Catches server-side $lookup output that leaked through despite
    # enforce_pipeline_access_policy! (e.g., dynamic `from:` values).
    # This MUST run before compact_pointers! so that hidden-class
    # `_p_*` strings are scrubbed (to a `{className:, __redacted:}`
    # placeholder) and not silently surfaced into the pointer_classes
    # envelope map.
    results = redact_hidden_classes!(results, agent: agent)

    # Pointer-column compaction. Default-on: a typical aggregate over
    # a class with a high-cardinality pointer (e.g. author per row)
    # repeats the `_User$` className prefix on every value. Stripping
    # it and hoisting the class to a single envelope-level
    # `pointer_classes` map saves meaningful bytes per call without
    # losing information. Opt-out via `compact_pointers: false` when
    # the caller specifically needs the raw Parse-on-Mongo shape.
    pointer_map = compact_pointers ? compact_pointers!(results) : {}

    result = {
      class_name: class_name,
      pipeline_stages: pipeline.size,
      result_count: results.size,
      # Coerce to String here so the value lands in
      # `structuredContent` as a String (matching the
      # advertised output_schema `type: "string"`). Without the
      # `.to_s`, MCP clients validating structuredContent see a
      # Ruby Symbol pre-serialization and fail the type check;
      # downstream JSON serialization would convert it but the
      # client-side validator runs before that.
      route: (use_mongo_direct ? :mongo_direct : :parse_server).to_s,
      results: results,
    }
    result[:pointer_classes] = pointer_map if pointer_map.any?
    # Only surface the auto-limit hint when the cap actually fired
    # (the result hit the cap). A 1-row aggregation that happened to
    # not declare a terminal $limit doesn't benefit from the warning
    # and the hint text is ~200 bytes on every call.
    if auto_limited && results.size >= AGGREGATE_DEFAULT_LIMIT
      result[:auto_limited] = true
      result[:auto_limit]   = AGGREGATE_DEFAULT_LIMIT
      result[:hint]         = "Pipeline auto-bounded with $limit:#{AGGREGATE_DEFAULT_LIMIT} (no terminal $limit/$count supplied). " \
                              "Add an explicit { \"$limit\": N } stage at the end of your pipeline to control the cap, " \
                              "or call count_objects first to size the result before fetching rows."
    end
    result
  end
end

#call_method(agent, class_name:, method_name:, object_id: nil, arguments: nil, **_kwargs) ⇒ Hash

Call an agent-allowed method on a Parse class

Parameters:

  • agent (Parse::Agent)

    the agent instance

  • class_name (String)

    the Parse class name

  • method_name (String)

    the name of the method to call

  • object_id (String, nil) (defaults to: nil)

    object ID for instance methods

  • arguments (Hash) (defaults to: nil)

    method arguments

Returns:

  • (Hash)

    method result



4821
4822
4823
4824
4825
4826
4827
4828
4829
4830
4831
4832
4833
4834
4835
4836
4837
4838
4839
4840
4841
4842
4843
4844
4845
4846
4847
4848
4849
4850
4851
4852
4853
4854
4855
4856
4857
4858
4859
4860
4861
4862
4863
4864
4865
4866
4867
4868
4869
4870
4871
4872
4873
4874
4875
4876
4877
4878
4879
4880
4881
4882
4883
4884
4885
4886
4887
4888
4889
4890
4891
4892
4893
4894
4895
4896
4897
4898
4899
4900
4901
4902
4903
4904
4905
4906
4907
4908
4909
4910
4911
4912
4913
4914
4915
4916
4917
4918
4919
4920
4921
4922
4923
4924
4925
4926
4927
4928
4929
4930
4931
4932
4933
4934
4935
4936
4937
4938
4939
4940
4941
4942
4943
4944
4945
4946
4947
4948
4949
4950
4951
4952
4953
4954
4955
4956
4957
4958
4959
4960
4961
4962
4963
4964
4965
4966
4967
4968
4969
4970
4971
4972
4973
4974
4975
4976
4977
4978
4979
4980
4981
4982
4983
4984
4985
4986
4987
4988
4989
4990
4991
4992
4993
4994
4995
4996
4997
4998
4999
5000
5001
5002
5003
5004
5005
5006
5007
5008
5009
5010
5011
5012
5013
5014
5015
5016
5017
5018
5019
5020
5021
5022
5023
5024
5025
5026
# File 'lib/parse/agent/tools.rb', line 4821

def call_method(agent, class_name:, method_name:, object_id: nil, arguments: nil, **_kwargs)
  # Bypass-fix (v4.1.x security): every other tool entry calls this
  # guard, but call_method previously skipped it — a hidden class that
  # also declared an agent_method could be reached here. The CLP `op:`
  # is wired AFTER we know the method's permission tier (below) so
  # we can pass the right operation to the gate; the first call here
  # checks class-visibility / class-allowlist only.
  assert_class_accessible!(class_name, agent: agent)
  # NEW-TOOLS-9: method_name and object_id format validation.
  assert_method_name!(method_name)
  assert_object_id!(object_id)
  klass = Parse::Model.find_class(class_name)
  raise "Class not found: #{class_name}" unless klass

  method_sym = method_name.to_sym

  # Check if method is agent-allowed
  unless klass.respond_to?(:agent_method_allowed?) && klass.agent_method_allowed?(method_sym)
    raise "Method '#{method_name}' is not agent-allowed on #{class_name}. " \
          "Only methods marked with agent_method, agent_readonly, agent_write, or agent_admin can be called."
  end

  # Check permission level
  unless klass.agent_can_call?(method_sym, agent.permissions)
    method_info = klass.agent_method_info(method_sym)
    required = method_info[:permission] || :readonly
    raise "Permission denied: '#{method_name}' requires #{required} permissions. " \
          "Current level: #{agent.permissions}"
  end

  # Per-instance `methods:` filter. Applied after declaration check,
  # tier check, and class-accessibility check — i.e. only as a final
  # narrowing of an otherwise permitted invocation. Filter cannot
  # elevate: a method not declared via `agent_method` is never
  # reachable regardless of the filter.
  if agent.method_filtered?(method_sym, class_name: class_name)
    raise Parse::Agent::MethodFiltered,
          "Method '#{class_name}.#{method_name}' is not enabled for this agent instance " \
          "(excluded by the configured methods: filter)."
  end

  method_info = klass.agent_method_info(method_sym)

  # Operator-level env-gate for write/admin methods invoked through
  # call_method. Mirrors the gate in Parse::Agent#execute that
  # protects the direct create/update/delete tools: even if the
  # agent permission level allows it, the env var must also be set.
  required_perm = method_info[:permission] || :readonly
  case required_perm
  when :write
    unless Parse::Agent.write_tools_enabled?
      raise Parse::Agent::AccessDenied.new(
              class_name,
              "Write methods are disabled. " \
              "Set PARSE_AGENT_ALLOW_WRITE_TOOLS=true on the server to call '#{class_name}.#{method_name}'.",
            )
    end
  when :admin
    unless Parse::Agent.schema_ops_enabled?
      raise Parse::Agent::AccessDenied.new(
              class_name,
              "Admin methods are disabled. " \
              "Set PARSE_AGENT_ALLOW_SCHEMA_OPS=true on the server to call '#{class_name}.#{method_name}'.",
            )
    end
  end

  # Class-Level Permissions gate for call_method. Maps the
  # method's declared permission tier onto the corresponding
  # CLP operation so a class whose CLP doesn't grant write /
  # delete to the agent's scope can't be reached via a
  # `:write` / `:admin` agent_method. `:readonly` methods are
  # checked against CLP `:find` because they're typically
  # query-style code paths. The developer's method body
  # remains responsible for any internal queries it runs —
  # this gate is the BOUNDARY check at the method-name level.
  if agent.respond_to?(:acl_permission_strings)
    clp_op =
      case required_perm
      when :admin   then :delete
      when :write   then :update
      else               :find
      end
    perms = agent.acl_permission_strings
    unless Parse::CLPScope.permits?(class_name.to_s, clp_op, perms)
      raise Parse::Agent::AccessDenied.new(
        class_name,
        "Class '#{class_name}' CLP refuses #{clp_op} (mapped from " \
        "#{required_perm} agent_method) for the agent's scope.",
        kind: :clp_denied,
      )
    end
  end
  args = arguments || {}
  args = args.transform_keys(&:to_sym) if args.is_a?(Hash)

  # Apply mass-assignment guards. Always-denied keys come first:
  # +_hashed_password+, +authData+, +_session_token+, +sessionToken+,
  # +ACL+, +objectId+, +id+, +username+, +_rperm+, +_wperm+,
  # +_perishable_token+, +_email_verify_token+, +createdAt+,
  # +updatedAt+, +className+, +__type+. These can never flow
  # through +call_method+ regardless of +permitted_keys+. Then if
  # the method declared +permitted_keys+, the remaining args are
  # intersected with that allowlist.
  if args.is_a?(Hash) && !args.empty?
    violated = args.each_key.select { |k| CALL_METHOD_DENIED_KEYS.include?(k) }
    unless violated.empty?
      raise Parse::Agent::ValidationError,
            "Method '#{method_name}' cannot accept arguments " \
            "#{violated.inspect}; these keys reference protected " \
            "Parse columns (password hashes, ACL, auth data, " \
            "identifiers, timestamps) and are never permitted via " \
            "call_method regardless of permitted_keys."
    end
    if method_info[:permitted_keys]
      permitted = method_info[:permitted_keys] + AGENT_METHOD_RESERVED_ARG_KEYS
      extra = args.each_key.reject { |k| permitted.include?(k) }
      unless extra.empty?
        raise Parse::Agent::ValidationError,
              "Method '#{method_name}' does not permit arguments " \
              "#{extra.inspect}. Permitted keys: " \
              "#{method_info[:permitted_keys].inspect}."
      end
    end
  end

  # Dry-run handling. Two paths:
  #
  # 1. The method declared +supports_dry_run: true+ — keep the
  #    +dry_run+ flag in +args+ so the method body can branch on
  #    it and produce its own preview (e.g., a structural diff,
  #    pre-flight counters). The method controls the contract.
  #
  # 2. The method DID NOT declare dry-run support but the caller
  #    asked for one anyway. Previously we refused with a
  #    ValidationError. The MCP-caller perspective on that
  #    refusal is "I can't safely preview anything that wasn't
  #    pre-blessed by the author," which is unhelpful — the gate
  #    layer can ALWAYS safely report what the call WOULD do
  #    (permission tier resolved, args validated, object resolved
  #    when needed) without invoking the method itself. We now
  #    return a structural preview envelope and the +dry_run+ arg
  #    is stripped before any execution path. The preview is
  #    explicitly flagged +supports_real_dry_run: false+ so a
  #    consumer knows the method body wasn't consulted and a real
  #    dry-run (with method-author logic) isn't available.
  dry_run_requested = args.key?(:dry_run) && args[:dry_run]
  if dry_run_requested && !method_info[:supports_dry_run]
    preview_args = args.reject { |k, _| k == :dry_run }
    # Best-effort object resolution: an instance method's dry-run
    # should fail loudly if the targeted object doesn't exist,
    # since the same lookup will fail at execution time.
    if method_info[:type] == :instance
      raise "object_id required for instance method '#{method_name}'" unless object_id
      obj_exists = !klass.find(object_id).nil?
      unless obj_exists
        raise "Object not found: #{class_name}##{object_id}"
      end
    end
    return {
      class_name:             class_name,
      method:                 method_name,
      object_id:              object_id,
      dry_run:                true,
      supports_real_dry_run:  false,
      would_call: {
        class:     class_name,
        method:    method_name,
        type:      method_info[:type]&.to_s,
        object_id: object_id,
        args:      preview_args,
      },
      note: "The method '#{class_name}.#{method_name}' did not declare supports_dry_run: true, so no method-side preview is available. " \
            "This response confirms the call would pass the permission/args/object gates the agent enforces; the method body was NOT invoked. " \
            "Remove dry_run to execute the operation for real.",
    }
  end

  # If the method didn't declare dry-run support and the caller
  # passed a falsy +dry_run+ (or sent the key without it being
  # truthy), strip it before forwarding — the method body has no
  # idea about +dry_run+ and would raise ArgumentError on the
  # unexpected kwarg.
  if args.key?(:dry_run) && !method_info[:supports_dry_run]
    args = args.reject { |k, _| k == :dry_run }
  end

  # Execute with timeout - user methods could be slow
  with_timeout(:call_method) do
    result = if method_info[:type] == :instance
        raise "object_id required for instance method '#{method_name}'" unless object_id
        obj = klass.find(object_id)
        raise "Object not found: #{class_name}##{object_id}" unless obj
        call_with_args(obj, method_sym, args, agent: agent)
      else
        call_with_args(klass, method_sym, args, agent: agent)
      end

    {
      class_name: class_name,
      method: method_name,
      object_id: object_id,
      result: serialize_result(result, agent: agent),
    }
  end
end

#count_objects(agent, class_name:, where: nil, apply_canonical_filter: true, **_kwargs) ⇒ Hash

Count objects in a Parse class

Parameters:

  • agent (Parse::Agent)

    the agent instance

  • class_name (String)

    the Parse class name

  • where (Hash) (defaults to: nil)

    query constraints

Returns:

  • (Hash)

    count result



3078
3079
3080
3081
3082
3083
3084
3085
3086
3087
3088
3089
3090
3091
3092
3093
3094
3095
3096
3097
3098
3099
3100
3101
3102
3103
3104
3105
3106
3107
3108
3109
3110
3111
# File 'lib/parse/agent/tools.rb', line 3078

def count_objects(agent, class_name:, where: nil, apply_canonical_filter: true, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :count)
  # Tenant scope enforcement. TRACK-AGENT-7 split: per-agent filter is
  # UNCONDITIONAL, canonical filter is LLM-controllable.
  scope           = resolve_tenant_scope!(agent, class_name)
  effective_where = apply_tenant_scope_to_where(where, scope, class_name)
  effective_where = apply_per_agent_filter_to_where(effective_where, class_name, agent: agent)
  effective_where = apply_canonical_filter_to_where(effective_where, class_name, agent: agent) if apply_canonical_filter

  query = { limit: 0, count: 1 }

  translated_where = nil
  if effective_where && !effective_where.empty?
    translated_where = ConstraintTranslator.translate(effective_where, agent)
    query[:where] = translated_where.to_json
  end

  count =
    if agent.respond_to?(:acl_scope_requires_direct?) && agent.acl_scope_requires_direct?
      execute_count_via_direct(agent, class_name, where: translated_where)
    else
      response = agent.client.find_objects(class_name, query, **agent.request_opts)
      unless response.success?
        raise "Count failed: #{response.error}"
      end
      response.count
    end

  {
    class_name: class_name,
    count: count,
    constraints: effective_where || {},
  }
end

#definitions(allowed_tools, format: :openai, category: nil) ⇒ Array<Hash>

Get tool definitions for allowed tools, merging registered definitions.

Parameters:

  • allowed_tools (Array<Symbol>)

    list of tool names to include

  • format (Symbol) (defaults to: :openai)

    output format (:openai or :mcp)

  • category (String, Symbol, nil) (defaults to: nil)

    optional category filter applied AFTER the permission-based allowlist. Case-insensitive. When nil, no category filtering is applied (current behavior).

Returns:



1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
# File 'lib/parse/agent/tools.rb', line 1347

def definitions(allowed_tools, format: :openai, category: nil)
  # Build a merged definition map: builtins first, registered on top
  registered_defs = REGISTRY_MUTEX.synchronize do
    @registry.transform_values { |entry| entry[:definition] }
  end

  defs = allowed_tools.filter_map do |tool_name|
    sym = tool_name.to_sym
    registered_defs[sym] || TOOL_DEFINITIONS[sym]
  end

  if category
    want = category.to_s.downcase
    defs = defs.select { |d| d[:category].to_s.downcase == want }
  end

  case format
  when :mcp
    defs.map { |d| to_mcp_format(d) }
  else
    defs.map { |d| { type: "function", function: d } }
  end
end

#distinct(agent, class_name:, field:, where: nil, sort: nil, limit: nil, dry_run: false, apply_canonical_filter: true, **_kwargs) ⇒ Object

Return distinct values of a field. See TOOL_DEFINITIONS[:distinct].



3860
3861
3862
3863
3864
3865
3866
3867
3868
3869
3870
3871
3872
3873
3874
3875
3876
3877
3878
3879
3880
3881
3882
3883
3884
3885
3886
3887
3888
3889
3890
3891
3892
3893
3894
3895
3896
3897
3898
3899
3900
3901
3902
3903
3904
3905
3906
3907
3908
3909
3910
3911
3912
3913
3914
3915
3916
3917
3918
3919
3920
3921
3922
3923
3924
3925
3926
3927
# File 'lib/parse/agent/tools.rb', line 3860

def distinct(agent, class_name:, field:, where: nil, sort: nil, limit: nil,
             dry_run: false, apply_canonical_filter: true, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  validated_field = validate_group_field!(field, name: :field)
  cap = clamp_group_limit(limit, default: DISTINCT_DEFAULT_LIMIT, max: DISTINCT_MAX_LIMIT)
  sort_choice = case sort.to_s
                when "asc", "desc" then sort.to_s
                when "", "none", nil then nil
                else raise Parse::Agent::ValidationError,
                           "Invalid sort #{sort.inspect}. Must be 'asc' or 'desc'."
                end

  assert_fields_in_allowlist!(class_name, [validated_field])
  assert_where_fields_in_allowlist!(class_name, where)

  formatted_field = resolve_aggregation_field(class_name, validated_field)
  pipeline = build_group_pipeline(
    where: where,
    group_field: formatted_field,
    flatten_arrays: false,
    accumulator_op: nil,           # distinct has no accumulator
    value_field: nil,
    agent: agent,
    operation: "distinct",
  )
  # Map distinct's asc/desc to the shared sort vocabulary used by
  # append_sort_limit! (which expects key_*/value_*). Distinct has
  # no value column to sort on.
  wire_sort = case sort_choice
              when "asc"  then "key_asc"
              when "desc" then "key_desc"
              end
  append_sort_limit!(pipeline, sort_choice: wire_sort, cap: cap, default_sort: nil)

  return dry_run_envelope(class_name: class_name, pipeline: pipeline, params: {
    field: validated_field, sort: sort_choice, limit: cap,
  }) if dry_run

  result = run_aggregation_for_group_tool!(
    agent,
    class_name: class_name,
    pipeline: pipeline,
    tool: :distinct,
    apply_canonical_filter: apply_canonical_filter,
  )
  return result if result.is_a?(Hash) && result[:refused]

  values = result[:rows].map { |row| row["objectId"] }
  pointer_class, paired = extract_pointer_class!(values.map { |v| [v, nil] })
  # H2: redact keys and suppress pointer_class when the resolved class
  # is agent_hidden (same pattern as group_by).
  pointer_class, paired = redact_hidden_pointer_groups!(pointer_class, paired, agent: agent)
  values = paired.map(&:first)
  truncated = values.size > cap
  values = values.first(cap) if truncated

  envelope = {
    class_name: class_name,
    field: validated_field,
    count: values.size,
    values: values,
  }
  envelope[:pointer_class] = pointer_class if pointer_class
  envelope[:sort]          = sort_choice   if sort_choice
  envelope[:truncated]     = true          if truncated
  envelope[:limit]         = cap
  envelope
end

#explain_query(agent, class_name:, where: nil, **_kwargs) ⇒ Hash

Explain a query's execution plan

Parameters:

  • agent (Parse::Agent)

    the agent instance

  • class_name (String)

    the Parse class name

  • where (Hash) (defaults to: nil)

    query constraints

Returns:

  • (Hash)

    query explanation



4766
4767
4768
4769
4770
4771
4772
4773
4774
4775
4776
4777
4778
4779
4780
4781
4782
4783
4784
4785
4786
4787
4788
4789
4790
4791
4792
4793
4794
4795
4796
4797
4798
4799
4800
4801
4802
4803
4804
4805
4806
4807
# File 'lib/parse/agent/tools.rb', line 4766

def explain_query(agent, class_name:, where: nil, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  # No direct-MongoDB equivalent of Parse Server's REST explain
  # plan exists today, and routing this through master-key REST
  # under an acl_user/acl_role agent would silently bypass the
  # scope. Refuse explicitly; the LLM can fall back to running
  # the query through a session_token-bound or master-key agent
  # when explain output is genuinely needed.
  if agent.respond_to?(:acl_scope_requires_direct?) && agent.acl_scope_requires_direct?
    raise Parse::Agent::ValidationError,
          "explain_query is not available under acl_user / acl_role scope. " \
          "Parse Server's REST explain endpoint has no mongo-direct " \
          "equivalent, and routing the call through master-key REST " \
          "would bypass the agent's declared scope. Re-run the explain " \
          "from a session_token-bound or master-key agent."
  end
  query = { explain: true, limit: 1 }

  # TRACK-AGENT-7 split: per-agent filter is UNCONDITIONAL.
  # H3: apply canonical filter so the explain plan reflects the query
  # that query_class actually executes (same soft-delete / valid-state
  # predicate). Without this, explain_query and query_class could report
  # different index usage for classes with agent_canonical_filter.
  effective_where = apply_per_agent_filter_to_where(where, class_name, agent: agent)
  effective_where = apply_canonical_filter_to_where(effective_where, class_name, agent: agent)

  if effective_where && !effective_where.empty?
    query[:where] = ConstraintTranslator.translate(effective_where, agent).to_json
  end

  response = agent.client.find_objects(class_name, query, **agent.request_opts)

  unless response.success?
    raise "Explain failed: #{response.error}"
  end

  {
    class_name: class_name,
    constraints: where || {},
    explanation: response.result,
  }
end

#export_data(agent, class_name:, where: nil, keys: nil, include: nil, order: nil, limit: nil, skip: nil, pipeline: nil, columns: nil, format: "csv", row_cap: nil, **_kwargs) ⇒ Object



4411
4412
4413
4414
4415
4416
4417
4418
4419
4420
4421
4422
4423
4424
4425
4426
4427
4428
4429
4430
4431
4432
4433
4434
4435
4436
4437
4438
4439
4440
4441
4442
4443
4444
4445
4446
4447
4448
4449
4450
4451
4452
4453
4454
4455
4456
4457
4458
4459
4460
4461
4462
4463
4464
4465
4466
4467
4468
4469
4470
4471
4472
4473
# File 'lib/parse/agent/tools.rb', line 4411

def export_data(agent, class_name:, where: nil, keys: nil, include: nil,
                       order: nil, limit: nil, skip: nil, pipeline: nil,
                       columns: nil, format: "csv", row_cap: nil, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  # Tenant scope enforcement: resolve once here and pass to mode helpers.
  scope = resolve_tenant_scope!(agent, class_name)

  format_s = format.to_s.downcase
  unless %w[csv markdown table].include?(format_s)
    raise Parse::Agent::ValidationError,
          "format must be one of: csv, markdown, table (got #{format.inspect})"
  end

  effective_cap = if row_cap.nil?
      DEFAULT_EXPORT_ROW_CAP
    else
      [row_cap.to_i, MAX_EXPORT_ROW_CAP].min.tap do |c|
        raise Parse::Agent::ValidationError, "row_cap must be positive" if c < 1
      end
    end

  rows = if pipeline
      export_via_aggregate(agent, class_name: class_name, pipeline: pipeline, scope: scope)
    else
      export_via_query(agent, class_name: class_name, where: where, keys: keys,
                              include: include, order: order, limit: limit, skip: skip,
                              scope: scope)
    end

  available_rows = rows.size
  truncated      = available_rows > effective_cap
  rows           = rows.first(effective_cap) if truncated

  column_specs   = normalize_export_columns(columns, rows.first)
  headers        = column_specs.map { |spec| spec[:header] }
  extracted_rows = rows.map do |row|
    column_specs.map { |spec| stringify_export_value(extract_export_value(row, spec[:path])) }
  end

  output = case format_s
    when "csv"      then format_export_csv(headers, extracted_rows)
    when "markdown" then format_export_markdown(headers, extracted_rows)
    when "table"    then format_export_text_table(headers, extracted_rows)
    end

  result = {
    class_name: class_name,
    format:     format_s,
    headers:    headers,
    row_count:  extracted_rows.size,
    output:     output,
  }
  if truncated
    result[:truncated]      = true
    result[:available_rows] = available_rows
    result[:row_cap]        = effective_cap
    result[:hint]           = "Output truncated at row_cap=#{effective_cap} of #{available_rows} available rows. " \
                              "Narrow with where:/pipeline filters, or set row_cap: explicitly (max #{MAX_EXPORT_ROW_CAP}). " \
                              "For full exports of larger sets use the operator-facing rake mcp:tool[export_data,...] " \
                              "directly rather than reading the rows back through the LLM."
  end
  result
end

#get_all_schemas(agent, names: nil, prefix: nil, **_kwargs) ⇒ Hash

Get all schemas from the Parse server

Parameters:

Returns:

  • (Hash)

    formatted schema information



1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
# File 'lib/parse/agent/tools.rb', line 1399

def get_all_schemas(agent, names: nil, prefix: nil, **_kwargs)
  response = agent.client.schemas(agent.request_opts)

  unless response.success?
    raise "Failed to fetch schemas: #{response.error}"
  end

  # response.result is already the results array (Parse::Response extracts it)
  schemas = response.results

  # Filter out classes marked agent_hidden — those are denied at every
  # tool surface (see Parse::Agent::MetadataDSL#agent_hidden). This is
  # the catalog filter; the per-call denial happens in
  # assert_class_accessible! below.
  hidden = Parse::Agent::MetadataRegistry.hidden_class_names
  schemas = schemas.reject { |s| hidden.include?(s["className"]) } unless hidden.empty?

  # Per-agent `classes:` allowlist — omit catalog entries the agent
  # would be refused on. Without this, an agent constructed with
  # `classes: { only: [Post, Topic] }` would still see _User, _Role,
  # etc. in get_all_schemas and waste a tool call discovering that
  # those classes are gated. The agent's class-filter predicate
  # canonicalizes both directions (Ruby class constants in the
  # operator's only/except set vs `className` Strings from the
  # server response) so the comparison is symmetric.
  if agent && !agent.class_filter_only.nil?
    schemas = schemas.select { |s| agent.class_filter_permits?(s["className"]) }
  elsif agent && !agent.class_filter_except.nil?
    schemas = schemas.reject { |s| !agent.class_filter_permits?(s["className"]) }
  end

  # Caller-supplied filters (NEW-TOOLS-9). Applied AFTER hidden-class
  # filtering so an attacker cannot probe for hidden classes via
  # `names: ["_HiddenAdmin"]` — the entry has already been stripped.
  # `names:` is an exact-match set; `prefix:` is a case-sensitive
  # leading-substring filter. Both nil/empty means no filter.
  if names.respond_to?(:any?) && names.any?
    name_set = names.map(&:to_s)
    schemas  = schemas.select { |s| name_set.include?(s["className"]) }
  end
  if prefix.is_a?(String) && !prefix.empty?
    schemas = schemas.select { |s| s["className"].to_s.start_with?(prefix) }
  end

  # Enrich with local model metadata (descriptions, agent methods)
  enriched = MetadataRegistry.enriched_schemas(schemas, agent_permission: agent.permissions)

  ResultFormatter.format_schemas(enriched)
end

#get_object(agent, class_name:, object_id:, include: nil, apply_canonical_filter: true, **_kwargs) ⇒ Hash

Get a single object by ID

Parameters:

  • agent (Parse::Agent)

    the agent instance

  • class_name (String)

    the Parse class name

  • object_id (String)

    the objectId

  • include (Array<String>) (defaults to: nil)

    pointer fields to include

Returns:

  • (Hash)

    the object data

Raises:



3121
3122
3123
3124
3125
3126
3127
3128
3129
3130
3131
3132
3133
3134
3135
3136
3137
3138
3139
3140
3141
3142
3143
3144
3145
3146
3147
3148
3149
3150
3151
3152
3153
3154
3155
3156
3157
3158
3159
3160
3161
3162
3163
3164
3165
3166
3167
3168
3169
3170
3171
3172
3173
3174
3175
3176
3177
3178
3179
3180
3181
3182
3183
3184
3185
3186
3187
3188
3189
3190
3191
3192
3193
3194
3195
3196
3197
3198
3199
3200
3201
3202
3203
3204
3205
3206
3207
3208
3209
3210
3211
3212
3213
3214
3215
3216
3217
3218
3219
3220
3221
3222
3223
3224
3225
3226
3227
3228
3229
3230
3231
3232
3233
3234
3235
3236
3237
3238
3239
3240
# File 'lib/parse/agent/tools.rb', line 3121

def get_object(agent, class_name:, object_id:, include: nil,
               apply_canonical_filter: true, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :get)
  assert_object_id!(object_id)
  # Resolve tenant scope early so we can verify after fetch.
  # We do NOT inject it into the query (there is no where: on a fetch-by-id)
  # — instead we verify the returned record post-fetch to avoid becoming
  # an oracle for "does this id exist in another tenant".
  scope = resolve_tenant_scope!(agent, class_name)

  # TRACK-AGENT-6 / TRACK-AGENT-7 split: per-agent filter is
  # UNCONDITIONAL (operator scoping cannot be dropped via the
  # LLM kwarg). Canonical filter is LLM-controllable via
  # `apply_canonical_filter:`; default true so the same
  # "valid state" subset query_class returns is honored on
  # direct fetch-by-id. A caller that legitimately needs to
  # fetch a soft-deleted / hidden row can pass
  # apply_canonical_filter: false.
  #
  # When ANY filter applies, the call is rewritten to a
  # find_objects with `where: { objectId: id, ...filter }` so
  # enforcement is server-side; the "not found" envelope covers
  # both "doesn't exist" and "filtered out", consistent with the
  # existing tenant-scope refusal pattern that prefers fail-
  # closed over an existence oracle.
  composed = nil
  composed = apply_per_agent_filter_to_where(composed, class_name, agent: agent)
  composed = apply_canonical_filter_to_where(composed, class_name, agent: agent) if apply_canonical_filter
  composed_filter = composed && !composed.empty? ? composed : nil

  query = {}
  include = validate_include!(include)
  assert_include_paths_accessible!(class_name, include, agent: agent)
  query[:include] = include.join(",") if include&.any?

  # Project to the agent_fields allowlist when one is declared
  allowlist = MetadataRegistry.field_allowlist(class_name)
  effective_keys = allowlist&.any? ? allowlist.dup : nil
  # keys-on-include auto-projection: when the parent's allowlist
  # includes a pointer field that the caller is dereferencing via
  # `include:`, narrow the included record to its agent_join_fields
  # (or agent_fields - agent_large_fields) so the LLM doesn't pay
  # for the entire row on the join.
  include_projection = apply_include_projection(class_name, effective_keys, include)
  effective_keys = include_projection[:effective_keys]
  truncated_includes = include_projection[:truncated]
  query[:keys] = effective_keys.join(",") if effective_keys&.any?

  if composed_filter
    # Compose the objectId match into the filter; a hit returns exactly
    # one row, a filtered-out match returns zero rows (treated as not-
    # found below, identical to the genuine missing-row case).
    combined_where =
      if composed_filter.is_a?(Hash) && composed_filter.key?("$and")
        { "$and" => composed_filter["$and"] + [{ "objectId" => object_id }] }
      else
        { "$and" => [composed_filter, { "objectId" => object_id }] }
      end
    translated_combined = ConstraintTranslator.translate(combined_where, agent)
    rows =
      if agent.respond_to?(:acl_scope_requires_direct?) && agent.acl_scope_requires_direct?
        execute_find_via_direct(
          agent, class_name,
          where: translated_combined, limit: 1,
          keys: effective_keys, include: include,
        )
      else
        find_query = query.merge(where: translated_combined.to_json, limit: 1)
        response = agent.client.find_objects(class_name, find_query, **agent.request_opts)
        unless response.success?
          raise Parse::Error, "Fetch failed: #{response.error}"
        end
        response.results
      end
    if rows.nil? || rows.empty?
      raise Parse::Error, "Object not found: #{class_name}##{object_id}"
    end
    response_result = rows.first
  elsif agent.respond_to?(:acl_scope_requires_direct?) && agent.acl_scope_requires_direct?
    # No per-agent filter, but acl_user/acl_role scope: route the
    # fetch through mongo-direct with a `where: {objectId: id}`.
    # The three-layer ACL simulation in Parse::MongoDB.aggregate
    # ensures the row is only returned when the agent's scope
    # permits it.
    where_id = ConstraintTranslator.translate({ "objectId" => object_id }, agent)
    rows = execute_find_via_direct(
      agent, class_name,
      where: where_id, limit: 1,
      keys: effective_keys, include: include,
    )
    if rows.nil? || rows.empty?
      raise Parse::Error, "Object not found: #{class_name}##{object_id}"
    end
    response_result = rows.first
  else
    response = agent.client.fetch_object(class_name, object_id, query: query, **agent.request_opts)

    unless response.success?
      # Raise structured Parse::Error so the agent's error dispatch
      # preserves the "not found" message on the wire. A bare
      # RuntimeError falls into the generic StandardError catch-all and
      # gets sanitized to "internal error" — hiding a documented
      # user-error condition behind the internal-error mask.
      if response.object_not_found?
        raise Parse::Error, "Object not found: #{class_name}##{object_id}"
      end
      raise Parse::Error, "Fetch failed: #{response.error}"
    end
    response_result = response.result
  end

  result = redact_hidden_classes!(response_result, agent: agent)

  # Post-fetch tenant scope verification. Raises AccessDenied rather than
  # "not found" to avoid being an oracle for cross-tenant id existence.
  assert_record_in_tenant_scope!(result, scope, class_name)

  ResultFormatter.format_object(class_name, result,
                                 truncated_include_fields: truncated_includes)
end

#get_objects(agent, class_name:, ids: nil, include: [], apply_canonical_filter: true, **_kwargs) ⇒ Hash

Batch-fetch multiple Parse objects by id in a single query. Prefer this over multiple get_object calls when dereferencing pointers.

Parameters:

  • agent (Parse::Agent)

    the agent instance

  • class_name (String)

    the Parse class name

  • ids (Array<String>) (defaults to: nil)

    objectId values to fetch (max 50, dedup'd)

  • include (Array<String>) (defaults to: [])

    pointer fields to include/resolve

Returns:

  • (Hash)

    { class_name:, objects:, missing:, requested:, found: }

Raises:

  • (Parse::Agent::ValidationError)

    if class_name invalid, ids not an Array, any id has invalid format, or more than 50 unique ids are requested



3252
3253
3254
3255
3256
3257
3258
3259
3260
3261
3262
3263
3264
3265
3266
3267
3268
3269
3270
3271
3272
3273
3274
3275
3276
3277
3278
3279
3280
3281
3282
3283
3284
3285
3286
3287
3288
3289
3290
3291
3292
3293
3294
3295
3296
3297
3298
3299
3300
3301
3302
3303
3304
3305
3306
3307
3308
3309
3310
3311
3312
3313
3314
3315
3316
3317
3318
3319
3320
3321
3322
3323
3324
3325
3326
3327
3328
3329
3330
3331
3332
3333
3334
3335
3336
3337
3338
3339
3340
3341
3342
3343
3344
3345
3346
3347
3348
3349
3350
3351
3352
3353
3354
3355
3356
3357
3358
3359
3360
3361
3362
3363
3364
3365
3366
3367
3368
3369
3370
3371
3372
3373
3374
3375
3376
3377
3378
3379
3380
3381
3382
3383
3384
3385
3386
3387
# File 'lib/parse/agent/tools.rb', line 3252

def get_objects(agent, class_name:, ids: nil, include: [],
                apply_canonical_filter: true, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :get)
  # Resolve tenant scope early — verified post-fetch (oracle-prevention).
  # TRACK-AGENT-1 / TRACK-AGENT-6 / TRACK-AGENT-7 fix: per-agent
  # filter is UNCONDITIONALLY applied here (the operator's
  # per-agent scoping must not be bypassable via batch fetch-by-
  # id), and the canonical filter is LLM-controllable via
  # `apply_canonical_filter:` (default true) to mirror
  # get_object's semantics. Previously this tool silently
  # ignored BOTH filters, letting an LLM enumerate hidden /
  # archived rows by feeding harvested objectIds.
  scope = resolve_tenant_scope!(agent, class_name)

  # nil ids is an error (required parameter); empty array is a valid empty result
  if ids.nil?
    raise Parse::Agent::ValidationError, "ids is required"
  end

  unless ids.is_a?(Array)
    raise Parse::Agent::ValidationError, "ids must be an Array of Strings"
  end

  # Short-circuit on empty array — no query needed
  if ids.empty?
    return {
      class_name: class_name,
      objects: {},
      missing: [],
      requested: 0,
      found: 0,
    }
  end

  unique_ids = ids.uniq

  if unique_ids.size > 50
    raise Parse::Agent::ValidationError,
          "ids exceeds the 50-object limit (#{unique_ids.size} unique ids). " \
          "For larger sets use query_class with an $in constraint."
  end

  # Validate each id format using the shared OBJECT_ID_RE so
  # +get_objects+ accepts the same set of identifiers as
  # +get_object+ — apps with custom-id schemes that use hyphens or
  # underscores should not see one entry point accept the id and
  # another reject it.
  unique_ids.each do |id|
    unless id.is_a?(String) && OBJECT_ID_RE.match?(id)
      raise Parse::Agent::ValidationError,
            "each id must match #{OBJECT_ID_RE.source} (got: #{id.inspect})"
    end
  end

  # Compose where: { objectId: { $in: ids } } AND per-agent filter
  # AND (optionally) class-level canonical filter via $and so all
  # three layers reach the server in one query. Then route through
  # ConstraintTranslator so snake_case keys are camelized to Parse
  # Server wire format (mirrors count_objects / query_class).
  base_in_where    = { "objectId" => { "$in" => unique_ids } }
  composed         = apply_per_agent_filter_to_where(base_in_where, class_name, agent: agent)
  composed         = apply_canonical_filter_to_where(composed, class_name, agent: agent) if apply_canonical_filter
  translated_where = ConstraintTranslator.translate(composed, agent)

  # Build query
  query = {
    where: translated_where.to_json,
    limit: unique_ids.size,
  }
  include = validate_include!(include)
  assert_include_paths_accessible!(class_name, include, agent: agent)
  query[:include] = include.join(",") if include&.any?

  # Apply agent_fields allowlist as keys projection
  allowlist = MetadataRegistry.field_allowlist(class_name)
  effective_keys = allowlist&.any? ? allowlist.dup : nil
  # keys-on-include auto-projection: narrow included records to the
  # joined class's agent_join_fields (or agent_fields - large) when
  # the parent's allowlist surfaces a pointer the caller is
  # dereferencing via `include:`.
  include_projection = apply_include_projection(class_name, effective_keys, include)
  effective_keys = include_projection[:effective_keys]
  truncated_includes = include_projection[:truncated]
  query[:keys] = effective_keys.join(",") if effective_keys&.any?

  with_timeout(:get_objects) do
    rows =
      if agent.respond_to?(:acl_scope_requires_direct?) && agent.acl_scope_requires_direct?
        # Feed the translated (already-composed) where into the
        # direct-route helper. The $in constraint composes with
        # the per-agent / canonical filter via the same $and the
        # REST path emits, so the ACL scope's ALSO-applied _rperm
        # $match continues to layer on top correctly.
        execute_find_via_direct(
          agent, class_name,
          where: translated_where, limit: unique_ids.size,
          keys: effective_keys, include: include,
        )
      else
        response = agent.client.find_objects(class_name, query, **agent.request_opts)
        unless response.success?
          raise "Batch fetch failed: #{response.error}"
        end
        response.results
      end

    results = redact_hidden_classes!(rows, agent: agent)

    # Post-fetch tenant scope verification: refuse the whole call if ANY
    # returned record is outside this agent's tenant scope. This prevents
    # an oracle where partial results could confirm cross-tenant id existence.
    if scope
      results.each { |rec| assert_record_in_tenant_scope!(rec, scope, class_name) }
    end

    objects_by_id = results.each_with_object({}) do |obj, h|
      oid = obj.is_a?(Hash) ? (obj["objectId"] || obj[:objectId]) : obj.id
      h[oid] = obj
    end

    missing = unique_ids.reject { |id| objects_by_id.key?(id) }

    envelope = {
      class_name: class_name,
      objects: objects_by_id,
      missing: missing,
      requested: unique_ids.size,
      found: objects_by_id.size,
    }
    if truncated_includes && !truncated_includes.empty?
      envelope[:truncated_include_fields] =
        truncated_includes.transform_values { |meta| meta[:dropped] }
    end
    envelope
  end
end

#get_sample_objects(agent, class_name:, limit: nil, **_kwargs) ⇒ Hash

Get sample objects from a class

Parameters:

  • agent (Parse::Agent)

    the agent instance

  • class_name (String)

    the Parse class name

  • limit (Integer) (defaults to: nil)

    number of samples (default 5, max 20)

Returns:

  • (Hash)

    sample objects



3395
3396
3397
3398
3399
3400
3401
3402
3403
3404
3405
3406
3407
3408
3409
3410
3411
3412
3413
3414
3415
3416
3417
3418
3419
3420
3421
3422
3423
3424
3425
3426
3427
3428
3429
3430
3431
3432
3433
3434
3435
3436
3437
3438
3439
3440
3441
3442
3443
3444
3445
3446
3447
3448
3449
3450
3451
3452
# File 'lib/parse/agent/tools.rb', line 3395

def get_sample_objects(agent, class_name:, limit: nil, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  # Tenant scope enforcement: inject scope into where so samples are
  # always from the agent's tenant.
  scope = resolve_tenant_scope!(agent, class_name)

  limit = [limit || 5, 20].min

  query = {
    limit: limit,
    order: "-createdAt",
  }

  # Build effective where: combining tenant scope, per-agent filter, and
  # canonical filter.
  # H3: apply the class's canonical filter so sample rows are from the
  # same "valid state" subset that query_class returns by default.
  # Without this, get_sample_objects would surface soft-deleted / hidden
  # rows that every other read tool excludes.
  # TRACK-AGENT-7 split: per-agent filter is UNCONDITIONAL.
  effective_where = scope ? { scope[:field].to_s => scope[:value] } : nil
  effective_where = apply_per_agent_filter_to_where(effective_where, class_name, agent: agent)
  effective_where = apply_canonical_filter_to_where(effective_where, class_name, agent: agent)
  translated_where = nil
  if effective_where && !effective_where.empty?
    translated_where = ConstraintTranslator.translate(effective_where, agent)
    query[:where] = translated_where.to_json
  end

  # Project to the agent_fields allowlist when one is declared
  allowlist = MetadataRegistry.field_allowlist(class_name)
  query[:keys] = allowlist.join(",") if allowlist&.any?

  rows =
    if agent.respond_to?(:acl_scope_requires_direct?) && agent.acl_scope_requires_direct?
      execute_find_via_direct(
        agent, class_name,
        where: translated_where, limit: limit, order: "-createdAt",
        keys: allowlist&.any? ? allowlist : nil,
      )
    else
      response = agent.client.find_objects(class_name, query, **agent.request_opts)
      unless response.success?
        raise "Sample query failed: #{response.error}"
      end
      response.results
    end

  # Redact nested objects whose className is hidden — catches
  # anything the catalog filter and include-path resolver missed.
  results = redact_hidden_classes!(rows, agent: agent)
  {
    class_name: class_name,
    sample_count: results.size,
    samples: results.map { |obj| ResultFormatter.format_object(class_name, obj)[:object] },
    note: "These are the #{results.size} most recently created objects",
  }
end

#get_schema(agent, class_name:, **_kwargs) ⇒ Hash

Get schema for a specific class

Parameters:

  • agent (Parse::Agent)

    the agent instance

  • class_name (String)

    the Parse class name

Returns:

  • (Hash)

    formatted schema information



2883
2884
2885
2886
2887
2888
2889
2890
2891
2892
2893
2894
2895
# File 'lib/parse/agent/tools.rb', line 2883

def get_schema(agent, class_name:, **_kwargs)
  assert_class_accessible!(class_name, agent: agent)
  response = agent.client.schema(class_name)

  unless response.success?
    raise "Failed to fetch schema for '#{class_name}': #{response.error}"
  end

  # Enrich with local model metadata (descriptions, agent methods)
  enriched = MetadataRegistry.enriched_schema(class_name, response.result, agent_permission: agent.permissions)

  ResultFormatter.format_schema(enriched)
end

#group_by(agent, class_name:, field:, operation: nil, value_field: nil, where: nil, flatten_arrays: false, sort: nil, limit: nil, dry_run: false, apply_canonical_filter: true, **_kwargs) ⇒ Object

Group records by a field and aggregate. See TOOL_DEFINITIONS[:group_by].



3690
3691
3692
3693
3694
3695
3696
3697
3698
3699
3700
3701
3702
3703
3704
3705
3706
3707
3708
3709
3710
3711
3712
3713
3714
3715
3716
3717
3718
3719
3720
3721
3722
3723
3724
3725
3726
3727
3728
3729
3730
3731
3732
3733
3734
3735
3736
3737
3738
3739
3740
3741
3742
3743
3744
3745
3746
3747
3748
3749
3750
3751
3752
3753
3754
3755
3756
3757
3758
3759
3760
3761
3762
3763
3764
3765
# File 'lib/parse/agent/tools.rb', line 3690

def group_by(agent, class_name:, field:, operation: nil, value_field: nil,
             where: nil, flatten_arrays: false, sort: nil, limit: nil,
             dry_run: false, apply_canonical_filter: true, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  validated_field = validate_group_field!(field, name: :field)
  op_key, accumulator = resolve_group_operation!(operation, value_field)
  cap = clamp_group_limit(limit, default: GROUP_DEFAULT_LIMIT, max: GROUP_MAX_LIMIT)
  sort_choice = normalize_group_sort(sort)

  # Field-allowlist enforcement: the group field, the value field
  # (when present), and every key in `where:` must be within
  # agent_fields when an allowlist is declared. The aggregate
  # delegate's walker doesn't cover the where-keys here because
  # we translate them into $match ourselves.
  referenced = [validated_field]
  referenced << validate_group_field!(value_field, name: :value_field) if value_field
  assert_fields_in_allowlist!(class_name, referenced)
  assert_where_fields_in_allowlist!(class_name, where)

  formatted_group = resolve_aggregation_field(class_name, validated_field)
  formatted_value = value_field ? resolve_aggregation_field(class_name, validate_group_field!(value_field, name: :value_field)) : nil

  pipeline = build_group_pipeline(
    where: where,
    group_field: formatted_group,
    flatten_arrays: flatten_arrays,
    accumulator_op: accumulator,
    value_field: formatted_value,
    operation: op_key,
    agent: agent,
  )
  append_sort_limit!(pipeline, sort_choice: sort_choice, cap: cap, default_sort: nil)

  return dry_run_envelope(class_name: class_name, pipeline: pipeline, params: {
    field: validated_field, operation: op_key, value_field: value_field,
    flatten_arrays: flatten_arrays, sort: sort_choice, limit: cap,
  }) if dry_run

  result = run_aggregation_for_group_tool!(
    agent,
    class_name: class_name,
    pipeline: pipeline,
    tool: :group_by,
    apply_canonical_filter: apply_canonical_filter,
  )
  return result if result.is_a?(Hash) && result[:refused]

  # Parse Server's REST aggregate endpoint renames the $group _id field
  # to "objectId" in the response envelope, even when the value is a
  # plain string ("ios"), a pointer-storage string ("Class$id"), or a
  # document (date buckets). Read from "objectId"; nil there is a
  # legitimate "missing grouped value" and normalize_group_key handles it.
  groups = result[:rows].map { |row| [row["objectId"], row["value"]] }
  pointer_class, groups = extract_pointer_class!(groups)
  # H2: if the extracted pointer class is agent_hidden, redact the keys
  # so objectIds from the hidden class are not surfaced, and suppress
  # the pointer_class name from the envelope.
  pointer_class, groups = redact_hidden_pointer_groups!(pointer_class, groups, agent: agent)
  truncated = groups.size > cap
  groups = groups.first(cap) if truncated

  envelope = {
    class_name: class_name,
    field: validated_field,
    operation: op_key,
    group_count: groups.size,
    groups: groups.map { |k, v| { key: normalize_group_key(k), value: v } },
  }
  envelope[:value_field]    = value_field if value_field
  envelope[:pointer_class]  = pointer_class if pointer_class
  envelope[:flatten_arrays] = true if flatten_arrays
  envelope[:sort]           = sort_choice if sort_choice
  envelope[:truncated]      = true if truncated
  envelope[:limit]          = cap
  envelope
end

#group_by_date(agent, class_name:, field:, interval:, operation: nil, value_field: nil, where: nil, timezone: nil, sort: nil, limit: nil, dry_run: false, apply_canonical_filter: true, **_kwargs) ⇒ Object

Group records by a date field bucketed at an interval. See TOOL_DEFINITIONS[:group_by_date].



3769
3770
3771
3772
3773
3774
3775
3776
3777
3778
3779
3780
3781
3782
3783
3784
3785
3786
3787
3788
3789
3790
3791
3792
3793
3794
3795
3796
3797
3798
3799
3800
3801
3802
3803
3804
3805
3806
3807
3808
3809
3810
3811
3812
3813
3814
3815
3816
3817
3818
3819
3820
3821
3822
3823
3824
3825
3826
3827
3828
3829
3830
3831
3832
3833
3834
3835
3836
3837
3838
3839
3840
3841
3842
3843
3844
3845
3846
3847
3848
3849
3850
3851
3852
3853
3854
3855
3856
3857
# File 'lib/parse/agent/tools.rb', line 3769

def group_by_date(agent, class_name:, field:, interval:, operation: nil,
                  value_field: nil, where: nil, timezone: nil, sort: nil,
                  limit: nil, dry_run: false, apply_canonical_filter: true, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  validated_field    = validate_group_field!(field, name: :field)
  interval_sym       = validate_group_date_interval!(interval)
  op_key, accumulator = resolve_group_operation!(operation, value_field)
  cap                = clamp_group_limit(limit, default: GROUP_DEFAULT_LIMIT, max: GROUP_MAX_LIMIT)
  sort_choice        = normalize_group_sort(sort) || "key_asc"
  tz                 = validate_timezone!(timezone)

  referenced = [validated_field]
  referenced << validate_group_field!(value_field, name: :value_field) if value_field
  assert_fields_in_allowlist!(class_name, referenced)
  assert_where_fields_in_allowlist!(class_name, where)

  # Reject non-scalar field types that cannot meaningfully be grouped
  # by date. Pointer and array fields are registered in klass.fields;
  # relation fields (has_many through: :relation) are in klass.relations.
  # All three produce null-buckets silently in MongoDB, so we reject
  # them here before building the pipeline.
  gbd_klass = (Parse::Model.find_class(class_name) rescue nil) ||
              (Parse::Model.const_get(class_name) rescue nil)
  if gbd_klass && gbd_klass.respond_to?(:fields)
    field_type = gbd_klass.fields[validated_field.to_sym] || gbd_klass.fields[validated_field]
    is_relation = gbd_klass.respond_to?(:relations) &&
                  gbd_klass.relations.key?(validated_field.to_sym)
    if %i[pointer array].include?(field_type) || is_relation
      bad_type = field_type || :relation
      raise Parse::Agent::ValidationError,
            "group_by_date field '#{validated_field}' has type '#{bad_type}', " \
            "which cannot be grouped by date. Use a :date or :string scalar field."
    end
  end

  formatted_field = resolve_aggregation_field(class_name, validated_field, force_no_pointer: true)
  formatted_value = value_field ? resolve_aggregation_field(class_name, validate_group_field!(value_field, name: :value_field)) : nil

  date_expr = build_date_group_expression(formatted_field, interval_sym, tz)
  pipeline = build_group_pipeline(
    where: where,
    group_field: nil,
    group_expression: date_expr,
    flatten_arrays: false,
    accumulator_op: accumulator,
    agent: agent,
    value_field: formatted_value,
    operation: op_key,
  )
  append_sort_limit!(pipeline, sort_choice: sort_choice, cap: cap, default_sort: "key_asc")

  return dry_run_envelope(class_name: class_name, pipeline: pipeline, params: {
    field: validated_field, interval: interval_sym.to_s, operation: op_key,
    value_field: value_field, timezone: tz, sort: sort_choice, limit: cap,
  }) if dry_run

  result = run_aggregation_for_group_tool!(
    agent,
    class_name: class_name,
    pipeline: pipeline,
    tool: :group_by_date,
    apply_canonical_filter: apply_canonical_filter,
  )
  return result if result.is_a?(Hash) && result[:refused]

  groups = result[:rows].map { |row| [format_date_key(row["objectId"], interval_sym), row["value"]] }
  # Re-sort in Ruby for key-based sorts because date keys are
  # formatted strings post-fetch (the pipeline sorts on the raw
  # _id document/integer); for value-based sorts, the wire sort
  # already left rows in correct order.
  groups = sort_groups(groups, sort_choice) if sort_choice.start_with?("key_")
  truncated = groups.size > cap
  groups = groups.first(cap) if truncated

  envelope = {
    class_name: class_name,
    field: validated_field,
    interval: interval_sym.to_s,
    operation: op_key,
    group_count: groups.size,
    groups: groups.map { |k, v| { key: k, value: v } },
  }
  envelope[:value_field] = value_field if value_field
  envelope[:timezone]    = tz          if tz
  envelope[:sort]        = sort_choice
  envelope[:truncated]   = true        if truncated
  envelope[:limit]       = cap
  envelope
end

#list_tools(agent, category: nil, **_kwargs) ⇒ Object

Discovery: return a lightweight catalog of every tool this agent can execute, optionally filtered by category. Returns { tools: [{name, category, description}, ...], categories: {...} }. No input schemas, no permission tier — just enough to let an LLM decide which tool to drill into next via tools/list (which carries the full schemas).

The agent's allowed_tools allowlist is honored — list_tools never reveals a tool the caller's permission tier or tools: filter blocks.



2862
2863
2864
2865
2866
2867
2868
2869
2870
2871
2872
2873
2874
2875
2876
# File 'lib/parse/agent/tools.rb', line 2862

def list_tools(agent, category: nil, **_kwargs)
  defs = definitions(agent.allowed_tools, format: :openai, category: category)
  rows = defs.map do |entry|
    fn = entry[:function] || entry
    {
      name:        fn[:name],
      category:    fn[:category] || "custom",
      description: fn[:description],
    }
  end
  {
    tools:      rows,
    categories: BUILTIN_CATEGORIES,
  }
end

#query_class(agent, class_name:, where: nil, limit: nil, skip: nil, order: nil, keys: nil, include: nil, apply_canonical_filter: true, format: nil, **_kwargs) ⇒ Object



2920
2921
2922
2923
2924
2925
2926
2927
2928
2929
2930
2931
2932
2933
2934
2935
2936
2937
2938
2939
2940
2941
2942
2943
2944
2945
2946
2947
2948
2949
2950
2951
2952
2953
2954
2955
2956
2957
2958
2959
2960
2961
2962
2963
2964
2965
2966
2967
2968
2969
2970
2971
2972
2973
2974
2975
2976
2977
2978
2979
2980
2981
2982
2983
2984
2985
2986
2987
2988
2989
2990
2991
2992
2993
2994
2995
2996
2997
2998
2999
3000
3001
3002
3003
3004
3005
3006
3007
3008
3009
3010
3011
3012
3013
3014
3015
3016
3017
3018
3019
3020
3021
3022
3023
3024
3025
3026
3027
3028
3029
3030
3031
3032
3033
3034
3035
3036
3037
3038
3039
3040
3041
3042
3043
3044
# File 'lib/parse/agent/tools.rb', line 2920

def query_class(agent, class_name:, where: nil, limit: nil, skip: nil,
                       order: nil, keys: nil, include: nil,
                       apply_canonical_filter: true, format: nil, **_kwargs)
  assert_class_accessible!(class_name, agent: agent, op: :find)
  limit = [limit || Agent::DEFAULT_LIMIT, Agent::MAX_LIMIT].min

  # Tenant scope enforcement: resolve before any query building so that
  # the effective where (with scope injected) is what everything else sees.
  # TRACK-AGENT-7 split: per-agent filter is UNCONDITIONAL; canonical
  # filter remains LLM-controllable via apply_canonical_filter:.
  scope        = resolve_tenant_scope!(agent, class_name)
  effective_where = apply_tenant_scope_to_where(where, scope, class_name)
  effective_where = apply_per_agent_filter_to_where(effective_where, class_name, agent: agent)
  effective_where = apply_canonical_filter_to_where(effective_where, class_name, agent: agent) if apply_canonical_filter

  # COLLSCAN pre-flight check (Feature 3):
  # Only runs when refuse_collscan is enabled globally AND the class has
  # not opted out via agent_allow_collscan, AND where is non-empty.
  if effective_where && !effective_where.empty? &&
     Parse::Agent.refuse_collscan? &&
     !MetadataRegistry.allow_collscan?(class_name)

    refusal = collscan_preflight(agent, class_name, effective_where)
    return refusal if refusal
  end

  # Build query hash
  query = {}
  query[:limit] = limit
  query[:skip] = skip if skip && skip > 0
  query[:order] = order if order
  # Reconcile caller-supplied `keys:` with the model's agent_fields
  # allowlist. When an allowlist is declared, it is a security boundary
  # (not a hint) and caller-supplied keys must be intersected with it,
  # not replace it. Without this, the LLM passing
  # `keys: ["ssn", "name"]` against an allowlisted class bypasses the
  # field redaction. ALWAYS_KEEP_FIELDS (objectId / createdAt /
  # updatedAt) is always included so pointer dereferencing still works.
  # NEW-TOOLS-5: validate keys: against identifier regex before
  # the allowlist intersection. This refuses leading-underscore
  # names (_hashed_password, _session_token, _rperm, _wperm, etc.)
  # at the boundary, so they cannot leak through even on classes
  # WITHOUT an agent_fields allowlist.
  validated_keys = validate_keys!(keys)
  allowlist = MetadataRegistry.field_allowlist(class_name)
  caller_keys = validated_keys&.any? ? validated_keys : nil
  effective_keys =
    if allowlist && allowlist.any?
      permitted = allowlist.map(&:to_s) | MetadataRegistry::ALWAYS_KEEP_FIELDS
      caller_keys ? (caller_keys & permitted) : allowlist.map(&:to_s)
    else
      caller_keys
    end
  include = validate_include!(include)
  # Refuse include paths that resolve into a hidden class via
  # belongs_to / has_one reflection. Best-effort; the post-fetch
  # redact_hidden_classes! walker (below) catches anything we can't
  # resolve at request time.
  assert_include_paths_accessible!(class_name, include, agent: agent)
  # keys-on-include auto-projection. When the caller passed
  # `keys: ["user", ...] + include: ["user"]`, expand `keys` to
  # dotted-path projections of the joined class so Parse Server
  # narrows the included record to its agent_join_fields (or
  # agent_fields - agent_large_fields) instead of serializing the
  # entire row. Suppressed when the caller passes any `<pointer>.*`
  # dotted path themselves or when no `keys:` was passed.
  projection = apply_include_projection(class_name, effective_keys, include)
  effective_keys = projection[:effective_keys]
  truncated_includes = projection[:truncated]
  query[:keys] = effective_keys.join(",") if effective_keys&.any?
  query[:include] = include.join(",") if include&.any?

  # SECURITY: Constraint validation happens in ConstraintTranslator.translate
  # This blocks dangerous operators like $where, $function
  translated_where = nil
  if effective_where && !effective_where.empty?
    translated_where = ConstraintTranslator.translate(effective_where, agent)
    query[:where] = translated_where.to_json
  end

  # Validate `format:` early so a bad value fails at the boundary
  # before we round-trip to Parse Server. nil and "json" both mean
  # the default structured envelope.
  normalized_format = format&.to_s&.downcase
  if normalized_format && !QUERY_CLASS_FORMATS.include?(normalized_format)
    raise Parse::Agent::ValidationError,
          "format: must be one of #{QUERY_CLASS_FORMATS.inspect}, got #{format.inspect}"
  end

  with_timeout(:query_class) do
    results =
      if agent.respond_to?(:acl_scope_requires_direct?) && agent.acl_scope_requires_direct?
        # Auto-route through Parse::MongoDB.aggregate so ACLScope's
        # `_rperm` $match injection runs — REST find_objects has
        # no "act as role" affordance for acl_user/acl_role agents.
        execute_find_via_direct(
          agent, class_name,
          where: translated_where, limit: limit, skip: skip,
          order: order, keys: effective_keys, include: include,
        )
      else
        response = agent.client.find_objects(class_name, query, **agent.request_opts)
        unless response.success?
          raise "Query failed: #{response.error}"
        end
        response.results
      end
    # Defense-in-depth: scrub any embedded objects whose className
    # matches a hidden class, regardless of how they got into the
    # result (e.g., server-side $lookup output via raw include paths
    # we couldn't resolve through belongs_to reflection at request
    # time).
    results = redact_hidden_classes!(results, agent: agent)

    if normalized_format && normalized_format != "json"
      format_query_results_as(normalized_format, class_name, results)
    else
      ResultFormatter.format_query_results(class_name, results,
                                           limit: limit, skip: skip || 0,
                                           where: where, keys: keys,
                                           order: order, include: include,
                                           truncated_include_fields: truncated_includes)
    end
  end
end

#to_mcp_format(definition) ⇒ Object

Convert OpenAI format to MCP format. Includes outputSchema when the source definition carried one (registered tools that opted into structured output via Tools.register(..., output_schema:)).



1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
# File 'lib/parse/agent/tools.rb', line 1374

def to_mcp_format(definition)
  mcp = {
    name: definition[:name],
    description: definition[:description],
    inputSchema: definition[:parameters],
  }
  mcp[:outputSchema] = definition[:output_schema] if definition[:output_schema]
  # Surface the tool's category as MCP `_meta`. The 2025-06-18 spec
  # allows `_meta` on tool descriptors for server-specific
  # extensions; older clients ignore unknown fields. Lets
  # spec-compliant consumers filter locally without server-side
  # support, in addition to the non-standard `params.category`
  # accepted by `tools/list`.
  mcp[:_meta] = { category: definition[:category] } if definition[:category]
  mcp
end