Module: Parse::LookupRewriter

Defined in:
lib/parse/lookup_rewriter.rb

Overview

Translate “LLM-style” MongoDB `$lookup` stages – expressed against logical Parse class names and pretty field names – into the column-name form that Parse Server actually uses in MongoDB.

An LLM trained on standard MongoDB syntax will produce a lookup like:

{ "$lookup" => { "from" => "Project", "localField" => "project",
                 "foreignField" => "_id", "as" => "project_doc" } }

That never matches anything, because Parse stores the join column as `_p_project` (containing the pointer string `"Project$abc123"`) and the foreign `_id` is just `"abc123"`. When the foreign class declares `parse_reference`, the column `parseReference` mirrors the pointer-string form, so the join collapses to a single-field equality:

{ "$lookup" => { "from" => "Project", "localField" => "_p_project",
                 "foreignField" => "parseReference", "as" => "project_doc" } }

When the foreign class does NOT declare `parse_reference`, the rewriter falls back to the `let`/`pipeline`/`$split` form that extracts the objectId from `_p_*` and matches it against the foreign `_id`.
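The storage mismatch described above can be illustrated without MongoDB. The following is a minimal plain-Ruby sketch using the sample values from the example; the variable names are illustrative only and are not part of the module.

```ruby
# Plain-Ruby illustration of the three comparisons discussed above.
pointer_string = 'Project$abc123' # value stored in the local `_p_project` column
foreign_id     = 'abc123'         # value stored in the foreign `_id` column

# Naive join: compares the pointer string with the bare objectId.
naive_match = (pointer_string == foreign_id)

# $split fallback: split on "$" and take element 1, as $arrayElemAt does.
object_id_part = pointer_string.split('$')[1]
split_match = (object_id_part == foreign_id)

# parseReference form: the foreign column mirrors the pointer string.
parse_reference = "Project$#{foreign_id}"
reference_match = (pointer_string == parse_reference)
```

Only the second and third comparisons succeed, which is why the rewriter targets one of those two shapes.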

Design

The rewriter is intentionally a stand-alone helper, not auto-wired into `Parse::Query#aggregate` or `Parse::MongoDB.aggregate`. Existing SDK code writes `$lookup` against `_p_*`/`parseReference` directly, and silently rewriting those would corrupt them. The intended consumer is the LLM tool dispatcher (`Parse::Agent::Tools.aggregate`) where pipelines are generated by a model that doesn’t know Parse’s storage layout.

What is rewritten

For each `$lookup` stage with `localField` + `foreignField`:

  1. **Forward join** (local class has `belongs_to :foo`):

    localField: "foo" -> "_p_foo"
    foreignField: "_id"|"objectId" -> "parseReference" (or $split fallback)
    
  2. **Reverse join** (foreign class has `belongs_to` pointing back at us):

    localField: "_id"|"objectId" -> "parseReference" (or $split fallback)
    foreignField: "<pointer_name>" -> "_p_<pointer_name>"
    
  3. **System class collection rename** (always applied):

    from: "User" -> "_User"  (also _Role, _Installation, _Session)
    
  4. **Sub-pipeline recursion**: `$lookup.pipeline`, `$unionWith.pipeline`, and `$facet.*` are recursively rewritten with the foreign class (or the original local class for `$facet`) as the new local context.

What is NOT rewritten

  • Stages already in `_p_*`/`parseReference` form (idempotency).

  • Lookups whose `localField` is neither a known belongs_to nor an identity alias matched by a reverse belongs_to.

  • Lookups in `let`/`pipeline` form without a `localField`/`foreignField` pair (those are constructed deliberately; only the `from` collection is renamed and the sub-pipeline is recursed).

  • Embedded-pointer-array joins – by user request, since the array entries already carry `__type`/`className` and don’t benefit from `parseReference`. These fall through naturally because the join field isn’t a belongs_to on either side.
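The idempotency rule in the first bullet can be sketched as a small predicate. This is an illustrative stand-alone version of the check, assuming the same two conditions that `rewrite_lookup` tests; the method name `already_parse_form?` is hypothetical.

```ruby
# Returns true when a localField/foreignField pair is already expressed in
# Parse-on-Mongo column names and should therefore be left alone.
def already_parse_form?(local_field, foreign_field)
  local_field.to_s.start_with?('_p_') || foreign_field.to_s == 'parseReference'
end

llm_style      = already_parse_form?('project', '_id')
rewritten_form = already_parse_form?('_p_project', 'parseReference')
split_style    = already_parse_form?('_p_project', '_id')
```

Running a pipeline through the rewriter twice should therefore give the same result as running it once, because the second pass sees only pairs for which this predicate is true.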

Constant Summary

SYSTEM_CLASS_MAP =

Logical-name -> Parse-on-Mongo collection-name aliases for the four system classes. The LLM will write `from: "User"`; Mongo wants `_User`.

{
  "User" => Parse::Model::CLASS_USER,
  "Installation" => Parse::Model::CLASS_INSTALLATION,
  "Role" => Parse::Model::CLASS_ROLE,
  "Session" => Parse::Model::CLASS_SESSION,
}.freeze
OBJECT_ID_ALIASES =

Foreign-field values that an LLM might write to mean “the object’s identity”. Either is accepted as input; the rewriter substitutes the appropriate Parse-on-Mongo column.

%w[_id objectId].freeze
PARSE_REFERENCE_REMOTE =

Parse-on-Mongo remote column name for the `parse_reference` DSL. Matches the default `field_map` entry produced by `property :parse_reference, :string, field: "parseReference"`.

"parseReference"

Class Method Details

.auto_rewrite(pipeline, class_name:, enabled: nil) ⇒ Array<Hash>

Auto-rewrite a pipeline for one of the gem’s three aggregation entry points (`Parse::Query#aggregate`, `Parse::MongoDB.aggregate`, `Parse::Agent::Tools.aggregate`). Resolves `class_name` to a `Parse::Object` subclass and forwards to `rewrite` with `fallback: :preserve`. The auto path only rewrites stages where the foreign class declares `parse_reference`, so SDK-generated pipelines (already in `_p_*`/`parseReference` form) and pipelines whose foreign class lacks `parse_reference` pass through unchanged, apart from the system-class collection rename.

Parameters:

  • pipeline (Array<Hash>)

    the caller-supplied pipeline

  • class_name (String, Symbol)

    the Parse class the aggregation runs against

  • enabled (Boolean, nil) (defaults to: nil)

    explicit override. `nil` (the default) reads `Parse.rewrite_lookups`.

Returns:

  • (Array<Hash>)

    rewritten pipeline, or the input unchanged if rewriting is disabled or the class can’t be resolved.



# File 'lib/parse/lookup_rewriter.rb', line 108

def auto_rewrite(pipeline, class_name:, enabled: nil)
  return pipeline unless pipeline.is_a?(Array)
  flag = enabled.nil? ? Parse.rewrite_lookups : enabled
  return pipeline unless flag
  klass = Parse::Model.find_class(class_name.to_s) rescue nil
  return pipeline unless klass
  rewrite(pipeline, local_class: klass, fallback: :preserve)
end
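The way the `enabled:` override interacts with the global setting can be shown in isolation. This is a minimal sketch of the same ternary used in `auto_rewrite`; `resolve_flag` and `global_default` are hypothetical names, with `global_default` standing in for whatever `Parse.rewrite_lookups` returns.

```ruby
# nil defers to the global default; an explicit true/false always wins.
def resolve_flag(enabled, global_default)
  enabled.nil? ? global_default : enabled
end

deferred     = resolve_flag(nil, true)   # no override, global setting is used
forced_off   = resolve_flag(false, true) # explicit false beats a true default
forced_on    = resolve_flag(true, false) # explicit true beats a false default
```

Using `nil?` rather than a plain truthiness test is what lets an explicit `false` disable rewriting for a single call.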

.build_forward_rewrite(spec, pointer_field, target_class, from_logical, from_collection) ⇒ Object

Local: { _p_<field>: "Foreign$abc" }
Foreign: { parseReference: "Foreign$abc" }

When `parse_reference` is declared on the foreign class -> direct equality. Otherwise -> `let`/`pipeline` with `$split` extracting the objectId.



# File 'lib/parse/lookup_rewriter.rb', line 235

def build_forward_rewrite(spec, pointer_field, target_class, from_logical, from_collection)
  mongo_local = "_p_#{pointer_field}"
  if foreign_has_parse_reference?(target_class)
    replace_keys(spec,
                "from" => from_collection,
                "localField" => mongo_local,
                "foreignField" => PARSE_REFERENCE_REMOTE)
  else
    as_value = read_string(spec, "as")
    let_var = "rwLookupId_#{pointer_field}"
    spec_without_pair = drop_keys(spec, %w[localField foreignField pipeline let from])
    spec_without_pair["from"] = from_collection
    spec_without_pair["let"] = {
      let_var => { "$arrayElemAt" => [{ "$split" => ["$#{mongo_local}", { "$literal" => "$" }] }, 1] },
    }
    spec_without_pair["pipeline"] = [
      { "$match" => { "$expr" => { "$eq" => ["$_id", "$$#{let_var}"] } } },
    ]
    spec_without_pair["as"] = as_value if as_value
    spec_without_pair
  end
end
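For reference, the shape produced by the `$split` fallback branch can be reproduced with a stand-alone builder. This is a sketch that mirrors the `else` branch above for a fresh spec; `split_fallback_lookup` is a hypothetical name, and the `Project`/`project` values come from the overview example.

```ruby
# Builds the let/pipeline/$split form of a forward join from scratch.
def split_fallback_lookup(from_collection, pointer_field, as_value)
  mongo_local = "_p_#{pointer_field}"
  let_var = "rwLookupId_#{pointer_field}"
  {
    'from' => from_collection,
    'let' => {
      # Split "Project$abc123" on "$" and keep the objectId part.
      let_var => { '$arrayElemAt' => [{ '$split' => ["$#{mongo_local}", { '$literal' => '$' }] }, 1] }
    },
    'pipeline' => [
      # Match the extracted objectId against the foreign _id.
      { '$match' => { '$expr' => { '$eq' => ['$_id', "$$#{let_var}"] } } }
    ],
    'as' => as_value
  }
end

stage = { '$lookup' => split_fallback_lookup('Project', 'project', 'project_doc') }
```

The `{ '$literal' => '$' }` wrapper is needed because a bare `"$"` string would otherwise be read by MongoDB as the start of a field path.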

.build_reverse_rewrite(spec, pointer_field, local_class, from_logical, from_collection) ⇒ Object

Local: { parseReference: "Local$abc" }
Foreign: { _p_<field>: "Local$abc" }

When `parse_reference` is declared on the LOCAL class -> direct equality. Otherwise -> `let`/`pipeline` with `$split` on the foreign side.



# File 'lib/parse/lookup_rewriter.rb', line 261

def build_reverse_rewrite(spec, pointer_field, local_class, from_logical, from_collection)
  mongo_foreign = "_p_#{pointer_field}"
  if foreign_has_parse_reference?(local_class)
    replace_keys(spec,
                "from" => from_collection,
                "localField" => PARSE_REFERENCE_REMOTE,
                "foreignField" => mongo_foreign)
  else
    as_value = read_string(spec, "as")
    let_var = "rwReverseId_#{pointer_field}"
    spec_without_pair = drop_keys(spec, %w[localField foreignField pipeline let from])
    spec_without_pair["from"] = from_collection
    spec_without_pair["let"] = { let_var => "$_id" }
    spec_without_pair["pipeline"] = [
      { "$match" => { "$expr" => {
        "$eq" => [
          { "$arrayElemAt" => [{ "$split" => ["$#{mongo_foreign}", { "$literal" => "$" }] }, 1] },
          "$$#{let_var}",
        ],
      } } },
    ]
    spec_without_pair["as"] = as_value if as_value
    spec_without_pair
  end
end
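The direct-equality branch of a reverse join can be checked against sample documents in plain Ruby. This is a sketch under stated assumptions: a hypothetical `Task` class with a `project` pointer back to `Project`, hand-written sample documents, and an in-memory `select` standing in for what MongoDB does with the rewritten fields.

```ruby
# Rewritten reverse-join spec, as the direct-equality branch would emit it.
spec = {
  'from' => 'Task',
  'localField' => 'parseReference',
  'foreignField' => '_p_project',
  'as' => 'tasks'
}

# Sample documents in Parse-on-Mongo storage form (illustrative values).
project = { '_id' => 'abc123', 'parseReference' => 'Project$abc123' }
tasks = [
  { '_id' => 't1', '_p_project' => 'Project$abc123' },
  { '_id' => 't2', '_p_project' => 'Project$zzz999' }
]

# Emulate the single-field equality join in memory.
matched = tasks.select { |task| task[spec['foreignField']] == project[spec['localField']] }
matched_ids = matched.map { |task| task['_id'] }
```

Both sides hold the full pointer string, so no `$split` is needed in this branch.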

.canonical_collection_name(name) ⇒ Object



# File 'lib/parse/lookup_rewriter.rb', line 344

def canonical_collection_name(name)
  return nil if name.nil?
  SYSTEM_CLASS_MAP[name.to_s] || name.to_s
end
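The lookup behaviour is easy to try outside the gem. The sketch below substitutes a literal map for `SYSTEM_CLASS_MAP`, assuming the four constants resolve to the underscore-prefixed names listed in the overview; `SYSTEM_CLASS_NAMES` and `canonical_name` are hypothetical names.

```ruby
# Literal stand-in for SYSTEM_CLASS_MAP (values assumed from the overview).
SYSTEM_CLASS_NAMES = {
  'User' => '_User',
  'Installation' => '_Installation',
  'Role' => '_Role',
  'Session' => '_Session'
}.freeze

def canonical_name(name)
  return nil if name.nil?
  SYSTEM_CLASS_NAMES[name.to_s] || name.to_s
end

system_class  = canonical_name('User')
from_symbol   = canonical_name(:Role)
custom_class  = canonical_name('Project')
already_mongo = canonical_name('_User')
missing       = canonical_name(nil)
```

Names that are not in the map, including ones that are already in collection form, pass through as strings.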

.drop_keys(spec, names) ⇒ Object



# File 'lib/parse/lookup_rewriter.rb', line 420

def drop_keys(spec, names)
  out = spec.dup
  names.each do |name|
    out.delete(name)
    out.delete(name.to_sym)
    out.delete(name.to_s)
  end
  out
end

.foreign_has_parse_reference?(klass) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/parse/lookup_rewriter.rb', line 374

def foreign_has_parse_reference?(klass)
  klass.respond_to?(:_parse_reference_fields) && Array(klass._parse_reference_fields).any?
end

.has_key?(spec, name) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/parse/lookup_rewriter.rb', line 396

def has_key?(spec, name)
  spec.key?(name) || spec.key?(name.to_sym) || spec.key?(name.to_s)
end

.match_original_key(spec, name) ⇒ Object

Find the actual key object (String or Symbol) the spec uses for `name`, so we can write back without changing the caller’s key style. Returns `name` itself if not present.



# File 'lib/parse/lookup_rewriter.rb', line 403

def match_original_key(spec, name)
  [name, name.to_sym, name.to_s].each { |k| return k if spec.key?(k) }
  name
end

.read_string(spec, name) ⇒ Object


Hash key utilities – preserve original string-vs-symbol key style




# File 'lib/parse/lookup_rewriter.rb', line 382

def read_string(spec, name)
  v = read_value(spec, name)
  v.nil? ? nil : v.to_s
end

.read_value(spec, name) ⇒ Object



# File 'lib/parse/lookup_rewriter.rb', line 387

def read_value(spec, name)
  return spec[name] if spec.key?(name)
  sym = name.to_sym
  return spec[sym] if spec.key?(sym)
  str = name.to_s
  return spec[str] if spec.key?(str)
  nil
end

.rename_collection_in_place!(out, from_logical, from_collection) ⇒ Object



# File 'lib/parse/lookup_rewriter.rb', line 430

def rename_collection_in_place!(out, from_logical, from_collection)
  return if from_logical.nil? || from_collection.nil? || from_collection == from_logical
  # Match either `from:` or `coll:` style key, preserving its string/symbol form.
  %w[from coll].each do |k|
    next unless out.key?(k) || out.key?(k.to_sym)
    key = match_original_key(out, k)
    out[key] = from_collection
  end
end

.rename_collection_only(spec, from_logical, from_collection) ⇒ Object



# File 'lib/parse/lookup_rewriter.rb', line 440

def rename_collection_only(spec, from_logical, from_collection)
  return spec unless from_logical && from_collection && from_collection != from_logical
  replace_keys(spec, "from" => from_collection)
end

.replace_keys(spec, replacements) ⇒ Object



# File 'lib/parse/lookup_rewriter.rb', line 408

def replace_keys(spec, replacements)
  out = spec.dup
  replacements.each do |name, value|
    key = match_original_key(spec, name)
    out.delete(name)
    out.delete(name.to_sym)
    out.delete(name.to_s)
    out[key] = value
  end
  out
end
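The key-style preservation performed by `match_original_key` and `replace_keys` can be observed with stand-alone copies of the two helpers. The sketch below reproduces them as top-level methods (in the gem they are module methods) and runs them on a symbol-keyed and a string-keyed spec; the sample specs are illustrative.

```ruby
# Stand-alone copies of the two helpers shown in this module's source.
def match_original_key(spec, name)
  [name, name.to_sym, name.to_s].each { |k| return k if spec.key?(k) }
  name
end

def replace_keys(spec, replacements)
  out = spec.dup
  replacements.each do |name, value|
    key = match_original_key(spec, name) # reuse the caller's key style
    out.delete(name)
    out.delete(name.to_sym)
    out.delete(name.to_s)
    out[key] = value
  end
  out
end

symbol_spec = { from: 'User', as: 'owner_doc' }
string_spec = { 'from' => 'User', 'as' => 'owner_doc' }

symbol_out = replace_keys(symbol_spec, { 'from' => '_User' })
string_out = replace_keys(string_spec, { 'from' => '_User' })
```

The symbol-keyed spec comes back with a `:from` key and the string-keyed one with a `"from"` key, and neither input hash is modified because the helper works on a `dup`.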

.resolve_class(name) ⇒ Object


Class / field resolution




# File 'lib/parse/lookup_rewriter.rb', line 334

def resolve_class(name)
  return nil if name.nil? || name.to_s.empty?
  # `find_class` already accepts both alias and canonical forms (`User`
  # and `_User` both resolve `Parse::User`) via its `parse_class == "_#{str}"`
  # branch, so the SYSTEM_CLASS_MAP rename here is redundant on the
  # input side -- it's still applied separately to the rewritten `from:`
  # value via `rename_collection_in_place!`.
  Parse::Model.find_class(name.to_s)
end

.resolve_forward_pointer(local_class, local_field) ⇒ Object

Returns the matching pointer field SYMBOL on the local class for the given logical local-field name, or nil.



# File 'lib/parse/lookup_rewriter.rb', line 351

def resolve_forward_pointer(local_class, local_field)
  return nil unless local_class && local_class.respond_to?(:references)
  sym = local_field.to_sym
  return sym if local_class.references.key?(sym)
  camel = local_field.to_s.camelize(:lower).to_sym
  return camel if local_class.references.key?(camel)
  nil
end
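The two-step name match (exact symbol first, then lower camel case) can be exercised without the gem or ActiveSupport. In this sketch `FakeTask` is a hypothetical stand-in for a `Parse::Object` subclass exposing `references`, and `lower_camelize` is a simplified replacement for `camelize(:lower)` that only handles underscore-separated names.

```ruby
# Hypothetical stand-in for a model class with two belongs_to pointers.
class FakeTask
  def self.references
    { project: 'Project', ownerTeam: 'Team' }
  end
end

# Simplified substitute for ActiveSupport's String#camelize(:lower).
def lower_camelize(str)
  parts = str.to_s.split('_')
  return '' if parts.empty?
  parts.first + parts.drop(1).map(&:capitalize).join
end

def forward_pointer(local_class, local_field)
  return nil unless local_class && local_class.respond_to?(:references)
  sym = local_field.to_sym
  return sym if local_class.references.key?(sym)
  camel = lower_camelize(local_field).to_sym
  return camel if local_class.references.key?(camel)
  nil
end

exact     = forward_pointer(FakeTask, 'project')
snake     = forward_pointer(FakeTask, 'owner_team')
unknown   = forward_pointer(FakeTask, 'status')
no_class  = forward_pointer(nil, 'project')
```

The camel-case retry lets a model that writes `owner_team` still resolve a pointer declared as `ownerTeam`.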

.resolve_reverse_pointer(target_class, foreign_field, local_class) ⇒ Object

Returns the matching pointer field SYMBOL on the FOREIGN class for the given foreign-field name when that pointer points back at local_class, or nil.



# File 'lib/parse/lookup_rewriter.rb', line 363

def resolve_reverse_pointer(target_class, foreign_field, local_class)
  return nil unless target_class && target_class.respond_to?(:references)
  return nil unless local_class.respond_to?(:parse_class)
  candidates = [foreign_field.to_sym, foreign_field.to_s.camelize(:lower).to_sym].uniq
  candidates.each do |sym|
    klass_name = target_class.references[sym]
    return sym if klass_name && klass_name == local_class.parse_class
  end
  nil
end

.rewrite(pipeline, local_class:, fallback: :split) ⇒ Array<Hash>

Walk a top-level pipeline and return a rewritten copy. Non-Array inputs are returned untouched.

Parameters:

  • pipeline (Array<Hash>)

    aggregation pipeline

  • local_class (Class<Parse::Object>)

    the class the outer aggregation is running against. Used to resolve forward `belongs_to` pointer fields and to compute its own `parseReference` for reverse joins.

  • fallback (Symbol) (defaults to: :split)

    what to do when a lookup is rewriteable in shape (matches a `belongs_to`) but the target class lacks `parse_reference`:

    • `:split` (default) – emit the `let`/`pipeline`/`$split` form. Always produces a working join.

    • `:preserve` – leave the stage alone, apart from the system-class collection rename. Use when the caller wants the rewriter to act only as an optimization, not as a correction. This is the mode used by the gem’s auto-wired paths so SDK-generated pipelines (which the rewriter shouldn’t second-guess) survive untouched.

Returns:

  • (Array<Hash>)

    a new pipeline with eligible `$lookup` stages rewritten. The input is not mutated.



# File 'lib/parse/lookup_rewriter.rb', line 136

def rewrite(pipeline, local_class:, fallback: :split)
  return pipeline unless pipeline.is_a?(Array)
  pipeline.map { |stage| rewrite_stage(stage, local_class: local_class, fallback: fallback) }
end

.rewrite_facet(spec, local_class:, fallback: :split) ⇒ Object



# File 'lib/parse/lookup_rewriter.rb', line 314

def rewrite_facet(spec, local_class:, fallback: :split)
  return spec unless spec.is_a?(Hash)
  spec.each_with_object({}) do |(key, sub_pipeline), out|
    out[key] = rewrite(sub_pipeline, local_class: local_class, fallback: fallback)
  end
end

.rewrite_graph_lookup(spec, local_class:) ⇒ Object

`$graphLookup` doesn’t accept a `pipeline:` form, only `from:`, `startWith:`, `connectFromField:`, `connectToField:`, `as:`, plus a few options. We rewrite only the collection name (system-class alias) here. Pointer-style `_p_*`/`parseReference` equality across the connect-fields would require knowing both fields are pointer columns on both sides, and the typical `$graphLookup` use cases (recursive hierarchies over the same collection) don’t need it. Callers using `$graphLookup` against tagged pointer columns must supply the Parse-on-Mongo column names themselves.



# File 'lib/parse/lookup_rewriter.rb', line 171

def rewrite_graph_lookup(spec, local_class:)
  return spec unless spec.is_a?(Hash)
  from_logical = read_string(spec, "from")
  from_collection = canonical_collection_name(from_logical)
  Parse::PipelineSecurity.assert_collection_allowed!(from_collection)
  rename_collection_only(spec, from_logical, from_collection)
end

.rewrite_let_pipeline_form(spec, from_logical, from_collection, target_class, fallback = :split) ⇒ Object



# File 'lib/parse/lookup_rewriter.rb', line 321

def rewrite_let_pipeline_form(spec, from_logical, from_collection, target_class, fallback = :split)
  out = spec.dup
  rename_collection_in_place!(out, from_logical, from_collection)
  if has_key?(spec, "pipeline") && target_class
    out[match_original_key(spec, "pipeline")] = rewrite(read_value(spec, "pipeline"), local_class: target_class, fallback: fallback)
  end
  out
end

.rewrite_lookup(spec, local_class:, fallback: :split) ⇒ Object

Rewrite a single `$lookup` spec.



# File 'lib/parse/lookup_rewriter.rb', line 180

def rewrite_lookup(spec, local_class:, fallback: :split)
  return spec unless spec.is_a?(Hash)
  from_logical = read_string(spec, "from")
  from_collection = canonical_collection_name(from_logical)
  Parse::PipelineSecurity.assert_collection_allowed!(from_collection)
  target_class = resolve_class(from_logical) || resolve_class(from_collection)

  # let/pipeline shape -- only fix collection name and recurse into the
  # sub-pipeline using the foreign class as its local context.
  if has_key?(spec, "pipeline") && !has_key?(spec, "localField")
    return rewrite_let_pipeline_form(spec, from_logical, from_collection, target_class, fallback)
  end

  local_field = read_string(spec, "localField")
  foreign_field = read_string(spec, "foreignField")
  return rename_collection_only(spec, from_logical, from_collection) unless local_field && foreign_field

  # Already in Parse-on-Mongo form -- leave untouched aside from the
  # system-class collection rename.
  if local_field.start_with?("_p_") || foreign_field == PARSE_REFERENCE_REMOTE
    return rename_collection_only(spec, from_logical, from_collection)
  end

  forward_field = resolve_forward_pointer(local_class, local_field)
  if forward_field && OBJECT_ID_ALIASES.include?(foreign_field)
    if foreign_has_parse_reference?(target_class)
      return build_forward_rewrite(spec, forward_field, target_class, from_logical, from_collection)
    elsif fallback == :split
      return build_forward_rewrite(spec, forward_field, target_class, from_logical, from_collection)
    else
      return rename_collection_only(spec, from_logical, from_collection)
    end
  end

  reverse_field = resolve_reverse_pointer(target_class, foreign_field, local_class)
  if reverse_field && OBJECT_ID_ALIASES.include?(local_field)
    if foreign_has_parse_reference?(local_class)
      return build_reverse_rewrite(spec, reverse_field, local_class, from_logical, from_collection)
    elsif fallback == :split
      return build_reverse_rewrite(spec, reverse_field, local_class, from_logical, from_collection)
    else
      return rename_collection_only(spec, from_logical, from_collection)
    end
  end

  rename_collection_only(spec, from_logical, from_collection)
end

.rewrite_stage(stage, local_class:, fallback: :split) ⇒ Object

Rewrite a single stage. Stages that are not `$lookup`/`$graphLookup`/`$unionWith`/`$facet` are returned unchanged. Sub-pipelines inside `$lookup`, `$unionWith`, and `$facet` are rewritten recursively; `$graphLookup` only has its collection name renamed.



# File 'lib/parse/lookup_rewriter.rb', line 144

def rewrite_stage(stage, local_class:, fallback: :split)
  return stage unless stage.is_a?(Hash)
  stage.each_with_object({}) do |(key, value), out|
    case key.to_s
    when "$lookup"
      out[key] = rewrite_lookup(value, local_class: local_class, fallback: fallback)
    when "$graphLookup"
      out[key] = rewrite_graph_lookup(value, local_class: local_class)
    when "$unionWith"
      out[key] = rewrite_union_with(value, local_class: local_class, fallback: fallback)
    when "$facet"
      out[key] = rewrite_facet(value, local_class: local_class, fallback: fallback)
    else
      out[key] = value
    end
  end
end
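The walk-and-dispatch structure of `rewrite` and `rewrite_stage` can be sketched in a reduced form. The version below is not the gem's implementation: it handles only the collection rename for `$lookup` and `$graphLookup`, and `RENAMES`, `rename_from`, `walk_stage`, and `walk` are hypothetical names. It does show two properties of the real walker: stage keys are matched via `key.to_s`, so symbol keys work, and the input pipeline is not mutated.

```ruby
RENAMES = { 'User' => '_User' }.freeze # reduced stand-in for the system-class map

# Rename the `from` collection, keeping the caller's string/symbol key style.
def rename_from(spec)
  return spec unless spec.is_a?(Hash)
  key = spec.key?('from') ? 'from' : :from
  return spec unless spec.key?(key)
  spec.merge(key => RENAMES.fetch(spec[key].to_s, spec[key]))
end

# Build a new stage hash, dispatching on the stringified stage name.
def walk_stage(stage)
  return stage unless stage.is_a?(Hash)
  stage.each_with_object({}) do |(key, value), out|
    out[key] = case key.to_s
               when '$lookup', '$graphLookup' then rename_from(value)
               else value
               end
  end
end

def walk(pipeline)
  return pipeline unless pipeline.is_a?(Array)
  pipeline.map { |stage| walk_stage(stage) }
end

lookup_key = '$lookup'.to_sym
input = [
  { '$match' => { 'active' => true } },
  { lookup_key => { from: 'User', localField: 'owner', foreignField: '_id', as: 'owner_doc' } }
]
output = walk(input)
```

Because every level is rebuilt with `map`, `each_with_object`, or `merge`, callers can keep using the original pipeline after the walk.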

.rewrite_union_with(spec, local_class:, fallback: :split) ⇒ Object


Stage helpers




# File 'lib/parse/lookup_rewriter.rb', line 291

def rewrite_union_with(spec, local_class:, fallback: :split)
  case spec
  when Hash
    from_logical = read_string(spec, "from") || read_string(spec, "coll")
    from_collection = canonical_collection_name(from_logical)
    Parse::PipelineSecurity.assert_collection_allowed!(from_collection)
    target_class = resolve_class(from_logical) || resolve_class(from_collection)
    out = spec.dup
    # Either form (from: or coll:) is valid for $unionWith; rename if it's a system class.
    rename_collection_in_place!(out, from_logical, from_collection)
    if has_key?(spec, "pipeline") && target_class
      out[match_original_key(spec, "pipeline")] = rewrite(read_value(spec, "pipeline"), local_class: target_class, fallback: fallback)
    end
    out
  when String
    canonical = canonical_collection_name(spec) || spec
    Parse::PipelineSecurity.assert_collection_allowed!(canonical)
    canonical
  else
    spec
  end
end