Class: Parse::GroupBy

Inherits:
Object
Defined in:
lib/parse/query.rb

Overview

Helper class for handling group_by aggregations with method chaining. Supports count, sum, average, min, max, and list operations on grouped data. Can optionally flatten array fields before grouping so that individual array elements are counted as separate group keys.

Direct Known Subclasses

SortableGroupBy

Constructor Details

#initialize(query, group_field, flatten_arrays: false, return_pointers: false, mongo_direct: false) ⇒ GroupBy

Returns a new instance of GroupBy.

Parameters:

  • query (Parse::Query)

    the base query to group

  • group_field (Symbol, String)

    the field to group by

  • flatten_arrays (Boolean) (defaults to: false)

    whether to flatten array fields before grouping

  • return_pointers (Boolean) (defaults to: false)

    whether to return Parse::Pointer objects for pointer values

  • mongo_direct (Boolean) (defaults to: false)

    whether to query MongoDB directly bypassing Parse Server



# File 'lib/parse/query.rb', line 6163

def initialize(query, group_field, flatten_arrays: false, return_pointers: false, mongo_direct: false)
  @query = query
  @group_field = group_field
  @flatten_arrays = flatten_arrays
  @return_pointers = return_pointers
  @mongo_direct = mongo_direct
  @sort_target = nil    # nil | :key | :value | :size
  @sort_direction = nil # :asc | :desc
end
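
As a rough illustration of what flatten_arrays changes, the following plain-Ruby sketch mimics grouping with and without unwinding an array field. It does not call the SDK; the sample documents and the `group_counts` helper are hypothetical and exist only for this example.

```ruby
# Hypothetical in-memory stand-in for documents with an array field.
docs = [
  { "title" => "a", "tags" => ["ruby", "parse"] },
  { "title" => "b", "tags" => ["ruby"] },
  { "title" => "c", "tags" => ["ruby", "parse"] },
]

# Illustrative helper (not part of the SDK): count documents per group value.
# With flatten: true each array element becomes its own group key, which is
# roughly what a $unwind stage placed before $group achieves.
def group_counts(docs, field, flatten: false)
  keys = docs.map { |doc| doc[field] }
  keys = keys.flatten(1) if flatten
  keys.tally
end

puts group_counts(docs, "tags").inspect
puts group_counts(docs, "tags", flatten: true).inspect
```

Without flattening, the whole array is the group key; with flattening, each element is counted on its own.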

Instance Method Details

#average(field) ⇒ Hash Also known as: avg

Calculate average of a field for each group.

Examples:

Document.group_by(:category).average(:duration)
# => {"video" => 120.5, "audio" => 45.2}

Parameters:

  • field (Symbol, String)

    the field to average within each group.

Returns:

  • (Hash)

    a hash with group values as keys and averages as values.



# File 'lib/parse/query.rb', line 6383

def average(field)
  if field.nil? || !field.respond_to?(:to_s)
    raise ArgumentError, "Invalid field name passed to `average`."
  end

  formatted_field = @query.send(:format_aggregation_field, field)
  execute_group_aggregation("average", { "$avg" => "$#{formatted_field}" })
end
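
For intuition, the per-group average corresponds roughly to the following plain-Ruby computation. This is a sketch over hypothetical in-memory rows, not the SDK's code path; `average_by` is an illustrative name.

```ruby
# Illustrative only: emulate a per-group average over in-memory hashes.
# Missing (nil) values are skipped; a group with no values yields nil.
def average_by(rows, group_field, value_field)
  rows.group_by { |row| row[group_field] }.transform_values do |members|
    values = members.map { |row| row[value_field] }.compact
    values.empty? ? nil : values.sum.to_f / values.size
  end
end

rows = [
  { "category" => "video", "duration" => 100 },
  { "category" => "video", "duration" => 141 },
  { "category" => "audio", "duration" => 45 },
]

puts average_by(rows, "category", "duration").inspect
```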

#count ⇒ Hash

Count the number of items in each group.

Examples:

Document.group_by(:category).count
# => {"image" => 45, "video" => 23, "audio" => 12}

Returns:

  • (Hash)

    a hash with group values as keys and counts as values.



# File 'lib/parse/query.rb', line 6358

def count
  execute_group_aggregation("count", { "$sum" => 1 })
end

#list ⇒ Hash{Object => Array<Parse::Object>}

Note:

On the Parse REST /aggregate path there is no ACL/CLP/protectedFields enforcement: that endpoint is master-key-only. On the mongo-direct path the SDK's ACL $match runs before $group, and both ACL redaction and protectedFields stripping recurse into pushed arrays, so scoped agents get correctly filtered records. The Array recursion that makes this safe lives in Parse::ACLScope#redact_subdocs! (lib/parse/acl_scope.rb) and Parse::CLPScope#walk_and_delete! (lib/parse/clp_scope.rb); if you change either of those, re-verify that .list still strips correctly.

Collect every document of each group into an array of Parse::Object instances. Implemented as $push: "$$ROOT", so each group's value is the full set of underlying records (subject to the query's where constraints).

Use this when you want the actual records per group, not just an aggregated scalar. Combine with .order(size: :desc) to surface the largest groups first.

Examples:

Document.where(:status => "active").group_by(:category).list
# => {"image" => [<Document:...>, <Document:...>], "video" => [<Document:...>]}

Largest groups first

Document.group_by(:category).order(size: :desc).list

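Returns:

  • (Hash{Object => Array<Parse::Object>})

    a hash with group values as keys and arrays of Parse::Object instances as values.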



# File 'lib/parse/query.rb', line 6442

def list
  table = @query.instance_variable_get(:@table)
  # `$push: "$$ROOT"` pushes the raw MongoDB-storage-format document
  # into the result array on BOTH the REST and mongo-direct paths —
  # Parse Server's aggregate envelope only rewrites the outermost row's
  # `_id` to `objectId`, not nested arrays. So `_id`, `_p_<field>`
  # pointer strings, `_acl`/`_rperm`/`_wperm`, and `_created_at`/
  # `_updated_at` all survive into the pushed docs and have to be
  # normalized to Parse shape before `Parse::Object.build` will produce
  # an instance with the right id, associations, ACL, and timestamps.
  require_relative "mongodb"
  build_object = lambda do |doc|
    parse_doc = Parse::MongoDB.convert_document_to_parse(doc, table)
    parse_doc ? Parse::Object.build(parse_doc, table) : nil
  end

  execute_group_aggregation("list", { "$push" => "$$ROOT" }) do |docs|
    next [] unless docs.is_a?(Array)
    docs.map(&build_object).compact
  end
end
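
The comment in the listing above explains why pushed documents need normalizing before they can be built into objects. The sketch below shows the general idea on a single hash. It is not the implementation of Parse::MongoDB.convert_document_to_parse; the helper name `normalize_storage_doc` and the exact output shape are assumptions made for illustration, and ACL fields are simply dropped here.

```ruby
# Hypothetical helper: map a MongoDB-storage-format document to a
# Parse-style hash. The real conversion lives in the SDK and may differ.
def normalize_storage_doc(doc, class_name)
  out = { "className" => class_name }
  doc.each do |key, value|
    case key
    when "_id"         then out["objectId"] = value
    when "_created_at" then out["createdAt"] = value
    when "_updated_at" then out["updatedAt"] = value
    when "_acl", "_rperm", "_wperm"
      next # ACL reconstruction is omitted in this sketch
    when /\A_p_(.+)\z/
      field = Regexp.last_match(1)
      # Pointer strings are stored as "ClassName$objectId".
      target_class, target_id = value.to_s.split("$", 2)
      out[field] = {
        "__type" => "Pointer",
        "className" => target_class,
        "objectId" => target_id,
      }
    else
      out[key] = value
    end
  end
  out
end

raw_doc = {
  "_id" => "abc123",
  "_p_author" => "_User$u1",
  "_created_at" => "2024-01-01T00:00:00Z",
  "_rperm" => ["*"],
  "title" => "Hello",
}
puts normalize_storage_doc(raw_doc, "Document").inspect
```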

#max(field) ⇒ Hash

Find maximum value of a field for each group.

Parameters:

  • field (Symbol, String)

    the field to find maximum for within each group.

Returns:

  • (Hash)

    a hash with group values as keys and maximum values as values.



# File 'lib/parse/query.rb', line 6409

def max(field)
  if field.nil? || !field.respond_to?(:to_s)
    raise ArgumentError, "Invalid field name passed to `max`."
  end

  formatted_field = @query.send(:format_aggregation_field, field)
  execute_group_aggregation("max", { "$max" => "$#{formatted_field}" })
end

#min(field) ⇒ Hash

Find minimum value of a field for each group.

Parameters:

  • field (Symbol, String)

    the field to find minimum for within each group.

Returns:

  • (Hash)

    a hash with group values as keys and minimum values as values.



# File 'lib/parse/query.rb', line 6397

def min(field)
  if field.nil? || !field.respond_to?(:to_s)
    raise ArgumentError, "Invalid field name passed to `min`."
  end

  formatted_field = @query.send(:format_aggregation_field, field)
  execute_group_aggregation("min", { "$min" => "$#{formatted_field}" })
end

#order(spec) ⇒ self

Order grouped results by the group key, the aggregated value, or (for #list) the size of the per-group array. The ordering is pushed down into the aggregation pipeline as a $sort stage (plus a $addFields helper for :size), so MongoDB does the sort and the returned Hash preserves the order via Ruby's insertion semantics.

Examples:

Biggest groups first

Document.group_by(:category).order(value: :desc).count

Alphabetical group keys

Document.group_by(:category).order(key: :asc).count

Groups with the most members first

Document.group_by(:category).order(size: :desc).list

Parameters:

  • spec (Hash, Symbol)

    one of:

    • { key: :asc | :desc } — sort by the group key
    • { value: :asc | :desc } — sort by the aggregated value (count/sum/avg/min/max)
    • { size: :asc | :desc } — sort by the length of the pushed array (only meaningful with #list)
    • :asc or :desc — shorthand for { key: direction }, matching Ruby's Hash#sort default of sorting by key.

Returns:

  • (self)


# File 'lib/parse/query.rb', line 6194

def order(spec)
  target, direction =
    case spec
    when Symbol
      [:key, spec]
    when Hash
      unless spec.size == 1
        raise ArgumentError, "order(...) expects a single pair, e.g. {value: :desc} (got #{spec.inspect})"
      end
      k, v = spec.first
      [k.to_sym, v.to_sym]
    else
      raise ArgumentError, "order(...) expects {key:|value:|size: => :asc|:desc} or :asc/:desc (got #{spec.inspect})"
    end

  unless %i[key value size].include?(target)
    raise ArgumentError, "order(...) target must be :key, :value, or :size (got #{target.inspect})"
  end
  unless %i[asc desc].include?(direction)
    raise ArgumentError, "order(...) direction must be :asc or :desc (got #{direction.inspect})"
  end

  @sort_target = target
  @sort_direction = direction
  self
end
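
One way to picture how a validated target and direction could turn into pipeline stages is sketched below. The private helpers that actually build these stages are not shown in this section, so the function name `sort_stages_for`, the `value_field` parameter, and the temporary `__size` field are assumptions rather than the SDK's own names.

```ruby
# Hypothetical sketch: translate an order target/direction into stages.
# :key sorts on the pre-rename `_id`, :value on the aggregated field, and
# :size first derives the array length with $addFields, then sorts on it.
def sort_stages_for(target, direction, value_field: "count")
  dir = direction == :desc ? -1 : 1
  case target
  when :key
    [{ "$sort" => { "_id" => dir } }]
  when :value
    [{ "$sort" => { value_field => dir } }]
  when :size
    [
      { "$addFields" => { "__size" => { "$size" => "$#{value_field}" } } },
      { "$sort" => { "__size" => dir } },
    ]
  else
    raise ArgumentError, "unknown target #{target.inspect}"
  end
end

puts sort_stages_for(:key, :asc).inspect
puts sort_stages_for(:size, :desc, value_field: "list").inspect
```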

#pipeline ⇒ Array<Hash>

Returns the MongoDB aggregation pipeline that would be used for a count operation. This is useful for debugging and understanding the generated pipeline.

Examples:

Post.where(:author_workspace.eq => workspace).group_by(:last_action).pipeline
# => [{"$match"=>{"authorWorkspace"=>"Workspace$abc123"}}, {"$group"=>{"_id"=>"$lastAction", "count"=>{"$sum"=>1}}}, {"$project"=>{"_id"=>0, "objectId"=>"$_id", "count"=>1}}]

Returns:

  • (Array<Hash>)

    the MongoDB aggregation pipeline



# File 'lib/parse/query.rb', line 6246

def pipeline
  # This introspection builds the same shape as the count execution
  # path (`$sum: 1`), so reject order/aggregation combinations that
  # the count path would reject at runtime — otherwise the preview
  # silently produces a pipeline the SDK would never actually run.
  validate_sort_target_for_operation!("count")

  # Format the group field name
  formatted_group_field = @query.send(:format_aggregation_field, @group_field)

  # Build the aggregation pipeline (same logic as execute_group_aggregation)
  pipeline = []

  # Add match stage if there are where conditions. `compile_where`
  # is already marker-free; use `compile_markers` to extract
  # __aggregation_pipeline stages.
  compiled_where = @query.send(:compile_where)
  markers = @query.send(:compile_markers)
  if compiled_where.present? || markers.key?("__aggregation_pipeline")
    # Collect all match conditions to merge into a single $match stage
    match_conditions = []
    non_match_stages = []

    # `compiled_where` is marker-free already.
    regular_constraints = compiled_where
    if regular_constraints.present?
      aggregation_where = @query.send(:convert_constraints_for_aggregation, regular_constraints)
      stringified_where = @query.send(:convert_dates_for_aggregation, aggregation_where)
      match_conditions << stringified_where
    end

    # Extract aggregation pipeline stages and merge $match stages
    if markers.key?("__aggregation_pipeline")
      markers["__aggregation_pipeline"].each do |stage|
        if stage.is_a?(Hash) && stage.key?("$match")
          # Extract the $match condition for merging
          match_conditions << stage["$match"]
        else
          # Non-$match stages go directly to pipeline
          non_match_stages << stage
        end
      end
    end

    # Combine all match conditions into a single $match stage
    if match_conditions.any?
      if match_conditions.length == 1
        pipeline << { "$match" => match_conditions.first }
      else
        # Use $and to combine multiple match conditions
        pipeline << { "$match" => { "$and" => match_conditions } }
      end
    end

    # Add any non-$match stages from the aggregation pipeline
    pipeline.concat(non_match_stages)
  end

  # Add unwind stage if flatten_arrays is enabled
  if @flatten_arrays
    pipeline << { "$unwind" => "$#{formatted_group_field}" }
  end

  # Add group stage (using count as example aggregation)
  pipeline << {
    "$group" => {
      "_id" => "$#{formatted_group_field}",
      "count" => { "$sum" => 1 },
    },
  }

  # Add $addFields + $sort stages if ordering was configured. Sort happens
  # before $project so we can reference `_id` (pre-rename) for :key sorts.
  add_fields = size_addfields_stage
  pipeline << add_fields if add_fields
  sort = sort_stage
  pipeline << sort if sort

  pipeline << {
    "$project" => {
      "_id" => 0,
      "objectId" => "$_id",
      "count" => 1,
    },
  }

  pipeline
end
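
The $match merging step in the middle of this method can be isolated as a small pure function. The sketch below mirrors that logic on plain hashes; `merge_match_stages` is an illustrative name and is not part of the SDK, and the constraint conversion helpers are left out.

```ruby
# Illustrative: fold a where-clause plus extra stages into one leading $match,
# keeping non-$match stages in their original order after it.
def merge_match_stages(where, extra_stages)
  match_conditions = []
  match_conditions << where unless where.nil? || where.empty?

  match_stages, other_stages = extra_stages.partition do |stage|
    stage.is_a?(Hash) && stage.key?("$match")
  end
  match_conditions.concat(match_stages.map { |stage| stage["$match"] })

  pipeline = []
  if match_conditions.length == 1
    pipeline << { "$match" => match_conditions.first }
  elsif match_conditions.length > 1
    # Several conditions are combined under a single $and.
    pipeline << { "$match" => { "$and" => match_conditions } }
  end
  pipeline.concat(other_stages)
end

stages = merge_match_stages(
  { "status" => "active" },
  [{ "$match" => { "size" => { "$gt" => 10 } } }, { "$limit" => 5 }],
)
puts stages.inspect
```

A single condition stays a bare $match, which keeps the common case readable when the pipeline is inspected.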

#raw(operation, aggregation_expr) ⇒ Array<Hash>

Returns the raw, unprocessed aggregation results.

Parameters:

  • operation (String)

    the aggregation operation

  • aggregation_expr (Hash)

    the MongoDB aggregation expression

Returns:

  • (Array<Hash>)

    the raw, unprocessed aggregation results.



# File 'lib/parse/query.rb', line 6339

def raw(operation, aggregation_expr)
  formatted_group_field = @query.send(:format_aggregation_field, @group_field)
  pipeline = build_pipeline(formatted_group_field, aggregation_expr)

  response = @query.client.aggregate_pipeline(
    @query.instance_variable_get(:@table),
    pipeline,
    headers: {},
    **@query.send(:_opts),
  )

  response.result || []
end
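
Because #raw returns rows rather than a Hash, callers may want to fold them into one themselves. Assuming rows shaped like the $project stage shown for #pipeline (an objectId holding the group key plus a named value field), a minimal sketch follows. The row shape for other operations and the helper name `rows_to_hash` are assumptions.

```ruby
# Illustrative: fold raw aggregation rows into {group_key => value}.
# Ruby Hashes keep insertion order, so the row order is preserved.
def rows_to_hash(rows, value_field)
  rows.each_with_object({}) do |row, acc|
    acc[row["objectId"]] = row[value_field]
  end
end

rows = [
  { "objectId" => "image", "count" => 45 },
  { "objectId" => "video", "count" => 23 },
]
puts rows_to_hash(rows, "count").inspect
```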

#sort(direction = :asc) ⇒ self

Sort grouped results by the group key. Alias for order(key: direction), mirroring Ruby's Hash#sort default. For value-based ordering use #order explicitly (e.g. .order(value: :desc)).

Note the asymmetry with chaining: .sort.count pushes the sort into the aggregation pipeline and returns a Hash keyed by group, while .count.sort first materializes the Hash and then calls Hash#sort, which returns an Array<[key, value]>. Both order by key ascending by default; this method exists so the pipeline form is also available.

Examples:

Document.group_by(:category).sort.count        # group keys ascending
Document.group_by(:category).sort(:desc).count # group keys descending

Parameters:

  • direction (Symbol) (defaults to: :asc)

    :asc (default) or :desc

Returns:

  • (self)


# File 'lib/parse/query.rb', line 6236

def sort(direction = :asc)
  order(direction)
end
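
The return-type difference between .sort.count and .count.sort comes from plain Ruby behavior and can be checked without the SDK. In this sketch `counts` is a hypothetical stand-in for the Hash that .count would return.

```ruby
counts = { "video" => 23, "audio" => 12, "image" => 45 }

# Hash#sort yields an Array of [key, value] pairs ordered by key.
pairs = counts.sort
puts pairs.class
puts pairs.inspect

# Converting back gives a Hash whose insertion order is the sorted order.
sorted_hash = pairs.to_h
puts sorted_hash.keys.inspect
```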

#sum(field) ⇒ Hash

Sum a field for each group.

Examples:

Document.group_by(:project).sum(:file_size)
# => {"Project1" => 1024000, "Project2" => 512000}

Parameters:

  • field (Symbol, String)

    the field to sum within each group.

Returns:

  • (Hash)

    a hash with group values as keys and sums as values.



# File 'lib/parse/query.rb', line 6368

def sum(field)
  if field.nil? || !field.respond_to?(:to_s)
    raise ArgumentError, "Invalid field name passed to `sum`."
  end

  formatted_field = @query.send(:format_aggregation_field, field)
  execute_group_aggregation("sum", { "$sum" => "$#{formatted_field}" })
end