Class: Polars::RollingGroupBy
- Inherits: Object
  - Object
  - Polars::RollingGroupBy
- Defined in: lib/polars/rolling_group_by.rb
Overview
A rolling grouper.
Its #agg method lets you run any Polars expression in a group by context.
Instance Method Summary
- #agg(*aggs, **named_aggs) ⇒ DataFrame
  Compute aggregations for each group of a group by operation.
- #having(*predicates) ⇒ RollingGroupBy
  Filter groups with a list of predicates after aggregation.
- #map_groups(schema, &function) ⇒ DataFrame
  Apply a custom/user-defined function (UDF) over the groups as a new DataFrame.
Instance Method Details
#agg(*aggs, **named_aggs) ⇒ DataFrame
Compute aggregations for each group of a group by operation.
    # File 'lib/polars/rolling_group_by.rb', line 65
    def agg(*aggs, **named_aggs)
      group_by = @df.lazy.rolling(
        index_column: @time_column,
        period: @period,
        offset: @offset,
        closed: @closed,
        group_by: @group_by
      )

      if @predicates&.any?
        group_by = group_by.having(@predicates)
      end

      group_by.agg(*aggs, **named_aggs).collect(
        optimizations: QueryOptFlags.none
      )
    end
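To make the windowing concrete, here is a minimal pure-Ruby sketch of what a rolling aggregation computes. It does not call Polars. The helper name `rolling_sum`, the integer index, and the sample rows are all hypothetical, and it assumes right-closed windows of the form (t - period, t], which may differ from the `closed` and `offset` values you pass to the real API.

```ruby
# Pure-Ruby model of a rolling aggregation; it does not use Polars.
# Assumption: right-closed windows (t - period, t] over an integer index.
def rolling_sum(rows, period:)
  rows.map do |row|
    t = row[:t]
    # Collect every row whose index falls inside (t - period, t].
    window = rows.select { |r| r[:t] > t - period && r[:t] <= t }
    { t: t, sum: window.sum { |r| r[:value] } }
  end
end

# Hypothetical sample data, sorted by the index column.
rows = [
  { t: 1, value: 10 },
  { t: 2, value: 20 },
  { t: 4, value: 30 },
  { t: 5, value: 40 }
]

result = rolling_sum(rows, period: 2)
result.each { |r| puts "t=#{r[:t]} sum=#{r[:sum]}" }
```

Each input row produces exactly one output row, and its window looks backwards from that row's index value. This is why the result of a rolling aggregation has as many rows as the input (per group).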
#having(*predicates) ⇒ RollingGroupBy
Filter groups with a list of predicates after aggregation.
Using this method is equivalent to adding the predicates to the aggregation and filtering afterwards.
This method can be chained and all conditions will be combined using &.
    # File 'lib/polars/rolling_group_by.rb', line 42
    def having(*predicates)
      RollingGroupBy.new(
        @df,
        @time_column,
        @period,
        @offset,
        @closed,
        @group_by,
        Utils._chain_predicates(@predicates, predicates)
      )
    end
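The chaining behaviour described above can be modelled in a few lines of plain Ruby. This is only a sketch of the idea, not the library's implementation: `ToyGrouper`, its fields, and the lambda-based predicates are hypothetical stand-ins for Polars expressions. It shows that each `having` call returns a new object, and that chained predicates behave like a single filter joined with `&&` applied after aggregation.

```ruby
# Pure-Ruby model of chained `having` filters; it does not use Polars.
# ToyGrouper and its fields are hypothetical names for this illustration.
class ToyGrouper
  def initialize(groups, predicates = [])
    @groups = groups
    @predicates = predicates
  end

  # Return a new grouper with the extra predicates appended (no mutation).
  def having(*predicates)
    ToyGrouper.new(@groups, @predicates + predicates)
  end

  # Aggregate each group, then keep only rows that satisfy every predicate.
  def agg
    rows = @groups.map do |key, values|
      { key: key, total: values.sum, count: values.size }
    end
    rows.select { |row| @predicates.all? { |pred| pred.call(row) } }
  end
end

groups = { "a" => [1, 2, 3], "b" => [10], "c" => [4, 5] }
base = ToyGrouper.new(groups)

# Two chained filters...
chained = base
  .having(->(r) { r[:total] > 5 })
  .having(->(r) { r[:count] > 1 })
  .agg

# ...match aggregating first and filtering on both conditions afterwards.
manual = base.agg.select { |r| r[:total] > 5 && r[:count] > 1 }

puts chained == manual
```

Because `having` builds a new object instead of mutating the receiver, `base` can still be aggregated without any filter after the chained calls.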
#map_groups(schema, &function) ⇒ DataFrame
Apply a custom/user-defined function (UDF) over the groups as a new DataFrame.
Using this is considered an anti-pattern because it will be very slow:
- it forces the engine to materialize the whole DataFrames for the groups.
- it is not parallelized.
- it blocks optimizations, as the passed block is opaque to the optimizer.
The idiomatic way to apply custom functions over multiple columns is using:
Polars.struct([my_columns]).map_elements { |struct_series| ... }
    # File 'lib/polars/rolling_group_by.rb', line 99
    def map_groups(
      schema,
      &function
    )
      if @predicates&.any?
        msg = "cannot call `map_groups` when filtering groups with `having`"
        raise TypeError, msg
      end

      @df.lazy
        .rolling(
          index_column: @time_column,
          period: @period,
          offset: @offset,
          closed: @closed,
          group_by: @group_by
        )
        .map_groups(schema, &function)
        .collect(optimizations: QueryOptFlags.none)
    end