Calculate roll-up cumulatively
Currently, a rollup on x, y, z translates into the following plan:
FinalAggregation[$id, x, y, z](aggr)
RemoteExchange
PartialAggregation[$id, x, y, z](aggr)
GroupId
The GroupId operator multiplies the unaggregated input data 4 times (once for each of the groups [], [x], [x,y], [x,y,z]). PartialAggregation then consumes the unaggregated input for each grouping set. However, the aggregations could be calculated in a cascading way, e.g.:
PartialAggregation[$id=0][source: $id=1](aggr)
PartialAggregation[$id=1, x][source: $id=2](aggr)
PartialAggregation[$id=2, x, y][source: $id=3](aggr)
PartialAggregation[$id=3, x, y, z](aggr)
GroupId
Downstream PartialAggregations would use the already partially aggregated results from the wider grouping set, while passing through input rows.
This would reduce CPU usage and also improve PartialAggregation efficiency, as each aggregation is computed by a separate operator.
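To make the expected saving concrete, here is a minimal Python sketch of the two strategies. It is not Trino code: the function names, the sample rows, and the use of `sum` as the aggregate are assumptions made for illustration, and it only models how many rows each aggregation step consumes. It does not model the pass-through of input rows or the exchange between partial and final aggregation. The level number `n` plays the role of `$id` in the plans above.

```python
from collections import defaultdict

# Hypothetical sample input: grouping columns x, y, z and a value v to sum.
ROWS = [
    {"x": 1, "y": 1, "z": 1, "v": 10},
    {"x": 1, "y": 1, "z": 1, "v": 5},
    {"x": 1, "y": 1, "z": 2, "v": 7},
    {"x": 1, "y": 2, "z": 1, "v": 1},
    {"x": 2, "y": 1, "z": 1, "v": 4},
    {"x": 2, "y": 1, "z": 1, "v": 3},
]
KEYS = ["x", "y", "z"]


def group_id_rollup(rows, keys):
    """Current approach: every grouping set aggregates the raw input again."""
    result, rows_consumed = {}, 0
    for n in range(len(keys), -1, -1):  # [x,y,z], [x,y], [x], []
        sums = defaultdict(int)
        for row in rows:
            rows_consumed += 1
            sums[tuple(row[k] for k in keys[:n])] += row["v"]
        result[n] = dict(sums)
    return result, rows_consumed


def cascaded_rollup(rows, keys):
    """Proposed approach: each narrower set aggregates the wider set's output."""
    result, rows_consumed = {}, 0
    # Only the widest grouping set [x,y,z] reads the raw input.
    sums = defaultdict(int)
    for row in rows:
        rows_consumed += 1
        sums[tuple(row[k] for k in keys)] += row["v"]
    result[len(keys)] = dict(sums)
    # Each narrower level combines partial results from the level above it.
    for n in range(len(keys) - 1, -1, -1):
        narrower = defaultdict(int)
        for group, partial in result[n + 1].items():
            rows_consumed += 1
            narrower[group[:n]] += partial
        result[n] = dict(narrower)
    return result, rows_consumed


flat_result, flat_cost = group_id_rollup(ROWS, KEYS)
cascade_result, cascade_cost = cascaded_rollup(ROWS, KEYS)
print(flat_cost, cascade_cost)
```

On this sample both strategies produce identical results, but the flat approach consumes 24 rows (6 input rows times 4 grouping sets) while the cascade consumes 15 (6 + 4 + 3 + 2). The sketch relies on the aggregate being combinable from partial results, which holds for `sum`; whether every aggregate's intermediate state can be reused this way is left open here.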
Initial branch: https://github.com/starburstdata/trino/tree/ks/rollup