
Support Grouping functions with Group By CUBE/ROLLUP/GROUPING SETS

Open mingmwang opened this issue 2 years ago • 2 comments

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

PostgreSQL, SparkSQL, and Oracle support GROUPING functions, which indicate whether a NULL in a grouping column comes from a subtotal row or from the original data.

PostgreSQL https://www.postgresql.org/docs/15/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE

Databricks SparkSQL https://docs.databricks.com/sql/language-manual/functions/grouping.html

Oracle https://oracle-base.com/articles/misc/rollup-cube-grouping-functions-and-grouping-sets#grouping
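To illustrate the ambiguity that GROUPING resolves, here is a minimal pure-Python sketch that emulates `GROUP BY ROLLUP(region, product)` with `SUM(amount)`. The sample rows, column names, and the helper `rollup_with_grouping` are all hypothetical and are not part of DataFusion; the sketch only shows the semantics, not how any engine implements them.

```python
from collections import defaultdict

# Hypothetical sample rows: (region, product, amount).
# The second row has a genuine NULL (None) in the product column.
ROWS = [
    ("east", "apple", 10),
    ("east", None, 5),
    ("west", "apple", 7),
]


def rollup_with_grouping(rows):
    """Emulate GROUP BY ROLLUP(region, product) with SUM(amount) and GROUPING flags."""
    # ROLLUP(region, product) expands to three grouping sets.
    # True means the column is part of the grouping set.
    grouping_sets = [(True, True), (True, False), (False, False)]
    out = []
    for keep_region, keep_product in grouping_sets:
        totals = defaultdict(int)
        for region, product, amount in rows:
            # Columns outside the grouping set are replaced by NULL.
            key = (
                region if keep_region else None,
                product if keep_product else None,
            )
            totals[key] += amount
        for (region, product), total in totals.items():
            out.append({
                "region": region,
                "product": product,
                "sum": total,
                # GROUPING(col) is 1 when col was aggregated away, else 0.
                "grouping_region": 0 if keep_region else 1,
                "grouping_product": 0 if keep_product else 1,
            })
    return out


RESULT = rollup_with_grouping(ROWS)
```

In this sketch two output rows have `region = "east"` and a NULL product: one comes from the data and one is the per-region subtotal. Only the `grouping_product` flag tells them apart.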

Describe the solution you'd like

Describe alternatives you've considered

Additional context

mingmwang avatar Mar 20 '23 08:03 mingmwang

I'm working on it now.

mingmwang avatar Mar 21 '23 15:03 mingmwang

Where is the documentation for these features?

l1t1 avatar Apr 26 '24 03:04 l1t1

take

bgjackma avatar Sep 18 '24 01:09 bgjackma

I don't think this works as an aggregate function in the physical plan. It depends on the grouping rather than the data, so the current abstraction doesn't have access to the necessary information.

I think this case (along with GROUPING_ID, any others?) will need special handling in GroupedHashAggregateStream.

bgjackma avatar Sep 18 '24 01:09 bgjackma
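
The point in the last comment, that the result depends on the grouping rather than the data, can be sketched as follows. This is an illustrative Python sketch, not DataFusion code: the helper `grouping_id` and the column names are hypothetical, and the bit order follows the PostgreSQL convention (rightmost argument is the least significant bit). Note that the function never looks at any input row, only at which columns the current grouping set contains.

```python
def grouping_id(args, grouped_columns):
    """Return a bitmask over args.

    A bit is 1 when the argument is NOT in the current grouping set.
    The leftmost argument is the most significant bit.
    """
    result = 0
    for col in args:
        bit = 0 if col in grouped_columns else 1
        result = (result << 1) | bit
    return result


# Grouping sets produced by ROLLUP(region, product).
ROLLUP_SETS = [{"region", "product"}, {"region"}, set()]

# One value per grouping set, computed without touching any row data.
IDS = [grouping_id(["region", "product"], s) for s in ROLLUP_SETS]
```

Because the value is fixed per grouping set, an accumulator that only receives column values per row has nothing to compute it from, which is consistent with the suggestion that it needs handling where the grouping sets are known.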