feat: support `grouping` aggregate function

Open JasonLi-cn opened this issue 1 year ago • 2 comments

Which issue does this PR close?

Closes #.

Rationale for this change

Currently, DataFusion does not implement the `grouping` function. See https://github.com/apache/datafusion/blob/4edbdd7d09d97f361748c086afbd7b3dda972f76/datafusion/physical-expr/src/aggregate/grouping.rs#L80-L84

What changes are included in this PR?

Complete the implementation of the `grouping` function. Reference documentation from other systems:

https://www.postgresql.org/docs/9.5/functions-aggregate.html https://learn.microsoft.com/en-us/sql/t-sql/functions/grouping-transact-sql?view=sql-server-ver15
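For readers unfamiliar with the semantics described in the PostgreSQL documentation linked above: `GROUPING(args...)` returns an integer bit mask in which the rightmost argument maps to the least-significant bit, and a bit is 0 when that expression is part of the grouping set that produced the row and 1 when it is not. The following is a minimal standalone sketch of that rule. It is not the code in this PR; the function name `grouping_mask` and the use of plain strings as expressions are assumptions made for illustration.

```rust
/// Hypothetical helper (not DataFusion API): computes the GROUPING bit mask
/// for one grouping set, following the rule in the PostgreSQL documentation.
///
/// `args`   - expressions passed to GROUPING(...), left to right
/// `active` - expressions that are part of the current grouping set
fn grouping_mask(args: &[&str], active: &[&str]) -> u32 {
    args.iter().fold(0u32, |mask, arg| {
        // Bit is 0 if the expression is grouped on in this set, 1 if it is not.
        let bit = if active.contains(arg) { 0 } else { 1 };
        // Shifting left first makes the rightmost argument the least-significant bit.
        (mask << 1) | bit
    })
}
```

Under these assumptions, with `GROUP BY ROLLUP (a, b)` the grouping sets are `(a, b)`, `(a)` and `()`, so `GROUPING(a, b)` would evaluate to 0, 1 and 3 respectively.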

Are these changes tested?

Yes

Are there any user-facing changes?

Yes. The documentation may need to describe the `grouping` function: https://arrow.apache.org/datafusion/user-guide/sql/aggregate_functions.html

JasonLi-cn, Apr 24 '24 07:04

Related work: https://github.com/apache/datafusion/issues/2477 https://github.com/apache/datafusion/pull/2486

JasonLi-cn, Apr 24 '24 07:04

Thanks @JasonLi-cn for this 👍

I noticed this in the PostgreSQL documentation:

> The arguments to the GROUPING operation are not actually evaluated, but they must match exactly expressions given in the GROUP BY clause of the associated query level.

Do we need some verification to make sure the arguments of GROUPING match the GROUP BY expressions? Also, I see the implementation of `GroupingGroupsAccumulator` assumes the input expr is a column, but GROUP BY has no such constraint.
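The check suggested here could look roughly like the sketch below. It uses a toy `Expr` type rather than DataFusion's real expression types, and the names `Expr`, `col` and `validate_grouping_args` are hypothetical. The point is only that arguments are matched structurally against the GROUP BY expressions, so non-column expressions work too, and that they are resolved to positions instead of being evaluated.

```rust
/// Toy expression type for illustration only (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical convenience constructor for a column reference.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

/// Hypothetical validation: every GROUPING argument must be structurally
/// equal to one of the GROUP BY expressions. Returns, for each argument,
/// the index of the matching GROUP BY expression, or an error naming the
/// first argument that has no match.
fn validate_grouping_args(args: &[Expr], group_by: &[Expr]) -> Result<Vec<usize>, String> {
    args.iter()
        .map(|arg| {
            group_by
                .iter()
                .position(|g| g == arg)
                .ok_or_else(|| {
                    format!("GROUPING argument {:?} does not appear in GROUP BY", arg)
                })
        })
        .collect()
}
```

With `GROUP BY a, a + b`, this sketch would accept `GROUPING(a + b)` and resolve it to index 1, while `GROUPING(b)` would be rejected because `b` alone is not a GROUP BY expression.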

Thanks @waynexia for your suggestion. I agree with you and I will make improvements according to your suggestions.

JasonLi-cn, Apr 28 '24 11:04

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

github-actions[bot], Jul 21 '24 01:07