Support for Selective Aggregates, Filter clause
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
PostgreSQL supports the SQL FILTER clause, which restricts the rows fed to an aggregate to those matching a given condition before the aggregation is performed. DataFusion currently provides no mechanism for parsing this clause. See Filter Clause for more in-depth details on the clause's behavior.
Describe the solution you'd like
The datafusion::logical_plan::plan::Aggregate struct should include a new member, e.g. pub filter_expr: Vec<Expr>, containing the filtering expressions that the consuming engine can apply before performing the aggregations defined in pub aggr_expr: Vec<Expr>.
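A rough sketch of the proposed shape, for illustration only. The `Expr` enum and the `Aggregate` struct below are simplified stand-ins, not DataFusion's real types, and `example_plan` is a hypothetical helper. Only the field names `aggr_expr` and `filter_expr` come from the proposal above.

```rust
/// Placeholder expression type (assumption): DataFusion's real `Expr` is far richer.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(i64),
    Gt(Box<Expr>, Box<Expr>),
    Sum(Box<Expr>),
}

/// Simplified stand-in for the logical `Aggregate` node (assumption):
/// the real struct carries more fields than shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct Aggregate {
    pub group_expr: Vec<Expr>,
    pub aggr_expr: Vec<Expr>,
    /// Proposed new member: predicates to apply before aggregating.
    pub filter_expr: Vec<Expr>,
}

/// Hypothetical helper: builds a node for
/// `SUM(amount) FILTER (WHERE amount > 10)`.
pub fn example_plan() -> Aggregate {
    let amount = Expr::Column("amount".to_string());
    Aggregate {
        group_expr: vec![],
        aggr_expr: vec![Expr::Sum(Box::new(amount.clone()))],
        filter_expr: vec![Expr::Gt(Box::new(amount), Box::new(Expr::Literal(10)))],
    }
}
```

How entries in `filter_expr` would be matched to entries in `aggr_expr` is left open here, as the proposal does not specify it.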
Describe alternatives you've considered
None
Additional context
Description of the syntax and functionality can be found here
Note that for most aggregation functions this could be done purely at the logical plan level by rewriting AGGREGATE(input) FILTER (WHERE condition) to AGGREGATE(IF(condition, input, NULL)). This works because most aggregations ignore NULL values. One exception I can think of is ARRAY_AGG, which I think keeps NULL values.
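To make the rewrite concrete, here is a small self-contained model of it. It does not use DataFusion: NULL is modelled as `None`, and all function names are made up for illustration. It assumes a SUM that skips NULLs and an ARRAY_AGG that keeps them.

```rust
/// One input row: the aggregated value (possibly NULL) and the result of
/// evaluating the FILTER condition for that row.
type Row = (Option<i64>, bool);

/// `AGG(input) FILTER (WHERE condition)`: drop non-matching rows first.
fn apply_filter(rows: &[Row]) -> Vec<Option<i64>> {
    rows.iter().filter(|(_, keep)| *keep).map(|(v, _)| *v).collect()
}

/// `AGG(IF(condition, input, NULL))`: keep every row, but replace the
/// values of non-matching rows with NULL.
fn rewrite_as_if(rows: &[Row]) -> Vec<Option<i64>> {
    rows.iter().map(|(v, keep)| if *keep { *v } else { None }).collect()
}

/// SUM-like aggregate: skips NULLs, returns NULL when no non-NULL value exists.
fn sum_ignoring_nulls(values: &[Option<i64>]) -> Option<i64> {
    let mut acc: Option<i64> = None;
    for v in values.iter().flatten() {
        acc = Some(acc.unwrap_or(0) + v);
    }
    acc
}

/// ARRAY_AGG-like aggregate (assumed to keep NULLs): returns every input value.
fn array_agg(values: &[Option<i64>]) -> Vec<Option<i64>> {
    values.to_vec()
}
```

With rows `(1, true)`, `(2, false)`, `(NULL, true)`, `(4, true)`, both forms give the same SUM, since the extra NULL introduced by the rewrite is skipped. The ARRAY_AGG results differ: the filtered form yields three elements, while the rewritten form yields four because the NULL standing in for the rejected row is retained.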
I would love to pick this up and work on it.
There is a related PR that adds support in the SQL query planner and logical plan, but it does not add physical plan support: https://github.com/apache/arrow-datafusion/pull/3405
Excited! I've implemented PhysicalExpr with filter support. I'll raise a PR with the relevant changes after the mentioned PR gets merged.