Add support for SUM(DISTINCT colA) aka DISTINCTSUM
Currently, Pinot supports DISTINCT only with the COUNT aggregation function. This is available in two forms:
- DISTINCTCOUNT
- COUNT(DISTINCT colA)
https://github.com/apache/pinot/blob/e813867985746e916c8e898a530002551b661496/pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java#L788-L789
The ask in this issue is to make Pinot also support the SUM aggregation function on DISTINCT. As with COUNT, this can be supported in two forms:
- SUM(DISTINCT colA) ---> SQL friendly way
- DISTINCTSUM
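To make the requested semantics concrete, here is a minimal Python sketch. It is not Pinot's implementation: `distinct_sum` and `rewrite_distinct` are hypothetical names used only for illustration, the regex rewrite is just a stand-in for the parser-level rewrite linked above, and null handling is ignored.

```python
import re


def distinct_sum(values):
    """Add each distinct value exactly once, as SUM(DISTINCT col) would."""
    return sum(set(values))


# Hypothetical mapping from the SQL-style aggregate to the function-style name.
_REWRITES = {"COUNT": "DISTINCTCOUNT", "SUM": "DISTINCTSUM"}

# Matches COUNT(DISTINCT col) or SUM(DISTINCT col) with a single plain column name.
_PATTERN = re.compile(
    r"\b(COUNT|SUM)\s*\(\s*DISTINCT\s+([A-Za-z_][A-Za-z0-9_]*)\s*\)",
    re.IGNORECASE,
)


def rewrite_distinct(sql):
    """Rewrite AGG(DISTINCT col) into DISTINCTAGG(col); illustration only."""
    def repl(match):
        return f"{_REWRITES[match.group(1).upper()]}({match.group(2)})"

    return _PATTERN.sub(repl, sql)


if __name__ == "__main__":
    col_a = [1, 2, 2, 3, 3, 3]
    print(sum(col_a), distinct_sum(col_a))
    print(rewrite_distinct("SELECT SUM(DISTINCT colA) FROM myTable"))
```

For the sample column, a plain SUM counts the repeated 2 and 3 several times, while the distinct sum adds 1, 2 and 3 once each. The two query forms are meant to be equivalent, so the SQL-friendly syntax can be treated as sugar for the function-style name.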
Similar support is offered by MySQL - https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_sum
cc @walterddr @Jackie-Jiang
This is not something Postgres supports (and Postgres follows the SQL standard much more closely). It supports only:
COUNT(DISTINCT( <col> [, <col>]* ))
However, given that we already support DISTINCTCOUNT, it seems OK to me to support DISTINCTSUM as dialect sugar.
@vvivekiyer is working on this.
Support for DISTINCTSUM and DISTINCTAVG has been merged for single-value (SV) columns. Support for multi-value (MV) columns will be done in a follow-up PR.
OSS issue for extending this support to MV columns: https://github.com/apache/pinot/issues/10109