[VL] Add support of percentile_approx / approx_percentile
Description
Part of https://github.com/apache/incubator-gluten/issues/4039
Velox(presto)'s approx_percentile has different intermediate types with Spark's function with same name.
Velox's signature code of approx_percentile :
void addSignatures(
const std::string& inputType,
const std::string& percentileType,
const std::string& returnType,
std::vector<std::shared_ptr<exec::AggregateFunctionSignature>>&
signatures) {
auto intermediateType = fmt::format(
"row(array(double), boolean, double, integer, bigint, {0}, {0}, array({0}), array(integer))",
inputType);
signatures.push_back(exec::AggregateFunctionSignatureBuilder()
.returnType(returnType)
.intermediateType(intermediateType)
.argumentType(inputType)
.argumentType(percentileType)
.build());
signatures.push_back(exec::AggregateFunctionSignatureBuilder()
.returnType(returnType)
.intermediateType(intermediateType)
.argumentType(inputType)
.argumentType("bigint")
.argumentType(percentileType)
.build());
signatures.push_back(exec::AggregateFunctionSignatureBuilder()
.returnType(returnType)
.intermediateType(intermediateType)
.argumentType(inputType)
.argumentType(percentileType)
.argumentType("double")
.build());
signatures.push_back(exec::AggregateFunctionSignatureBuilder()
.returnType(returnType)
.intermediateType(intermediateType)
.argumentType(inputType)
.argumentType("bigint")
.argumentType(percentileType)
.argumentType("double")
.build());
}
Link https://github.com/facebookincubator/velox/blob/main/velox/functions/prestosql/aggregates/ApproxPercentileAggregate.cpp#L790C1-L828C1
Spark's code of approx_percentile / percentile_approx:
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
cc @liujiayi771 @ulysses-you I took the function in https://github.com/apache/incubator-gluten/issues/4039 but may delay the implementation. I see you have been actively working on aggregation functions in past a few of days so (you or anyone) please feel free to take this issue if interested.
It looks like the new utility introduced by #4764 could help a lot here but may still require for some changes though. I am not sure and you may have more context in there than me.
Thank you @zhztheplayer for driving this, I think @liujiayi771 is the better one to take it :)
Thank you @zhztheplayer for driving this, I think @liujiayi771 is the better one to take it :)
@zhztheplayer I can come to add these aggregate functions, thanks.
Thank you @zhztheplayer for driving this, I think @liujiayi771 is the better one to take it :)
@zhztheplayer I can come to add these aggregate functions, thanks.
@liujiayi771 Glad to hear you'll help. I'll reassign the task to you. Thanks!
I have a draft version before, and I can take over this if you have not start yet @liujiayi771 @zhztheplayer
@WangGuangxin I have not started yet, feel free to take it.