incubator-gluten icon indicating copy to clipboard operation
incubator-gluten copied to clipboard

[VL] Add support of percentile_approx / approx_percentile

Open zhztheplayer opened this issue 1 year ago • 6 comments

Description

Part of https://github.com/apache/incubator-gluten/issues/4039

Velox(presto)'s approx_percentile has different intermediate types with Spark's function with same name.

Velox's signature code of approx_percentile :

void addSignatures(
    const std::string& inputType,
    const std::string& percentileType,
    const std::string& returnType,
    std::vector<std::shared_ptr<exec::AggregateFunctionSignature>>&
        signatures) {
  auto intermediateType = fmt::format(
      "row(array(double), boolean, double, integer, bigint, {0}, {0}, array({0}), array(integer))",
      inputType);
  signatures.push_back(exec::AggregateFunctionSignatureBuilder()
                           .returnType(returnType)
                           .intermediateType(intermediateType)
                           .argumentType(inputType)
                           .argumentType(percentileType)
                           .build());
  signatures.push_back(exec::AggregateFunctionSignatureBuilder()
                           .returnType(returnType)
                           .intermediateType(intermediateType)
                           .argumentType(inputType)
                           .argumentType("bigint")
                           .argumentType(percentileType)
                           .build());
  signatures.push_back(exec::AggregateFunctionSignatureBuilder()
                           .returnType(returnType)
                           .intermediateType(intermediateType)
                           .argumentType(inputType)
                           .argumentType(percentileType)
                           .argumentType("double")
                           .build());
  signatures.push_back(exec::AggregateFunctionSignatureBuilder()
                           .returnType(returnType)
                           .intermediateType(intermediateType)
                           .argumentType(inputType)
                           .argumentType("bigint")
                           .argumentType(percentileType)
                           .argumentType("double")
                           .build());
}

Link https://github.com/facebookincubator/velox/blob/main/velox/functions/prestosql/aggregates/ApproxPercentileAggregate.cpp#L790C1-L828C1

Spark's code of approx_percentile / percentile_approx: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala

zhztheplayer avatar Mar 08 '24 00:03 zhztheplayer

cc @liujiayi771 @ulysses-you I took the function in https://github.com/apache/incubator-gluten/issues/4039 but may delay the implementation. I see you have been actively working on aggregation functions in past a few of days so (you or anyone) please feel free to take this issue if interested.

It looks like the new utility introduced by #4764 could help a lot here but may still require for some changes though. I am not sure and you may have more context in there than me.

zhztheplayer avatar Mar 08 '24 00:03 zhztheplayer

Thank you @zhztheplayer for driving this, I think @liujiayi771 is the better one to take it :)

ulysses-you avatar Mar 08 '24 01:03 ulysses-you

Thank you @zhztheplayer for driving this, I think @liujiayi771 is the better one to take it :)

@zhztheplayer I can come to add these aggregate functions, thanks.

liujiayi771 avatar Mar 08 '24 01:03 liujiayi771

Thank you @zhztheplayer for driving this, I think @liujiayi771 is the better one to take it :)

@zhztheplayer I can come to add these aggregate functions, thanks.

@liujiayi771 Glad to hear you'll help. I'll reassign the task to you. Thanks!

zhztheplayer avatar Mar 08 '24 01:03 zhztheplayer

I have a draft version before, and I can take over this if you have not start yet @liujiayi771 @zhztheplayer

WangGuangxin avatar Mar 13 '24 07:03 WangGuangxin

@WangGuangxin I have not started yet, feel free to take it.

liujiayi771 avatar Mar 13 '24 07:03 liujiayi771