[GLUTEN-4889][VL] Support approx_percentile
What changes were proposed in this pull request?
Support approx_percentile for VL backend
(Fixes: #4889)
How was this patch tested?
UT
https://github.com/apache/incubator-gluten/issues/4889
Run Gluten Clickhouse CI
CC: @zhztheplayer
I believe the change on Gluten side is ready now. But the ApproxPercentileAggregate implements in velox has to make some minor modification, becasue when add intermediate data, it checks the row vector encoding in addIntermediateImpl
https://github.com/facebookincubator/velox/blob/main/velox/functions/prestosql/aggregates/ApproxPercentileAggregate.cpp#L652 ,
I believe the encoding is guaranteed by extractAccumulators https://github.com/facebookincubator/velox/blob/main/velox/functions/prestosql/aggregates/ApproxPercentileAggregate.cpp#L262.
But when it comes in Gluten, the output of PartialAgg are processed by Shuffle, which will flatten all vectors, so that the encoding is not guaranteed when added to FinalAgg. https://github.com/apache/incubator-gluten/blob/main/cpp/velox/shuffle/VeloxShuffleWriter.cc#L343
cc @zhztheplayer @liujiayi771 @ulysses-you
Thank you in advance!
I believe the change on Gluten side is ready now.
Did that mean this PR can be merged prior to merging Velox changes?
@WangGuangxin Perhaps we could create an issue in the upstream Velox community to remove this verification.
Hi @WangGuangxin so the current plan is to merge
VELOX_REPO=https://github.com/wangguangxin/velox.git VELOX_BRANCH=2024_03_15_approx_percentile
to upstream Velox before this ? Am I understanding correctly?
this
yes, we should merge to upsteam first before merge this PR
@WangGuangxin Any progress of this PR?