Add support for bloom_filter_agg
What is the problem the feature request solves?
Some TPC-H queries use bloom_filter_agg, and Comet does not have a native implementation yet.
A workaround is to set spark.sql.optimizer.runtime.bloomFilter.enabled=false.
Describe the potential solution
No response
Additional context
No response
I can take this up if more details are provided :)
We need to implement an equivalent of Spark's org.apache.spark.sql.catalyst.expressions.aggregate.BloomFilterAggregate.
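To make the shape of the work concrete, here is a rough sketch of the three pieces such an aggregate needs: a per-row update, a merge of partial states, and a membership probe. All names here (`BloomFilterState`, `hash_pair`, and so on) are hypothetical and are not Comet or Spark APIs. The hashing uses Rust's `DefaultHasher` purely for illustration; a Spark-compatible implementation would have to match Spark's own hashing and serialized format, which this sketch does not attempt.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical partial-aggregation state: a fixed-size bit set plus a hash count.
#[derive(Clone, Debug, PartialEq)]
pub struct BloomFilterState {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u32,
}

impl BloomFilterState {
    pub fn new(num_bits: u64, num_hashes: u32) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        let words = ((num_bits + 63) / 64) as usize;
        Self { bits: vec![0; words], num_bits, num_hashes }
    }

    /// Two illustrative hashes of the input (NOT Spark's hashing scheme).
    fn hash_pair(value: i64) -> (u64, u64) {
        let mut first = DefaultHasher::new();
        value.hash(&mut first);
        let mut second = DefaultHasher::new();
        (value, 0x9E37_79B9_7F4A_7C15u64).hash(&mut second);
        // Force the second hash to be odd so the probe step is never zero.
        (first.finish(), second.finish() | 1)
    }

    /// Bit position of the i-th probe, using double hashing.
    fn bit_index(&self, h1: u64, h2: u64, i: u64) -> u64 {
        h1.wrapping_add(i.wrapping_mul(h2)) % self.num_bits
    }

    /// Aggregate "update" step: fold one input value into the state.
    pub fn update(&mut self, value: i64) {
        let (h1, h2) = Self::hash_pair(value);
        for i in 0..self.num_hashes as u64 {
            let bit = self.bit_index(h1, h2, i);
            self.bits[(bit / 64) as usize] |= 1u64 << (bit % 64);
        }
    }

    /// Aggregate "merge" step: combine two partial states with a bitwise OR.
    pub fn merge(&mut self, other: &Self) {
        assert_eq!(
            (self.num_bits, self.num_hashes),
            (other.num_bits, other.num_hashes),
            "partial states must share the same parameters"
        );
        for (mine, theirs) in self.bits.iter_mut().zip(&other.bits) {
            *mine |= *theirs;
        }
    }

    /// Probe: false means definitely absent, true means possibly present.
    pub fn might_contain(&self, value: i64) -> bool {
        let (h1, h2) = Self::hash_pair(value);
        (0..self.num_hashes as u64).all(|i| {
            let bit = self.bit_index(h1, h2, i);
            self.bits[(bit / 64) as usize] & (1u64 << (bit % 64)) != 0
        })
    }
}
```

Because merging is a plain OR over the bit words, merging partial states gives the same bits as feeding every row into a single state, which is what lets the aggregate run as a partial and a final stage.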
ok, taking this up.
Accumulating some notes for this. Here's the Spark design doc on the feature: https://docs.google.com/document/d/16IEuyLeQlubQkH8YuVuXWKo2-grVIoDJqQpHZrE7q04/
It looks like we already have the filter support thanks to https://github.com/apache/datafusion-comet/pull/179.
In Spark, BloomFilterAggregate is only supported by ObjectHashAggregate or SortAggregate, but Comet only supports HashAggregate so far.