Chore: refactor bit_count
Which issue does this PR close?
Part of: https://github.com/apache/datafusion-comet/issues/2443
Rationale for this change
Part of: https://github.com/apache/datafusion-comet/issues/2443
What changes are included in this PR?
How are these changes tested?
Tested with existing unit tests
The DataFusion implementation produces results that are not compatible with Spark
Project [bit_count(col2#2190) AS bit_count(col2)#2218] +- Relation spark_catalog.default.bitwise_count_test[col1#2189L,col2#2190,col3#2191,col4#2192] parquet
== Physical Plan == *(1) CometColumnarToRow +- CometProject [bit_count(col2)#2218], [bit_count(col2#2190) AS bit_count(col2)#2218] +- CometScan [native_iceberg_compat] parquet spark_catalog.default.bitwise_count_test[col2#2190] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/Users/tendoo/Desktop/datafusion-comet/spark/spark-warehouse/bitw..., PartitionFilters: [], PushedFilters: [], ReadSchema: structcol2:int
== Results ==
== Results == !== Correct Answer - 3 == == Spark Answer - 3 == struct<bit_count(col2):int> struct<bit_count(col2):int> ![31] [1] ![33] [31] [6] [6] (QueryTest.scala:244)
@andygrove @comphead
Triggering CI
there is still a conflict 🤔
Thanks @comphead, conflicts resolved.
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 45.97%. Comparing base (f09f8af) to head (f03f59d).
:warning: Report is 624 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #2553 +/- ##
=============================================
- Coverage 56.12% 45.97% -10.16%
- Complexity 976 1200 +224
=============================================
Files 119 146 +27
Lines 11743 13746 +2003
Branches 2251 2354 +103
=============================================
- Hits 6591 6320 -271
- Misses 4012 6385 +2373
+ Partials 1140 1041 -99
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
*** 3 TESTS FAILED ***
*** 3 TESTS FAILED ***
I think the DataFusion implementation needs modification to be fully compatible with Spark.
After closing the task (https://github.com/apache/datafusion/pull/18322 ), the tests must pass. FYI @comphead