datafusion-comet icon indicating copy to clipboard operation
datafusion-comet copied to clipboard

Feat: support bit_count function

Open kazantsev-maksim opened this issue 8 months ago • 3 comments

Which issue does this PR close?

Related to Epic: https://github.com/apache/datafusion-comet/issues/240 bit_count: SELECT bit_count(0) => 0 DataFusionComet bit_count has same behavior with Spark 's bit_count function Spark: https://spark.apache.org/docs/latest/api/sql/index.html#bit_count

Closes #.

Rationale for this change

Defined under Epic: https://github.com/apache/datafusion-comet/issues/240

What changes are included in this PR?

bitwise_count.rs: impl for bit_count function planner.rs: Maps Spark 's bit_count function to DataFusionComet bit_count physical expression from Spark physical expression expr.proto: bit_count has been added, QueryPlanSerde.scala: bit_count pattern matching case has been added, CometExpressionSuite.scala: A new UT has been added for bit_count function.

How are these changes tested?

A new UT has been added.

kazantsev-maksim avatar Apr 02 '25 18:04 kazantsev-maksim

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 59.47%. Comparing base (f09f8af) to head (b73149d). Report is 218 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1602      +/-   ##
============================================
+ Coverage     56.12%   59.47%   +3.34%     
- Complexity      976     1147     +171     
============================================
  Files           119      128       +9     
  Lines         11743    12532     +789     
  Branches       2251     2356     +105     
============================================
+ Hits           6591     7453     +862     
+ Misses         4012     3889     -123     
- Partials       1140     1190      +50     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov-commenter avatar Apr 02 '25 19:04 codecov-commenter

DataFusion's bit_count has same behavior with Spark 's bit_count function Spark

If this is the case, can we delegate to a ScalarFunc expression instead of creating a new one? Similar to: https://github.com/apache/datafusion-comet/pull/1700/files#diff-602f1658f4020a9dc8a47f49ac1411735d56d6612aa971751435104e301a9035

mbutrovich avatar Apr 30 '25 22:04 mbutrovich

@mbutrovich I couldn't find a built-in implementation of bit_count in the DataFusion project, but i rewrote it using scalarFunc without adding a new proto expr.

kazantsev-maksim avatar May 01 '25 14:05 kazantsev-maksim

Thank you for the feedback! @kazuyukitanimura @parthchandra

kazantsev-maksim avatar May 30 '25 17:05 kazantsev-maksim

Merged Thank you @kazantsev-maksim @parthchandra @mbutrovich

kazuyukitanimura avatar May 30 '25 19:05 kazuyukitanimura