
[datafusion-spark] Test integrating datafusion-spark code into comet

Open alamb opened this issue 7 months ago • 1 comments

What is the problem the feature request solves?

  • Part of https://github.com/apache/datafusion/issues/15914

@shehabgamin added the datafusion-spark crate in https://github.com/apache/datafusion/pull/15168

The goal is to centralize development of this function library in the core repository rather than duplicating the effort in each downstream project.

To confirm this is feasible, I would like to check that the setup created in datafusion-spark / https://github.com/apache/datafusion/pull/15168 can be used by downstream crates like Comet before we go too far.

Describe the potential solution

I would like someone to make a PR showing that one or more of the Spark functions in datafusion-spark can be used in Comet, so that the corresponding implementation can be removed from Comet (reducing the code here).

Additional context

cc @comphead @andygrove @mbutrovich

alamb avatar May 02 '25 00:05 alamb

I created https://github.com/apache/datafusion/pull/15947 to add Comet's hex function to datafusion-spark. This involved converting it from a PhysicalExpr to a ScalarUDFImpl.

Also, I created https://github.com/apache/datafusion-comet/pull/1711 to use the expm1 function from datafusion-spark in Comet.
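For readers unfamiliar with the function being moved, here is a minimal, standard-library-only sketch of the kind of per-value logic a Spark-compatible hex function carries. It is not the code from either PR: the names `hex_bytes`, `hex_int64`, and `hex_int64_column` are hypothetical, the semantics (uppercase digits, two's-complement pattern for integers, nulls passed through) are my understanding of Spark's behavior and should be checked against Spark itself, and the ScalarUDFImpl wiring is deliberately left out because it depends on the DataFusion version in use.

```rust
/// Hex-encode a byte slice as uppercase ASCII, two digits per byte.
/// (Hypothetical helper; assumed to match Spark's hex on string/binary input.)
fn hex_bytes(bytes: &[u8]) -> String {
    const DIGITS: &[u8; 16] = b"0123456789ABCDEF";
    let mut out = String::with_capacity(bytes.len() * 2);
    for &b in bytes {
        out.push(DIGITS[(b >> 4) as usize] as char); // high nibble
        out.push(DIGITS[(b & 0x0f) as usize] as char); // low nibble
    }
    out
}

/// Hex-encode a 64-bit integer without leading zeros.
/// Rust's `{:X}` prints the two's-complement bit pattern for negative values,
/// which is the behavior assumed here for Spark's hex on a long.
fn hex_int64(v: i64) -> String {
    format!("{:X}", v)
}

/// Apply `hex_int64` to a nullable column, passing nulls through unchanged.
/// `Option` stands in for Arrow's validity bitmap in this sketch.
fn hex_int64_column(values: &[Option<i64>]) -> Vec<Option<String>> {
    values.iter().map(|&v| v.map(hex_int64)).collect()
}
```

In a real port, a function like this would sit behind the UDF's invoke method and operate on Arrow arrays rather than slices of `Option`; keeping the per-value logic separate from that wiring is one way to make the same code usable from both projects.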

andygrove avatar May 05 '25 16:05 andygrove

https://github.com/apache/datafusion-comet/pull/1711 is now merged, so Comet is now using the datafusion-spark crate

andygrove avatar May 22 '25 18:05 andygrove

I will keep this issue open until we actually remove an expression from Comet and use the version from datafusion-spark instead.

andygrove avatar May 22 '25 18:05 andygrove

#1711 is now merged, so Comet is now using the datafusion-spark crate

EPIC! FYI @shehabgamin

alamb avatar May 22 '25 18:05 alamb