feat: add hex scalar function

Open tshauck opened this issue 1 year ago • 1 comments

Which issue does this PR close?

Related to https://github.com/apache/datafusion-comet/issues/341.

Rationale for this change

I recently added unhex so this PR adds hex. It's another scalar function that isn't yet implemented in Comet.

I decided to write a native implementation here because DataFusion has a to_hex, but it behaves differently. E.g. to_hex(-1) returns ffffffffffffffff in DataFusion, whereas hex(-1) returns FFFFFFFFFFFFFFFF in Spark. Spark's hex also supports strings, but DataFusion and Postgres do not. I'm happy to start the discussion if it seems this could be merged upstream.
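To make the difference concrete, here is a minimal std-only Rust sketch of the behaviour described above. It is not the code in this PR: the function names (hex_int64, to_hex_lower, hex_bytes, hex_str) are hypothetical, and it works on plain values rather than Arrow arrays. It assumes only what is stated above, namely uppercase output for integers, plus byte-wise two-digit encoding for strings.

```rust
use std::fmt::Write;

/// Spark-style hex of a 64-bit integer (hypothetical helper):
/// uppercase digits of the two's-complement bit pattern, no leading zeros.
fn hex_int64(n: i64) -> String {
    format!("{:X}", n)
}

/// Lowercase variant, matching the to_hex(-1) output quoted above.
fn to_hex_lower(n: i64) -> String {
    format!("{:x}", n)
}

/// Hex-encode raw bytes: two uppercase digits per byte.
fn hex_bytes(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        // Writing into a String never returns an error.
        write!(out, "{:02X}", b).expect("writing to a String cannot fail");
    }
    out
}

/// Hex-encode a string by encoding its UTF-8 bytes.
fn hex_str(s: &str) -> String {
    hex_bytes(s.as_bytes())
}
```

With these helpers, hex_int64(-1) gives FFFFFFFFFFFFFFFF while to_hex_lower(-1) gives ffffffffffffffff, and hex_str("Spark SQL") gives 537061726B2053514C.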

What changes are included in this PR?

I added the hex scalar function.

How are these changes tested?

I've added tests on the Rust side as well as Spark SQL-based tests in Scala.

tshauck avatar May 18 '24 23:05 tshauck

@kazuyukitanimura @advancedxy ... re-requesting reviews from you two, please. I've updated the code to support dictionaries and removed some of the finer int types. Barring something happening on Intel Mac, the tests appear to pass here: https://github.com/tshauck/arrow-datafusion-comet/actions/runs/9183637020.... Thanks

tshauck avatar May 22 '24 04:05 tshauck

I think I've made the updates requested in the latest round. I left the dictionary handling the same, but I can look into flattening the dictionary specifically for hex if you guys think it's a good idea. Thanks

tshauck avatar May 24 '24 17:05 tshauck