
New functions in DataFusion 50 to migrate to Comet

Open davidlghellin opened this issue 3 months ago • 2 comments

What is the problem the feature request solves?

I'm trying to figure out which functions we need to add for the DataFusion 50 upgrade. I'd still need to review everything, but this could be a good starting point for the idea I had.

In the DataFusion project:

git checkout 49.0.2
❯ cd datafusion/spark
for f in **/mod.rs(.N); do
  # intended to skip modules whose functions() vec is empty
  if rg -nUP 'pub fn functions\(\)[\s\S]*?vec!\[\s*\S' "$f" >/dev/null; then
    echo "---- $f ----"
    rg -nUP 'pub fn functions\(\)[\s\S]*?vec!\[([\s\S]*?\S[\s\S]*?)\]' -or '$1' "$f"
    echo
  fi
done > datafusion-49
❯ git checkout 50.0.0
M	datafusion-testing
Previous HEAD position was f43df3f2a [branch-49] Prepare `49.0.2` version and changelog (#17277)
HEAD is now at 10343c182 Revert #17295 (Support from-first SQL syntax) (#17520) (#17544)
❯ cd ../..
❯ cd datafusion/spark
for f in **/mod.rs(.N); do
  # intended to skip modules whose functions() vec is empty
  if rg -nUP 'pub fn functions\(\)[\s\S]*?vec!\[\s*\S' "$f" >/dev/null; then
    echo "---- $f ----"
    rg -nUP 'pub fn functions\(\)[\s\S]*?vec!\[([\s\S]*?\S[\s\S]*?)\]' -or '$1' "$f"
    echo
  fi
done > datafusion-50
❯ diff datafusion-49 datafusion-50
3a4
> 32:array()
4a6,8
> ---- src/function/bitmap/mod.rs ----
> 36:bitmap_count()
> 
5a10
> 39:bit_get(), bit_count()
9a15
> 32:r#if()
15a22
> 59:date_add(), date_sub(), last_day(), next_day()
20c27
< 31:sha2()
---
> 39:crc32(), sha1(), sha2()
29c36,43
< 42:expm1(), factorial(), hex()
---
> 54:        expm1(),
> 55:        factorial(),
> 56:        hex(),
> 57:        modulus(),
> 58:        pmod(),
> 59:        rint(),
> 60:        width_bucket(),
> 61:    
36c50
< 43:ascii(), char()
---
> 64:ascii(), char(), ilike(), like(), luhn_check()
42a57
> 32:parse_url()
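The raw diff above mixes line numbers, module headers, and reformatted `vec!` bodies, so a plain list of names can be easier to turn into a checklist. The following is a minimal sketch, not part of the original workflow: `new_functions` is a hypothetical helper name, and it assumes both dump files were produced by the loops above, so that every function appears as a `name()` token.

```shell
# Hypothetical helper: print function names that appear in the second dump
# but not in the first. Assumes names are written as `name()` in both files.
new_functions() {
  old_file=$1
  new_file=$2
  old_names=$(mktemp)
  new_names=$(mktemp)

  # Extract every `name()` token, then drop the trailing `()` and any
  # `r#` raw-identifier prefix (as in `r#if()`), and de-duplicate.
  grep -oE '[A-Za-z_][A-Za-z0-9_#]*\(\)' "$old_file" \
    | sed -e 's/()$//' -e 's/^r#//' | LC_ALL=C sort -u > "$old_names"
  grep -oE '[A-Za-z_][A-Za-z0-9_#]*\(\)' "$new_file" \
    | sed -e 's/()$//' -e 's/^r#//' | LC_ALL=C sort -u > "$new_names"

  # comm -13 keeps only the lines unique to the second sorted input.
  LC_ALL=C comm -13 "$old_names" "$new_names"

  rm -f "$old_names" "$new_names"
}
```

Running `new_functions datafusion-49 datafusion-50` would then print one new function name per line. Unlike `diff`, this ignores pure formatting changes, such as a `vec!` body being split across several lines.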

New functions to migrate/add

  • [ ] bitmap_count()
  • [ ] if()
  • [ ] bit_get()
  • [ ] bit_count()
  • [x] date_add()
  • [x] date_sub()
  • [ ] last_day()
  • [ ] next_day()
  • [ ] crc32()
  • [x] sha1()
  • [ ] modulus()
  • [ ] pmod()
  • [ ] rint()
  • [ ] width_bucket()
  • [ ] ilike()
  • [ ] like()
  • [ ] luhn_check()
  • [ ] parse_url()

Describe the potential solution

Refactor the existing functions or add the new ones.

Additional context

Copied from https://github.com/lakehq/sail/issues/907

I see that #2286 is for the same version.

davidlghellin, Sep 23 '25 12:09

If this isn’t necessary or it’s already being tracked by another issue, please feel free to close it. Thanks!

davidlghellin, Sep 23 '25 12:09

Thanks @davidlghellin. Related issue: https://github.com/apache/datafusion-comet/issues/2084

andygrove, Sep 24 '25 21:09