Oct 16, 2024: This week in DataFusion

Open alamb opened this issue 1 year ago • 3 comments

Introduction

The goal of this ticket is a weekly summary of interesting things that happened in DataFusion over the last week. Note that this is not a complete list. Please feel free to comment on this ticket about things that I may have missed or that you think should get wider attention from the community.

Loosely inspired by https://this-week-in-rust.org/

Andrew's TLDR:

We are preparing for the 43.0.0 release and I am personally pretty excited about:

  • https://github.com/apache/datafusion/issues/12821
  • https://github.com/apache/datafusion/issues/8709
  • https://github.com/apache/datafusion/issues/12740

Upcoming Releases

  • https://github.com/apache/datafusion/issues/12813 (thanks @Xuanwo and @matthewmturner)
  • https://github.com/apache/datafusion/issues/12470 (thanks @andygrove)

Project Happenings

  • Integrate sqlparser into DataFusion governance: https://github.com/apache/datafusion-sqlparser-rs/issues/1294#issuecomment-2377918831

Highlights from last week(s):

(I am sorry if I missed you -- please add a note to this ticket with anything you would like to add)

  • @dmitrybugakov started managing extension functions in https://github.com/datafusion-contrib/datafusion-functions-extra
  • @eejbyfeldt is doing some great work on grouping sets such as https://github.com/apache/datafusion/pull/12704
  • @tokoko, @Blizzara, @vbarua and @westonpace continue to mature the Substrait support such as https://github.com/apache/datafusion/pull/12800
  • Along with @devanbenz, @Rachelint and @jayzhan211, I implemented https://github.com/apache/datafusion/pull/12792 to help ClickBench queries
  • @timsaucer made a beautiful macro https://github.com/apache/datafusion/pull/12846
  • @Rachelint made a beautiful aggregation fuzzing project
  • @jonahgao continues to make our SQL handling more beautiful and correct (https://github.com/apache/datafusion/pull/12808, https://github.com/apache/datafusion/pull/12844, etc)

Performance

  • https://github.com/apache/datafusion/issues/12821 (thanks to the epic work of @Rachelint, @goldmedal, @jayzhan211, @Dandandan @XiangpengHao and others, we are quite close)
  • https://github.com/apache/datafusion/issues/12680 (kudos to @jayzhan211 and @Rachelint)
  • @simonvandel and @tlmn https://github.com/apache/datafusion/pull/12890

Quality

  • https://github.com/apache/datafusion/issues/12114 (already found several bugs -- thanks @Rachelint)

Extensibility

  • Very close to finishing https://github.com/apache/datafusion/issues/8709 (thanks @jcsherin @jatin510 @hailelagi)
  • @Omega359 started https://github.com/apache/datafusion/issues/12740 and we are making great progress thanks to @jonathanc-n @juroberttyb and others
  • @notfillipo and @findepi are working to better separate logical and physical types https://github.com/apache/datafusion/issues/12622

Features

Interesting discussions underway:

  • https://github.com/apache/datafusion/issues/11442
  • https://github.com/apache/datafusion/issues/12357

Community

Upcoming meetups:

Background:

I got some great feedback from @timsaucer, @findepi and @andygrove on the DataFusion weekly call that having a weekly summary like https://github.com/apache/datafusion/issues/12494 was helpful. I will therefore try to write up one each week

alamb avatar Oct 16 '24 16:10 alamb

@alamb I really like this, keeping one up each week would be great. Gives everybody a good direction to go in for the overall project. Thanks for writing this!

jonathanc-n avatar Oct 17 '24 02:10 jonathanc-n

A discussion about a meetup in Amsterdam:

  • https://github.com/apache/datafusion/discussions/12988

alamb avatar Oct 17 '24 13:10 alamb

Something else I hope to highlight next week is how the process of reviewing PRs helps understand the code, helps the community, and drives the process forward

alamb avatar Oct 20 '24 20:10 alamb

Next week's issue: https://github.com/apache/datafusion/issues/13035

alamb avatar Oct 21 '24 13:10 alamb