Release DataFusion 43.0.0

Open alamb opened this issue 1 year ago • 6 comments

Is your feature request related to a problem or challenge?

Tracking ticket for next release, also a place to track desired inclusions

The last release was https://crates.io/crates/datafusion/42.0.0 on September 17th, 2024, so the next major release would be around October 20, 2024

Prior release tickets:

  • 42.0.0 https://github.com/apache/datafusion/issues/11902

Desired Items that would be good to get into this release:

  • [ ] (Andrew's goal) https://github.com/apache/datafusion/issues/11682
  • [ ] (Andrew's goal) https://github.com/apache/datafusion/issues/3463

Items to fix before release

(TBD)

alamb avatar Sep 15 '24 11:09 alamb

I'm keen to start the 43.0.0 release process as soon as we have upgraded to arrow-rs 53.1.0 since it will unblock https://github.com/apache/datafusion-ray/issues/10

andygrove avatar Oct 04 '24 15:10 andygrove

> I'm keen to start the 43.0.0 release process as soon as we have upgraded to arrow-rs 53.1.0 since it will unblock apache/datafusion-ray#10

The arrow upgrade is all ready here: https://github.com/apache/datafusion/pull/12724

Since the arrow upgrade is a minor version, you won't have to wait for the datafusion upgrade to get the latest version of arrow-rs

alamb avatar Oct 04 '24 17:10 alamb

It has been 28 days since the last release to crates.io, so we should start planning this release.

@alamb would you still like to wait for the items mentioned in the description?

andygrove avatar Oct 15 '24 19:10 andygrove

Thanks @andygrove -- I think we should start planning. Here are some items I think we should include:

Required

  • [ ] Bug fixes for metadata mismatches (@wiedld has a few more she will file over the next day or so): https://github.com/apache/datafusion/issues/12733

I would love to get these as well

  • [ ] https://github.com/apache/datafusion/issues/12771 (should merge tomorrow)
  • [ ] https://github.com/apache/datafusion/issues/12788 (needs an arrow-rs upgrade which isn't scheduled until Nov: https://github.com/apache/arrow-rs/issues/6341 -- though I could make a release sooner if you are willing to hold the DF release until next week)

Other than https://github.com/apache/datafusion/issues/12788, I think we are on track to be ready to release in the next day or two

alamb avatar Oct 15 '24 22:10 alamb

Thanks @alamb. I am in no rush for the release myself.

A "nice to have" for me is #12969 if it is ready in time, but it should not block the release (and could potentially be back-ported to a 42.x.x release)

andygrove avatar Oct 17 '24 14:10 andygrove

BTW I am going to accelerate the timeline to release arrow 53.2.0 so we can potentially include it in the next datafusion release: https://github.com/apache/arrow-rs/issues/6341

I hope to make the RC today

alamb avatar Oct 20 '24 11:10 alamb

This came up on the sync call today

I would like to consider doing this item before we release, as this check has caused significant pain downstream in InfluxDB 3.0 as well as delta-rs (see links on ticket):

  • [ ] https://github.com/apache/datafusion/issues/13065

If we did that I view everything else as nice to have, including

  • https://github.com/apache/datafusion/issues/12733 (there are still some additional outstanding issues)
  • https://github.com/apache/datafusion/issues/11682 as much as I would like to turn this on by default, I think it would be a bad idea to turn it on by default right before a release. It would be better to merge it to main and give it some bake time

alamb avatar Oct 23 '24 15:10 alamb

Here is the PR to use StringView when reading from Parquet files: https://github.com/apache/datafusion/pull/13101

alamb avatar Oct 25 '24 14:10 alamb

There is a regression for CREATE TABLE (see https://github.com/apache/datafusion/pull/12864#issuecomment-2437123342). This should be resolved before release (or judged OK to break).

findepi avatar Oct 26 '24 07:10 findepi

> There is a regression for CREATE TABLE #12864 (comment) This should be resolved before release (or judged OK to break).

I have filed https://github.com/apache/datafusion/issues/13124 to track this

alamb avatar Oct 26 '24 12:10 alamb

I am getting worried about the number of changes that are accumulating unreleased.

Here is my proposal:

  • Add the "turn off the schema check" PR: https://github.com/apache/datafusion/issues/13065 (any volunteers to help with this?)
  • Backport/make another 42.2 release with just that fix
  • Create the 43 RC

alamb avatar Oct 29 '24 13:10 alamb

@andygrove what are your thoughts timing-wise for 43? I feel like we are accumulating substantial API changes and it might be nice to release sooner rather than later

alamb avatar Nov 01 '24 21:11 alamb

Do you need an assist with any portion of this?

timsaucer avatar Nov 04 '24 18:11 timsaucer

Thank you @timsaucer ❤️

I am not sure -- I haven't heard from @andygrove -- perhaps we should ask him in Slack?

If he is willing, I think it would be great if you wanted to create the PR to update the versions and readme as documented in https://github.com/apache/datafusion/blob/main/dev/release/README.md#change-log

alamb avatar Nov 04 '24 19:11 alamb

I am here (now). I can start the RC process this evening if we are not waiting on anything else

andygrove avatar Nov 04 '24 22:11 andygrove

I would like to get this in, but I have yet to find a reviewer: https://github.com/apache/datafusion/pull/13183

timsaucer avatar Nov 04 '24 23:11 timsaucer

The release has been approved and published to crates.io

Minor follow on:

  • https://github.com/apache/datafusion/pull/13333

Tracking 44.0.0:

  • https://github.com/apache/datafusion/issues/13334

alamb avatar Nov 10 '24 08:11 alamb

Thanks again @andygrove

alamb avatar Nov 10 '24 08:11 alamb