Release DataFusion 43.0.0
Is your feature request related to a problem or challenge?
Tracking ticket for next release, also a place to track desired inclusions
The last release was https://crates.io/crates/datafusion/42.0.0 on September 17th, 2024, so the next major release would be around October 20, 2024
Prior release tickets:
- 42.0.0 https://github.com/apache/datafusion/issues/11902
Desired Items that would be good to get into this release:
- [ ] (Andrew's goal) https://github.com/apache/datafusion/issues/11682
- [ ] (Andrew's goal) https://github.com/apache/datafusion/issues/3463
Items to fix before release
(TBD)
I'm keen to start the 43.0.0 release process as soon as we have upgraded to arrow-rs 53.1.0 since it will unblock https://github.com/apache/datafusion-ray/issues/10
The arrow upgrade is all ready here: https://github.com/apache/datafusion/pull/12724
Since the arrow upgrade is a minor version, you won't have to wait for the datafusion upgrade to get the latest version of arrow-rs
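To illustrate the point about minor versions: under Cargo's default (caret) requirement semantics, a dependency declared as `53.0.0` also accepts `53.1.0`, so a downstream project can pick up the newer arrow-rs with `cargo update` without waiting for a new DataFusion release. The sketch below is a simplified, standalone model of that rule for versions with major >= 1. The function names are hypothetical, it does not call Cargo's actual resolver, it ignores pre-release tags, and the premise that DataFusion declares arrow with a plain `53.x` requirement is an assumption, not something stated in this thread.

```rust
/// Parse "MAJOR.MINOR.PATCH" into a tuple; returns None on malformed input.
/// (Hypothetical helper; pre-release and build metadata are not handled.)
fn parse(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Simplified caret rule for major >= 1: the candidate must share the
/// requirement's major version and be at least as new as the requirement.
/// Returns None for malformed input or for 0.x versions, whose rules differ.
fn caret_compatible(req: &str, candidate: &str) -> Option<bool> {
    let r = parse(req)?;
    let c = parse(candidate)?;
    if r.0 == 0 {
        return None; // 0.x compatibility is out of scope for this sketch
    }
    // Tuples compare lexicographically: major, then minor, then patch.
    Some(c.0 == r.0 && c >= r)
}

fn main() {
    // A minor bump stays within the same major version, so it is accepted.
    println!("{:?}", caret_compatible("53.0.0", "53.1.0"));
    // A major bump is not accepted and would need a new DataFusion release.
    println!("{:?}", caret_compatible("53.0.0", "54.0.0"));
}
```

This is why only a major arrow-rs bump has to be coordinated with a DataFusion release, while a minor one can be adopted downstream independently.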
It has been 28 days since the last release to crates.io, so we should start planning this release.
@alamb would you still like to wait for the items mentioned in the description?
Thanks @andygrove -- I think we should start planning. Here are some items I think we should include:
Required
- [ ] Bug fixes for metadata mismatches (@wiedld has a few more she will file over the next day or so): https://github.com/apache/datafusion/issues/12733
I would love to get these as well
- [ ] https://github.com/apache/datafusion/issues/12771 (should merge tomorrow)
- [ ] https://github.com/apache/datafusion/issues/12788 (needs an arrow-rs upgrade which isn't scheduled until Nov: https://github.com/apache/arrow-rs/issues/6341 -- though I could make a release sooner if you are willing to hold the DF release until next week)
Other than https://github.com/apache/datafusion/issues/12788 we are on track I think to be ready to release in the next day or two
Thanks @alamb. I am in no rush for the release myself.
A "nice to have" for me is #12969 if it is ready in time, but it should not block the release (and could potentially be back-ported to a 42.x.x release)
BTW I am going to accelerate the timeline to release arrow 53.2.0 so we can potentially include it in the next datafusion release: https://github.com/apache/arrow-rs/issues/6341
I hope to make the RC today
This came up on the sync call today
I would like to consider doing this item before we release, as this check has caused significant pain downstream in InfluxDB 3.0 as well as delta-rs (see links on ticket):
- [ ] https://github.com/apache/datafusion/issues/13065
If we did that, I would view everything else as nice to have, including:
- https://github.com/apache/datafusion/issues/12733 (there are still some additional outstanding issues)
- https://github.com/apache/datafusion/issues/11682 as much as I would like to turn this on by default, I think it would be a bad idea to turn it on by default right before a release. It would be better to merge it to main and give it some bake time
Here is the PR to use StringView when reading from Parquet files: https://github.com/apache/datafusion/pull/13101
There is a regression for CREATE TABLE (see https://github.com/apache/datafusion/pull/12864#issuecomment-2437123342). This should be resolved before release (or judged OK to break).
I have filed https://github.com/apache/datafusion/issues/13124 to track this.
I am getting worried about the number of changes that are accumulating unreleased.
Here is my proposal:
- Add the "turn off the schema check" PR: https://github.com/apache/datafusion/issues/13065 (any volunteers to help with this?)
- Backport/make another 42.2 release with just that fix
- Create the 43 RC
@andygrove what are your thoughts timing-wise for 43? I feel like we are accumulating substantial API changes and it might be nice to release sooner rather than later
Do you need an assist with any portion of this?
Thank you @timsaucer ❤️
I am not sure -- I haven't heard from @andygrove -- perhaps we should ask him in Slack?
If he is willing, I think it would be great if you wanted to create the PR to update the versions and readme as documented in https://github.com/apache/datafusion/blob/main/dev/release/README.md#change-log
I am here (now). I can start the RC process this evening if we are not waiting on anything else
I would like to get this in, but I have yet to find a reviewer: https://github.com/apache/datafusion/pull/13183
The release has been approved and published to crates.io
Minor follow on:
- https://github.com/apache/datafusion/pull/13333
Tracking 44.0.0:
- https://github.com/apache/datafusion/issues/13334
Thanks again @andygrove