Andrew Lamb

1636 comments by Andrew Lamb

I am now officially out of time and excuses -- I need to write this post soon

Started gathering ideas: https://github.com/apache/datafusion-site/pull/6

I plan one more round of copyediting and then posting in 2 days. Please leave comments if you have any: https://github.com/apache/datafusion-site/pull/6

Blog post is live: https://datafusion.apache.org/blog/2024/07/24/datafusion-40.0.0/

Filed https://github.com/apache/datafusion/issues/11631 to track the next one

> Do you think we should implement a special is_min/is_max since these are pervasive and used also for statistics?

It might make sense to special case `is_min` and `is_max`...

Specifically I think these checks need to be removed: https://github.com/apache/datafusion/blob/4838cfbf453f3c21d9c5a84f9577329dd78aa763/datafusion/physical-optimizer/src/aggregate_statistics.rs#L269-L292

Here is one specific suggestion: https://github.com/apache/datafusion/pull/12296#discussion_r1747254563

Hi @jwimberl -- I wonder if this is related to https://github.com/apache/arrow-datafusion/issues/7848, which was fixed in https://github.com/apache/arrow-datafusion/pull/8020 by @korowa. What was happening there was that the entire join output was created...

> Yes, is 36.0.0 the first version to include this fix?

Yes, that is my understanding of the release notes: https://github.com/apache/arrow-datafusion/blob/main/dev/changelog/36.0.0.md#3600-2024-02-16