vaex
vaex copied to clipboard
🐛 Expression.minmax() not returning proper value for datetimes
Fixes #1456
I left some comments in the #development Slack channel, as it looks like min()
, max()
and minmax()
were using different approaches. I'm not sure which is more efficient/robust long-term, but I opted for the simpler solution.
Note: In this solution, StatOpMinMax
is no longer used anywhere in the code base.
Somehow missed all of those tests failing. Will check again on this PR later tonight.
yeah, seems odd, you could try a rebase.. although I don't remember seeing these issues before.
The latest commit should fix the issue. I was returning a list instead of the np array like before. Also, my previous implementation didn't respect delayed execution, so I changed the code to use the previous structure and just swapped out the self.limits()
call for min()
and max()
, respectively.
This may suggest that the root case bug in limits()
(or more likely, StatOpMinMax
?) is still present for my added test case. But as stated above in the original PR comment, this now makes StatOpMinMax
dead code.
Not sure why it's failing on non-Windows builds... The errors in the Linux and Mac builds are notebook timeout errors.
Ah don't worry about it. Those errors are unrelated. I hope we can clean them up soon. I don't think that the notebooks run on the windows CI which is why that part does not fail :).
Looks good to me! But let's wait for @maartenbreddels to also give his blessing :)
Thanks!
Thanks for this PR! However, this reminds me that we actually need to have a minmax aggregator, now we lose a bit of performance by doing a min and a max separately. @JovanVeljanoski what do you think?
Ups I completely forgot about this one :(
Well, indeed it would be great to do min and max in one pass over the data. If one is interested only in one of them, is there any performance benefit to doing only min for example? @maartenbreddels
Yes, we have both min, max and minmax for that reason.