BUG: Fixed issue where rolling.kurt() calculations would be affected by values outside the window
- [x] closes #61416
- [x] Tests added and passed if fixing a bug or adding a new feature
- [x] All code checks passed.
- [x] Added type annotations to new arguments/methods/functions.
- [x] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
I might have found an unrelated issue when calculating kurtosis for numbers > 1e6, but I'll have to look into it more and open a separate issue if that turns out to be the case.
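For context, here's a minimal repro sketch of the kind of cross-window contamination the title describes (my own illustration, not code from this PR; the values are arbitrary):

```python
import pandas as pd

# A single extreme value early in the series; once it leaves the window,
# later results should match an independent per-window recomputation.
s = pd.Series([1.0, 2.0, 3.0, 1e9, 4.0, 5.0, 6.0, 7.0, 8.0])

streamed = s.rolling(4).kurt()  # incremental/streaming implementation
independent = s.rolling(4).apply(lambda w: pd.Series(w).kurt(), raw=True)

# Any disagreement in windows that no longer contain 1e9 is the symptom
# this PR targets: state carried over from values outside the window.
print(pd.concat({"streamed": streamed, "independent": independent}, axis=1))
```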
@mroeschke my PR hasn't been reviewed for a while now; just checking whether it will be reviewed or if I should close it.
(Sorry if it's a bother, I know you all probably have a lot on your plates and I didn't know who to ping.)
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
I see you've been working on your own PR; have you taken on things from this fix? I've been occupied with school work, so I haven't had time to look until now. If not, I can still work on it, just let me know.
> have you taken on things from this fix?
My approach differs a lot from yours, so no.
More in the solution sense. I saw a commit about outliers in window values on your PR, so I wasn't sure if you'd already started tackling the same issue.
Got it. I am checking for catastrophic cancellation when updating the 3rd central moment, as it's the most sensitive of them all. When that happens, I recompute the window.
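If I'm following, the guard amounts to something like this sketch (my paraphrase of the idea, not your actual code; `delta` and `terms_magnitude` are hypothetical names for whatever the incremental update produces):

```python
import numpy as np

def m3_update_with_guard(m3_old, delta, terms_magnitude, window_values,
                         rel_tol=1e-9):
    """Apply an incremental update to the running 3rd central moment;
    if the result retains almost no significant digits relative to the
    magnitudes involved, fall back to recomputing from the window."""
    m3_new = m3_old + delta
    # Classic cancellation test: a result that is tiny compared to the
    # operand magnitudes has lost most of its leading digits.
    if abs(m3_new) < rel_tol * terms_magnitude:
        d = window_values - window_values.mean()
        m3_new = float(np.sum(d ** 3))  # exact per-window recomputation
    return m3_new
```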
So should I still fix up this PR then?
Honestly, I don't know. But I think we should arrive at a general solution for numerical stability (algorithm-wise) when computing the rolling variance, skewness, and kurtosis.
I don't know if my solution is good enough, or if your approach is better in terms of stability and performance.
Is this issue due to data precision limitations? It's been a minute. I did open an enhancement request for implementing double-double arithmetic so we can work with extremely large and small float64s without multiple people implementing different workarounds for numerical stability tied to the data type. What do you think? Issue: #62870
> Is this issue due to data precision limitations?
Yes, most of the problems come down to floating-point arithmetic. Using a more precise data type or more stable algorithms can mitigate some of them.
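To make the "more stable algorithms" point concrete (my own illustration, not anything from the PR): the textbook one-pass variance formula cancels catastrophically when the data sit on a large offset, while Welford's update stays accurate at the same precision:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 10_000) + 1e8  # tiny variance on a huge offset

# Naive one-pass formula E[x^2] - E[x]^2: catastrophic cancellation,
# since both terms are ~1e16 and float64 keeps ~16 significant digits.
naive = np.mean(x ** 2) - np.mean(x) ** 2

# Welford's online update: a single numerically stable pass.
mean, m2 = 0.0, 0.0
for i, v in enumerate(x, start=1):
    d = v - mean
    mean += d / i
    m2 += d * (v - mean)
welford = m2 / len(x)

print(naive, welford, np.var(x))  # naive can even come out negative
```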
> I did open an enhancement request for implementing double-double arithmetic
Seems good. But for now, it isn't clear to me how it should be implemented and integrated into the existing functionality.
I have two ideas:
- Overload the function at runtime depending on whether the inputs have 14 significant digits
- Create a separate double-double Cython implementation so we can use it where needed
Assuming that's what the question was about.
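To make the double-double idea concrete, here's a rough sketch of the core primitive (my illustration; the helper names are hypothetical and not from #62870):

```python
def two_sum(a: float, b: float) -> tuple[float, float]:
    """Knuth's error-free transformation: a + b == s + e exactly."""
    s = a + b
    t = s - a
    e = (a - (s - t)) + (b - t)
    return s, e

def dd_add(hi: float, lo: float, x: float) -> tuple[float, float]:
    """Add a float64 to a double-double (hi, lo) pair (~32 sig. digits)."""
    s, e = two_sum(hi, x)
    e += lo
    return two_sum(s, e)  # renormalize so that |lo| stays small vs |hi|

# The small term survives where plain float64 summation drops it:
hi, lo = 0.0, 0.0
for v in [1e16, 1.0, -1e16]:
    hi, lo = dd_add(hi, lo, v)
print(hi + lo)                  # 1.0
print(sum([1e16, 1.0, -1e16]))  # 0.0 in plain float64
```

A Cython version of these two routines would be the natural starting point for the second idea, since the kernels could call them without any runtime dispatch.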