StatsBase.jl
Empirical CDF enhancements
The PR improves the performance of ECDF and adds interpolation.
- The major problem with the current code is that in the weighted case the partial sums have to be recalculated for each input. But this is not necessary, since all CDF values for weighted and non-weighted cases could be precalculated.
- I suppose ECDF was written in the pre-broadcast era, so it does not overload `Base.Broadcast.broadcasted()` to provide enhanced support for vectors. The PR deprecates `ecdf(v::AbstractVector)` in favor of standard dot notation. I've kept customized broadcasting, but now it just caches the CDF value of the last vector element.
- Optionally, ECDF can enable interpolation (`ecdf(..., interpolate=true)`), which is handy for continuous empirical distributions. (It just linearly interpolates between the CDFs of adjacent values.)
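
To illustrate the precalculation and interpolation ideas, here is a minimal, self-contained sketch. It is not the code from this PR and not StatsBase's API: `SketchECDF` and its fields are hypothetical names. It also assumes positive weights, and that below the smallest sample the CDF is 0 even when interpolating.

```julia
# Hypothetical sketch: names here are illustrative, not StatsBase's API.
struct SketchECDF
    xs::Vector{Float64}    # sorted unique sample values
    cdf::Vector{Float64}   # precomputed CDF value at each xs[i]
    interpolate::Bool      # linearly interpolate between adjacent values?
end

function SketchECDF(x::AbstractVector{<:Real},
                    w::AbstractVector{<:Real} = ones(length(x));
                    interpolate::Bool = false)
    length(x) == length(w) || throw(ArgumentError("x and w must have the same length"))
    isempty(x) && throw(ArgumentError("x must not be empty"))
    xs = Float64[]
    cum = Float64[]
    total = 0.0
    # One pass over the sorted sample accumulates the weights,
    # so no partial sum is recomputed at evaluation time.
    for i in sortperm(x)
        total += w[i]
        if !isempty(xs) && xs[end] == x[i]
            cum[end] = total          # merge duplicate values
        else
            push!(xs, x[i])
            push!(cum, total)
        end
    end
    return SketchECDF(xs, cum ./ total, interpolate)
end

# Evaluation is a binary search plus an optional linear interpolation.
function (e::SketchECDF)(t::Real)
    i = searchsortedlast(e.xs, t)
    i == 0 && return 0.0              # assumption: CDF is 0 below the smallest sample
    (i == length(e.xs) || !e.interpolate) && return e.cdf[i]
    x0, x1 = e.xs[i], e.xs[i + 1]
    return e.cdf[i] + (e.cdf[i + 1] - e.cdf[i]) * (t - x0) / (x1 - x0)
end

step = SketchECDF([1.0, 2.0, 4.0], [1, 1, 2])
lin  = SketchECDF([1.0, 2.0, 4.0], [1, 1, 2]; interpolate = true)

step(3.0)                # step function: value at the largest sample <= 3.0
lin(3.0)                 # halfway between the CDF values at 2.0 and 4.0
step.([0.5, 2.0, 5.0])   # standard dot notation works on the callable struct
```

Because the object is callable, standard dot notation works without any custom `broadcasted` overload. The caching of the last element's CDF value mentioned above is an optimization on top of this and is not shown.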
Is there anything here that's still worth it, or should I close?