StatsBase.jl

Empirical CDF enhancements

Open · alyst opened this issue 3 years ago • 1 comment

The PR improves the performance of ECDF and adds interpolation.

  • The major problem with the current code is that, in the weighted case, the partial sums of the weights are recalculated for each input. This is unnecessary: the CDF values for both the weighted and the non-weighted case can be precalculated once, when the ECDF is constructed.
  • I suppose ECDF was written in the pre-broadcast era, so it does not overload Base.Broadcast.broadcasted() to provide enhanced support for vectors. The PR deprecates ecdf(v::AbstractVector) in favor of the standard dot notation. I've kept customized broadcasting, but now it just caches the CDF value of the last vector element.
  • Optionally, ECDF can enable interpolation (ecdf(..., interpolate=true)), which is handy for continuous empirical distributions. (It just linearly interpolates between the CDF values of adjacent sample values.)
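
To make the first and third points concrete, here is a minimal sketch of the idea, not the code from the PR. The names `SketchECDF` and `sketch_ecdf` are hypothetical, and the behavior outside the sample range (0 below the minimum, 1 at or above the maximum, no interpolation there) is an assumption of this sketch. It uses only Base functions: the cumulative weights are computed once at construction, so each lookup is a binary search.

```julia
# Hypothetical names; an illustration only, not the StatsBase.jl implementation.
struct SketchECDF
    values::Vector{Float64}   # sorted unique sample values
    cdf::Vector{Float64}      # precomputed P(X <= values[i])
    interpolate::Bool         # linearly interpolate between adjacent values?
end

function sketch_ecdf(x::AbstractVector{<:Real},
                     w::AbstractVector{<:Real} = ones(length(x));
                     interpolate::Bool = false)
    length(x) == length(w) || throw(ArgumentError("x and w must have equal length"))
    isempty(x) && throw(ArgumentError("x must not be empty"))
    order = sortperm(x)
    xs = Float64.(x[order])
    cw = cumsum(Float64.(w[order]))   # partial sums, computed once
    # Keep the last index of each run of tied values, so cdf is P(X <= value).
    last_of_run = [i for i in eachindex(xs) if i == lastindex(xs) || xs[i] != xs[i + 1]]
    return SketchECDF(xs[last_of_run], cw[last_of_run] ./ cw[end], interpolate)
end

function (e::SketchECDF)(t::Real)
    i = searchsortedlast(e.values, t)   # last index with values[i] <= t
    i == 0 && return 0.0                # below the smallest sample value
    (i == length(e.values) || !e.interpolate) && return e.cdf[i]
    x0, x1 = e.values[i], e.values[i + 1]
    # Linear interpolation between the CDF values of adjacent sample values.
    return e.cdf[i] + (e.cdf[i + 1] - e.cdf[i]) * (t - x0) / (x1 - x0)
end

step_cdf   = sketch_ecdf([3, 1, 2, 2])
smooth_cdf = sketch_ecdf([1, 2], [1, 3]; interpolate = true)

step_cdf(2)                 # 0.75: three of the four samples are <= 2
step_cdf.([0, 1, 2.5, 3])   # standard dot notation for vector input
smooth_cdf(1.5)             # 0.625: halfway between 0.25 and 1.0
```

The sketch relies on ordinary broadcasting for vector input; the caching of the last vector element's CDF value described in the second point is not reproduced here.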

alyst, Aug 09 '20 13:08

Is there anything here that's still worth it, or should I close?

ParadaCarleton, Aug 23 '23 00:08