StatsBase.jl icon indicating copy to clipboard operation
StatsBase.jl copied to clipboard

fixed floating point error

Open dnabanita7 opened this issue 4 years ago • 11 comments

related to issue #712

dnabanita7 avatar Aug 31 '21 15:08 dnabanita7

Thanks for the PR! However, I'm not sure the docs should correspond exactly to the implementation. The docstring gives the mathematical formula, and the implementation is slightly different for performance reasons. Floating point approximations are generally not considered as significant. As was noted on Discourse, even 3/10 == 3*0.1 doesn't hold in floating point. As long as isapprox holds, everything is fine.

nalimilan avatar Aug 31 '21 20:08 nalimilan

Alright! Should I just mention floating-point precision or anything in the docstrings? Or should I close the PR?

dnabanita7 avatar Aug 31 '21 23:08 dnabanita7

I think the existing version is fine. Separately from that, I also think this should be implemented with division instead of multiplication with the inverse.

andreasnoack avatar Sep 01 '21 11:09 andreasnoack

Separately from that, I also think this should be implemented with division instead of multiplication with the inverse.

I also wondered whether using the inverse was really a good idea. I imagine it can only be faster if the range of levels is very wide compared with the number of values, but even then it's not obvious that it would make a big difference.

@dnabanita7 Would you feel like checking whether using / for the implementation rather than inv would be acceptable for performance?

nalimilan avatar Sep 01 '21 12:09 nalimilan

I think it is not making much difference with shorter collections.

julia> @btime counts(x1, s1) .* inv(length(x1))
  546.541 ns (5 allocations: 368 bytes)
7-element Vector{Float64}:
 0.1
 0.2
 0.1
 0.1
 0.30000000000000004
 0.0
 0.2

julia> @btime counts(x1, s1) / length(x1)
  133.574 ns (2 allocations: 288 bytes)
7-element Vector{Float64}:
 0.1
 0.2
 0.1
 0.1
 0.3
 0.0
 0.2

It gives better performance with larger collections.

julia> @btime counts(x1, s1) / length(x1)
25.491 s (5 allocations: 7.45 GiB)
500000000-element Vector{Float64}:

julia> @btime counts(x1, s1) .* inv(length(x1))
  28.704 s (8 allocations: 7.45 GiB)
500000000-element Vector{Float64}:

dnabanita7 avatar Sep 01 '21 15:09 dnabanita7

What are x1 and s1 in these cases? I couldn't find cases where performance differs between inv and /.

nalimilan avatar Sep 02 '21 19:09 nalimilan

So, for the first case, x1, s1 are 7 element vector as in the output and for the second case x1, s1 is a random generated 5 * 10^8 element vector. As you can see in the second case, inv takes up more number of memory allocations as well as more time around 3 seconds than /.

dnabanita7 avatar Sep 03 '21 05:09 dnabanita7

Hmm, I still don't get it. In general, posting code is more explicit.

Anyway, inv doesn't seem faster, so let's switch to /. Would you update the PR?

(BTW, be careful when looking at the number of allocations with @btime: it's safer to wrap the code in a function, in particular when it uses broadcasting. When doing that I see 4 allocations for both.)

nalimilan avatar Sep 03 '21 08:09 nalimilan

I meant that we should switch the implementation, not the docs.

nalimilan avatar Sep 05 '21 16:09 nalimilan

oh sure, right!

dnabanita7 avatar Sep 06 '21 05:09 dnabanita7

cc @nalimilan

dnabanita7 avatar Sep 14 '21 00:09 dnabanita7