Distributions.jl icon indicating copy to clipboard operation
Distributions.jl copied to clipboard

Fix MultinomialSampler overflow

Open 0 opened this issue 2 years ago • 4 comments

n^2 doesn't fit into Int64 for n larger than 3_037_000_499.

0 avatar Jun 30 '23 05:06 0

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (b9d3093) 85.90% compared to head (77e132a) 85.90%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1744   +/-   ##
=======================================
  Coverage   85.90%   85.90%           
=======================================
  Files         142      142           
  Lines        8578     8579    +1     
=======================================
+ Hits         7369     7370    +1     
  Misses       1209     1209           
Impacted Files Coverage Δ
src/samplers/multinomial.jl 87.50% <100.00%> (+0.32%) :arrow_up:

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov-commenter avatar Jun 30 '23 05:06 codecov-commenter

Changed sqrt to isqrt.

0 avatar Jul 02 '23 03:07 0

OK :slightly_smiling_face: Now the main remaining question is: is it possible to add a test for the fix? I assume it could be a bit challenging since the change only affects whether an alias table is used or not, but should not affect correctness (I assume).

devmotion avatar Jul 02 '23 20:07 devmotion

Right, this change is about performance and not correctness, since the overflow only affects the heuristic. I initially wrote a test along the lines of

for n in [3_037_000_499, 3_037_000_500]
    @test all(n .== rand(Multinomial(n, 1), 2))
end

but didn't end up including it, because it should pass with both versions of the code anyway. It takes significantly longer using the alias table, but that seems like the wrong thing to check in a unit test.

0 avatar Jul 03 '23 03:07 0