Distributions.jl Implement ExpGamma sampler (draft)

As discussed in #1886 , here is a branch with only the ExpGamma additions I made. This is hopefully of use to @chelate.

Potential TODOs:

Figure out a reasonable threshold to switch between the Liu-Martin-Syring Small Shape sampler and the Maraglia-Tsang Inverse Power sampler.
Possibly implement Algorithm 3 from Kundu and Gupta (https://home.iitk.ac.in/~kundu/paper120.pdf) for when alpha > 0.3

Mar 28 '25 21:03 quildtide

Codecov Report

:x: Patch coverage is 0% with 23 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 85.97%. Comparing base (05a7875) to head (d678904). :warning: Report is 19 commits behind head on master.

Files with missing lines	Patch %	Lines
src/samplers/expgamma.jl	0.00%	23 Missing :warning:

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1958      +/-   ##
==========================================
- Coverage   86.20%   85.97%   -0.23%     
==========================================
  Files         146      147       +1     
  Lines        8769     8792      +23     
==========================================
  Hits         7559     7559              
- Misses       1210     1233      +23

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:

:snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Mar 28 '25 21:03 codecov-commenter

I found an issue in ExpGammaIPSampler that is beyond my understanding. The values it generates are too high. If you undo the log on them, the output has a low bound of 1 instead of 0. I don't think it's a sign issue.

julia> rand(rng, GammaIPSampler(Gamma(0.9)), 10)
10-element Vector{Float64}:
 0.2135985770580691
 1.137650455835034
 1.2565944033941983
 0.656188185739729
 0.4203566698386437
 1.66687553573059
 0.2053545534917769
 0.41111662368149415
 0.14513700772532737
 0.6549826409710889

julia> exp.(rand(rng, ExpGammaIPSampler(Gamma(0.9)), 10))
10-element Vector{Float64}:
 2.051378946898306
 1.0193261057366838
 2.0454369373226133
 1.1704569694617193
 1.363270090050132
 3.747017831243905
 2.1354060996412194
 1.0657728289135096
 1.173677653737439
 1.0294127958057897

and

julia> rand(rng, GammaIPSampler(Gamma(0.01)), 10)
10-element Vector{Float64}:
 1.836714536149393e-57
 1.5105073011994874e-50
 3.537147232822205e-36
 1.2547571475373618e-32
 5.885231757348809e-38
 2.9054960529000983e-34
 1.2110309289241029e-6
 1.4231361276813385e-101
 2.393431274684707e-27
 4.809894203977362e-38

julia> exp.(rand(rng, ExpGammaIPSampler(Gamma(0.01)), 10))
10-element Vector{Float64}:
 1.0207709469175534
 1.0000000000000002
 1.0000000000000946
 1.0
 1.0
 1.0
 1.0
 1.0000000000087867
 1.0
 1.000000000000009

The code looks mathematically correct to me.

It's about 2x faster than ExpGammaSSSampler, but ExpGammaSSSampler returns values in the expected range while ExpGammaIPSampler is broken.

Mar 28 '25 23:03 quildtide

I found the issue; one of the ExpGammaIPSampler methods was actually returning a GammaIPSampler.

Mar 29 '25 00:03 quildtide

After correcting that, I have done some basic testing. This means I've checked for rates of non-finite values and duplicates. I have not checked past that. If something "works", it means it's generating unique finite values.

I have found that:

The Log Inverse Power sampler is always faster; runtime is independent of alpha.
The Small Shape sampler is at its fastest at alpha ~= 0.01. At this point, it still takes about 150% as much time as the Log Inverse Power sampler. At its worst, it's probably takes about 400% as much time.
Both samplers perform without issue until around 7e-308
Starting around 7e-308, the Small Shape sampler performs slightly worse (~10% more -Infs) until around 7e-309
The Log Inverse Power sampler rapidly fails below 7e-309, generating almost pure -Infs by 5e-309
The Small Shape sampler continues to generate unique finite values down to around 1e-314, but the rate is only about 2% by 1e-310

I'll make some charts later.

Note that I have NOT looked into how "accurate" the numbers generated have been yet. It is very much possible that the Log Inverse Power sampler is generating nonsense at some point.

Mar 29 '25 01:03 quildtide

Thanks for the work and documentation! I am able to look into this in July, but I will probably need lots of help because I am not much of a coder. That said, I am really really annoyed by the lack of a ExpGamma sampler and will do my best.

Personally, I don't think what happens below 1e-308 is really important, and the IPS sample seems to "dominate" to use the statistical parlance. I mean with gamma occasionally giving zeros at like 1e-3 or something, Id hardly worry about something that is slightly suboptimal 300 orders of magnitude later, right?

Jun 19 '25 10:06 chelate

Distributions.jl Distributions.jl copied to clipboard

Implement ExpGamma sampler (draft)

Codecov Report

Distributions.jl
Distributions.jl copied to clipboard