mathnet-numerics icon indicating copy to clipboard operation
mathnet-numerics copied to clipboard

Error in sampling from Negative Binomial

Open Mlambrechts opened this issue 9 years ago • 4 comments

I'm using Math.Net to sample values from an overdispersed Poisson distribution. I'm doing this using the negative binomial link, as described here: https://stat.ethz.ch/pipermail/r-help/2002-June/022425.html

My code currently looks like this:

private static double ODPoisson(double lambda, double dispersion)
{
    double p = 1 / (dispersion - 1);
    double r = lambda * p;

    if (dispersion == 1)
    {
        return Poisson.Sample(lambda);
    }
    else
    {
        return NegativeBinomial.Sample(r, p);
    }
}

What I've found is that this works well for low values of lambda. As soon as I try to sample with a lambda of 1000 and a dispersion parameter of 2, the code simply 'hangs', i.e. method keeps running but no value is returned. I've even looped through this method to test various combinations of input parameters (lambda from 1 to 1000, dispersion = 2), and the code 'hangs' at different combinations every time. Sometimes it'll sample for all combinations up to lambda = 750, other times up to lambda = 500. This happens by simply re-running the console app and making no code changes.

I've included the 'IsValidParameterSet' check before every run, and even though the parameters are considered valid, the sample is still not generated. To further test whether the input parameters are valid, I've tested the same parameters with the NegativeBinomial.CDF method at the 50th percentile, and it returns a value every time.

Is there an error in the NegativeBinomial.Sample method? If not, how do I fix this problem?

Mlambrechts avatar Jun 01 '15 10:06 Mlambrechts

Thanks for reporting this.

The current implementation is indeed not suitable for these parameters, as the acceptance area is not only getting very small, it is actually getting to zero (floating point precision) resulting in an infinite loop

Tasks:

  • [ ] detect such cases and make sure we never end up in an infinite loop. This applies to a few other discrete distributions as well
  • [ ] fallback to a modified sampling approach depending on the parameter range (see Poisson for example).
  • [ ] NegativeBinomial, Poisson and a few other discrete distributions are currently excluded from the sample shape unit test. Add them back so we have all distributions covered again.

cdrnet avatar Jun 04 '15 05:06 cdrnet

I noticed this is still an issue.

How about using the efficient methods described here: http://link.springer.com/article/10.1007/BF02293108

They may not be as accurate as the one you're using now (they may be though, not quite sure) but I think they'd be a good replacement or at very least good fallback methods for this kinda case.

jjdelvalle avatar Dec 06 '16 15:12 jjdelvalle

Related: #455

cdrnet avatar Dec 07 '16 06:12 cdrnet

There are several issues here:

  1. The code provided by @Mlambrechts is wrong, parameters r and p need to be determined as double p = 1 / dispersion and double r = lambda / (dispersion - 1)
  2. p = 0 actually should be considered to be an invalid parameter, since the distribution is not defined in this case (asymptotically, it reaches infinity, which cannot be represented as integer). In the current implementation, p = 0 not only leads to unhandled undefined mathematical operations (like division by zero), but also to an endless loop in the sampling method. I fixed this in #960 by disallowing p = 0. Note that the run time for sampling still goes to infinity when p approaches zero. Further improvements are welcome.

Arlofin avatar Sep 29 '22 23:09 Arlofin