rstanarm
Default prior on the negative binomial is very informative
Summary:
The default prior for the over-dispersion parameter of the negative binomial likelihood puts a lot of prior mass on large amounts of over-dispersion
Description:
This is found in src/stan_files/count.stan. The likelihood is invoked on lines 99-100 and the prior on the variable aux is set on lines 108-117.
The variance of the neg_binomial_2 is given by variance = mean * (1 + mean / aux), where aux has a half-normal, half-t, or exponential prior. This means that most of the prior mass is on aux < 1, which can lead to a great deal of over-dispersion.
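To make the claim about prior mass concrete, here is a minimal sketch that computes P(aux < 1) in closed form for a half-normal and an exponential prior. The unit scale and unit rate are assumptions chosen for illustration, not necessarily the values rstanarm uses, and the function names are hypothetical.

```python
import math

def half_normal_cdf(x, scale=1.0):
    """P(aux < x) when aux ~ half-normal(0, scale). Scale is an assumed value."""
    return math.erf(x / (scale * math.sqrt(2.0)))

def exponential_cdf(x, rate=1.0):
    """P(aux < x) when aux ~ exponential(rate). Rate is an assumed value."""
    return 1.0 - math.exp(-rate * x)

def nb_variance(mean, aux):
    """Variance of the negative binomial: mean * (1 + mean / aux)."""
    return mean * (1.0 + mean / aux)

p_half_normal = half_normal_cdf(1.0)
p_exponential = exponential_cdf(1.0)
print(f"P(aux < 1), half-normal(0, 1): {p_half_normal:.3f}")
print(f"P(aux < 1), exponential(1):    {p_exponential:.3f}")

# Variance relative to a Poisson with the same mean, for an illustrative mean of 10.
for aux in (0.1, 1.0, 10.0, 100.0):
    ratio = nb_variance(10.0, aux) / 10.0
    print(f"aux = {aux:>5}: variance / mean = {ratio:.1f}")
```

Under these assumed priors, roughly two thirds of the mass sits on aux < 1, where the variance is at least (1 + mean) times the Poisson variance.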
The fix is to put the same prior on 1/aux or, even better, 1/sqrt(aux). The inverse square root comes from noting that you can specify a negative binomial as a Poisson with a random mean with a Gamma(aux,aux) distribution. This has mean 1 and variance 1/aux. This suggests that 1/sqrt(aux) is somewhat like a standard deviation.
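The variance formula can be recovered from this mixture representation via the law of total variance. In the short derivation below, the symbols are chosen here for illustration: μ is the mean, φ stands for aux, and λ is the Gamma-distributed multiplier, with the Gamma parameterized by shape and rate.

```latex
\begin{align*}
\lambda &\sim \mathrm{Gamma}(\phi, \phi), \qquad
  \mathbb{E}[\lambda] = 1, \qquad
  \operatorname{Var}[\lambda] = \frac{1}{\phi}, \\
y \mid \lambda &\sim \mathrm{Poisson}(\mu \lambda), \\
\operatorname{Var}[y]
  &= \mathbb{E}\bigl[\operatorname{Var}(y \mid \lambda)\bigr]
   + \operatorname{Var}\bigl(\mathbb{E}[y \mid \lambda]\bigr)
   = \mu + \frac{\mu^{2}}{\phi}
   = \mu \left( 1 + \frac{\mu}{\phi} \right).
\end{align*}
```

Since the multiplier λ has standard deviation 1/sqrt(φ), a prior on 1/sqrt(aux) acts on the scale of that standard deviation.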
An outline of a different, more mathematical approach based on PC priors and Kullback-Leibler divergences is given in the second last section (Can't help lovin that man) of this blog post: http://andrewgelman.com/2018/04/03/justify-my-love/
Reproducible Steps:
Not applicable
RStanARM Version:
Not applicable. This is true on the current master branch https://github.com/stan-dev/rstanarm/commit/f11d6bf92b6c78ce64896caf78c2237ebd015abb
R Version:
Not applicable
Operating System:
Not applicable
That sounds like a good idea
@dpsimpson Any suggestions for what to call this parameter? We've been referring to aux as "reciprocal_dispersion" in the user-facing doc and output for the negative binomial model, but we'd need a different name if putting the prior on 1/sqrt(aux).
1/aux makes sense to be called "dispersion". Maybe call the inverse square root "dispersion deviation"? Or is that too weird?
“Excess dispersion” or “dispersion moderator”?
I like those. Personally I like "dispersion moderator" but I like "excess dispersion" too and that will probably be a more intuitive name for users.
I am not sure it matters that much what it is called because in the output we can still make it reciprocal dispersion like in MASS::glm.nb. But it would become a transformed parameter and we have to explain that the prior on the primitive is on the reciprocal square root of that.
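As a rough sketch of that bookkeeping: the prior would sit on a primitive (called sigma here, a hypothetical name) equal to 1/sqrt(aux), while the reported quantity stays the reciprocal dispersion aux = 1/sigma^2. The half-normal(0, 1) prior on sigma is an assumption for illustration only.

```python
import math

def aux_from_sigma(sigma):
    """Map the primitive sigma = 1/sqrt(aux) back to the reported aux."""
    return 1.0 / sigma ** 2

def half_normal_sf(x, scale=1.0):
    """P(X > x) for X ~ half-normal(0, scale). Scale is an assumed value."""
    return math.erfc(x / (scale * math.sqrt(2.0)))

def prob_aux_below(threshold, scale=1.0):
    """P(aux < threshold) when sigma ~ half-normal(0, scale).

    aux < t is equivalent to sigma > 1 / sqrt(t).
    """
    return half_normal_sf(1.0 / math.sqrt(threshold), scale)

p_below_1 = prob_aux_below(1.0)
p_below_01 = prob_aux_below(0.1)
print(f"P(aux < 1)   with prior on 1/sqrt(aux): {p_below_1:.4f}")
print(f"P(aux < 0.1) with prior on 1/sqrt(aux): {p_below_01:.4f}")
print(f"sigma = 0.5 corresponds to aux = {aux_from_sigma(0.5)}")
```

With the prior on the primitive, most of the mass moves to aux > 1, and very small values of aux (severe over-dispersion) become unlikely a priori, while the output can still be reported on the aux scale.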
Yeah I was thinking about that. On the one hand it makes it super straightforward for us to do this if we do it that way, but I’m hesitant to have too many of these special cases where the prior interpretation gets confusing for users. I think it would be less confusing for them to place the prior directly on the “excess dispersion” or whatever we call it. Otherwise prior_aux for stan_glm has a special interpretation only for negative binomial models. That’s even more confusing than the prior intercept being specified after centering predictors because at least that is done consistently regardless of the family. The downside to doing it the way I’m proposing is of course that the output changes for the user. So not ideal either way.
You always have to assume that less than 1% are going to change the default prior, no matter what it is and no matter what parameterization it pertains to. So, there are up to 99% of people who are not going to change it but are liable to misinterpret whether they have severe overdispersion or no overdispersion at all if the parameter in the output differs from what is in MASS::glm.nb, stats::dnbinom, and neg_binomial_2 in Stan Math. The 1% who want to change the default prior can figure out what it is on.
Yeah you’re right. I’d prefer to keep the interpretations consistent, but the minority of users who want to change the defaults will be more inclined to thoroughly read the documentation about the parameterization.
So we’ll go with just changing it internally and documenting it.
Hi, as far as I can see this did not get changed. I am not skilled enough to fix it, but it would be great if someone did!