ConjugatePriors.jl icon indicating copy to clipboard operation
ConjugatePriors.jl copied to clipboard

Fixed posterior for NormalWishart.

Open kaselby opened this issue 7 years ago • 25 comments

The posterior calculations for most of this package are based on the paper Conjugate Bayesian analysis of the Gaussian distribution by Kevin Murphy. This paper has several typos related to the Normal-Wishart distribution. The ConjugatePriors package fixes the typos related to the pdf of the Normal-Wishart (though I would note that right now the pdf function will not run without errors for this distribution, as it relies on a function called IsApproxSymmetric that does not appear to exist - but that is a separate issue), but it does not fix the typos in the posterior. The paper cites the posterior update for the scale matrix as being: T_n = T_0 + S + k_0/k_n * n * (mu_0 - xhat)(mu0-xhat)' The correct update is inv(T_n) = inv(T_0) + S + k_0/k_n * n * (mu_0 - xhat)(mu0-xhat)' See https://stats.stackexchange.com/questions/153241/derivation-of-normal-wishart-posterior/324925#324925 for details.

kaselby avatar Apr 17 '18 22:04 kaselby

Just a comment that I have tested this with a MCMC sampler I am using. When I use the existing posterior update, the sampler does not converge, and gives nonsensical results. When I use the update I added here, it converges quickly to the expected answer.

kaselby avatar Apr 18 '18 14:04 kaselby

I checked with https://en.wikipedia.org/wiki/Conjugate_prior#Continuous_distributions, so I agree that this is correct. Is there a way that we could interchange last Cholesky und inversion and return a Wishart which is parametrized differently?

mschauer avatar May 02 '18 15:05 mschauer

After looking at this further, it seems like the T that is intended to be stored in the Normal-Wishart object here is the inverse scale matrix for the Wishart, rather than the scale matrix itself (at least judging by the implementation of the pdf and posterior functions). In this case, the existing posterior update and pdf calculation for the scale matrix should be retained, and they are correct as is. However, T should be inverted before passing it to the Wishart in the rand function. (Also, this should probably be better documented, as it is not the standard parametrization of the Normal-Wishart)

kaselby avatar May 02 '18 20:05 kaselby

Great! This is the original implementation: https://github.com/JuliaStats/Distributions.jl/pull/140/files Maybe @nfoti remembers

mschauer avatar May 02 '18 21:05 mschauer

I believe I updated the PR to reflect the inverse scale matrix parametrization. It would be good to have someone look it over though to make sure I didn't miss anything. I believe what I have now is the same as the original formulation before my pull request, except for a change to the rand function.

kaselby avatar May 02 '18 21:05 kaselby

After checking again, the pdf function actually did need to be changed for this parametrization. I also fixed a syntax error in logpdf, and commented out a line which called the "isapproxsymmetric" function, which is not a function that appears to exist as far as I can tell.

kaselby avatar May 05 '18 15:05 kaselby

Apologies for making so many commits. I have now gone through and thoroughly tested this and am confident it is now right.

kaselby avatar May 05 '18 16:05 kaselby

Don't worry. Can we cook up a test which shows that this is right and the version on master is wrong?

mschauer avatar May 05 '18 17:05 mschauer

I am working on writing something up, I will hopefully post it later today.

kaselby avatar May 05 '18 18:05 kaselby

One suggestion would be to use in test that the one-dimensional case falls back to the InverseGamma/Gamma distribution, so they should give corresponding results.

mschauer avatar May 06 '18 14:05 mschauer

Thanks for looking at this! I agree that this parametrization is very weird. Might also be worth adding a docstring so that' the parametrization is documented in the help file at least.

kleinschmidt avatar May 21 '18 21:05 kleinschmidt

To be honest I am still not entirely sure how best to approach the problem of how to parametrize this - but no matter how we decide it is best to do it, the current implementation needs to be fixed. I apologize for not getting back to you on the testing issue, I have been very busy recently so haven't had time to look at it.

kaselby avatar May 21 '18 23:05 kaselby

I agree that the parametrization can be fixed later, but tests are critical before we merge. Even something as simple as calling the functions on a few values would have caught many of these bugs... Thanks for your work on this!

kleinschmidt avatar May 30 '18 16:05 kleinschmidt

Also, i'm looking the Normal-Inverse-Wishart and seeing a lot of the same problems (with the old function names, checking for symmetry, etc.). Would you be willing to fix those as well? If not let's merge this after adding some minimal tests (at a minimum just to make sure this code can be executed without errors)

kleinschmidt avatar May 30 '18 16:05 kleinschmidt

ah I found the source of that function (as well as another one that's used by NIW):

https://github.com/JuliaStats/Distributions.jl/blob/2fc223f72a2df4934c47e5701b9afc8119e0448f/src/utils.jl#L117-L137

kleinschmidt avatar May 30 '18 16:05 kleinschmidt

I'll try to get to this today and look over everything again. Sorry I've been dragging my heels, I've just been caught up in other work.

kaselby avatar May 30 '18 17:05 kaselby

No worries! I'm in the same boat :)

kleinschmidt avatar May 30 '18 17:05 kleinschmidt

@krylea I have a bit of time to work on this now actually; is it okay if I push more commits to this PR?

kleinschmidt avatar May 30 '18 19:05 kleinschmidt

Go for it!

kaselby avatar May 31 '18 01:05 kaselby

After taking another look, I believe the alternate parameterization we were considering here is actually the parameterization for the Normal INVERSE Wishart. I returned to the original parameterization, and added the inversions that were missing in the base code.

kaselby avatar Jun 14 '18 16:06 kaselby

What is the status here?

mschauer avatar Sep 06 '18 07:09 mschauer

ping @krylea

matbesancon avatar Jul 10 '19 16:07 matbesancon

@johnczito Would this fall into your area of expertise, be interesting for you?

mschauer avatar Aug 16 '19 10:08 mschauer

Sure. I guess the first thing that comes to mind is that NormalWishart should have a field W that simply stores Wishart(nu, inv(Tchol)). Also, why shouldn't logpdf just be something like this?

function logpdf(nw::NormalWishart, x::Vector{T}, Lam::Matrix{T}) where T<:Real
    Lsym = PDMat(Symmetric(inv(Lam) ./ nw.kappa))
    logpdf(Wishart(nw.nu, inv(nw.Tchol)), Lam) + logpdf(MvNormal(nw.mu, Lsym), x)
end

johnczito avatar Aug 16 '19 12:08 johnczito

^ That at least suggests a unit test you'd want to implement, in addition to comparing against the NormalGamma.

johnczito avatar Aug 16 '19 12:08 johnczito