ConjugatePriors.jl
ConjugatePriors.jl copied to clipboard
Fixed posterior for NormalWishart.
The posterior calculations for most of this package are based on the paper Conjugate Bayesian analysis of the Gaussian distribution by Kevin Murphy. This paper has several typos related to the Normal-Wishart distribution. The ConjugatePriors package fixes the typos related to the pdf of the Normal-Wishart (though I would note that right now the pdf function will not run without errors for this distribution, as it relies on a function called IsApproxSymmetric that does not appear to exist - but that is a separate issue), but it does not fix the typos in the posterior. The paper cites the posterior update for the scale matrix as being: T_n = T_0 + S + k_0/k_n * n * (mu_0 - xhat)(mu0-xhat)' The correct update is inv(T_n) = inv(T_0) + S + k_0/k_n * n * (mu_0 - xhat)(mu0-xhat)' See https://stats.stackexchange.com/questions/153241/derivation-of-normal-wishart-posterior/324925#324925 for details.
Just a comment that I have tested this with a MCMC sampler I am using. When I use the existing posterior update, the sampler does not converge, and gives nonsensical results. When I use the update I added here, it converges quickly to the expected answer.
I checked with https://en.wikipedia.org/wiki/Conjugate_prior#Continuous_distributions, so I agree that this is correct. Is there a way that we could interchange last Cholesky und inversion and return a Wishart which is parametrized differently?
After looking at this further, it seems like the T that is intended to be stored in the Normal-Wishart object here is the inverse scale matrix for the Wishart, rather than the scale matrix itself (at least judging by the implementation of the pdf and posterior functions). In this case, the existing posterior update and pdf calculation for the scale matrix should be retained, and they are correct as is. However, T should be inverted before passing it to the Wishart in the rand function. (Also, this should probably be better documented, as it is not the standard parametrization of the Normal-Wishart)
Great! This is the original implementation: https://github.com/JuliaStats/Distributions.jl/pull/140/files Maybe @nfoti remembers
I believe I updated the PR to reflect the inverse scale matrix parametrization. It would be good to have someone look it over though to make sure I didn't miss anything. I believe what I have now is the same as the original formulation before my pull request, except for a change to the rand function.
After checking again, the pdf function actually did need to be changed for this parametrization. I also fixed a syntax error in logpdf, and commented out a line which called the "isapproxsymmetric" function, which is not a function that appears to exist as far as I can tell.
Apologies for making so many commits. I have now gone through and thoroughly tested this and am confident it is now right.
Don't worry. Can we cook up a test which shows that this is right and the version on master is wrong?
I am working on writing something up, I will hopefully post it later today.
One suggestion would be to use in test that the one-dimensional case falls back to the InverseGamma/Gamma distribution, so they should give corresponding results.
Thanks for looking at this! I agree that this parametrization is very weird. Might also be worth adding a docstring so that' the parametrization is documented in the help file at least.
To be honest I am still not entirely sure how best to approach the problem of how to parametrize this - but no matter how we decide it is best to do it, the current implementation needs to be fixed. I apologize for not getting back to you on the testing issue, I have been very busy recently so haven't had time to look at it.
I agree that the parametrization can be fixed later, but tests are critical before we merge. Even something as simple as calling the functions on a few values would have caught many of these bugs... Thanks for your work on this!
Also, i'm looking the Normal-Inverse-Wishart and seeing a lot of the same problems (with the old function names, checking for symmetry, etc.). Would you be willing to fix those as well? If not let's merge this after adding some minimal tests (at a minimum just to make sure this code can be executed without errors)
ah I found the source of that function (as well as another one that's used by NIW):
https://github.com/JuliaStats/Distributions.jl/blob/2fc223f72a2df4934c47e5701b9afc8119e0448f/src/utils.jl#L117-L137
I'll try to get to this today and look over everything again. Sorry I've been dragging my heels, I've just been caught up in other work.
No worries! I'm in the same boat :)
@krylea I have a bit of time to work on this now actually; is it okay if I push more commits to this PR?
Go for it!
After taking another look, I believe the alternate parameterization we were considering here is actually the parameterization for the Normal INVERSE Wishart. I returned to the original parameterization, and added the inversions that were missing in the base code.
What is the status here?
ping @krylea
@johnczito Would this fall into your area of expertise, be interesting for you?
Sure. I guess the first thing that comes to mind is that NormalWishart should have a field W that simply stores Wishart(nu, inv(Tchol)). Also, why shouldn't logpdf just be something like this?
function logpdf(nw::NormalWishart, x::Vector{T}, Lam::Matrix{T}) where T<:Real
Lsym = PDMat(Symmetric(inv(Lam) ./ nw.kappa))
logpdf(Wishart(nw.nu, inv(nw.Tchol)), Lam) + logpdf(MvNormal(nw.mu, Lsym), x)
end
^ That at least suggests a unit test you'd want to implement, in addition to comparing against the NormalGamma.