Use `PDMat` variates for `LKJ`, `Wishart` and `InverseWishart`
We have LKJCholesky as a distribution of Cholesky factors such that the corresponding unfactorized matrix follows the LKJ distribution of the same parameters. This is convenient since (copied from the docstring)
Sampling `LKJCholesky` is faster than sampling `LKJ`, and often having the correlation
matrix in factorized form makes subsequent computations cheaper as well.
The same would be true for e.g. Wishart and InverseWishart (ref https://arxiv.org/abs/2310.15884).
However, one drawback of LKJCholesky is that many statistics of interest such as mean etc. could only be defined if it is identified with the corresponding LKJ distribution: https://github.com/JuliaStats/Distributions.jl/pull/1938#pullrequestreview-2550359523
If you're actually interested in the LKJ (or Wishart or InverseWishart) distribution AND would like variates to be readily (and efficiently!) factorized, IMO a better approach would be to use PDMat variates, i.e., to directly sample the Cholesky factors (as in LKJCholesky) but return the unfactorized matrix (with readily available factorization) as a PDMat.
I like this plan. Since changing the variate-type of LKJ would be breaking, a less satisfactory but still non-breaking way would be to 1) add PDMatVariate 2) add LKJPDMat, and 3) deprecate LKJCholesky. In the next breaking release, LKJPDMat could be renamed to LKJ, and LKJCholesky could be removed.
This proposal would supersede #1787
@devmotion what do you think of the non-breaking intermediate proposal in https://github.com/JuliaStats/Distributions.jl/issues/1939#issuecomment-2596335115?
I think it would be good to make these improvements right away, without the delay and discussions around a breaking release.
Okay, I can put together a PR. A few thoughts:
- We need to decide what measure the
logpdfis defined wrt to. Typically for PD matrices, the log-density is defined wrt one of the triangles or the strict triangle, while typically for Cholesky factors the log-density is defined wrt one of the triangles or strict triangles of the Cholesky factor. We currently use the former forLKJand the latter forLKJCholesky. I'm inclined to use the former forLKJPDMatand friends. - While we should support
PDMatin the first implementation, it might be useful to later also when possible implement methods forScalMat,PDiagMat,PDSparseMat, etc.
Should AbstractPDMatVariate subtype MatrixVariate or be a separate subtype of VariateForm?