Distributions.jl Use `PDMat` variates for `LKJ`, `Wishart` and `InverseWishart`

We have LKJCholesky as a distribution of Cholesky factors such that the corresponding unfactorized matrix follows the LKJ distribution of the same parameters. This is convenient since (copied from the docstring)

Sampling `LKJCholesky` is faster than sampling `LKJ`, and often having the correlation
matrix in factorized form makes subsequent computations cheaper as well.

The same would be true for e.g. Wishart and InverseWishart (ref https://arxiv.org/abs/2310.15884).

However, one drawback of LKJCholesky is that many statistics of interest such as mean etc. could only be defined if it is identified with the corresponding LKJ distribution: https://github.com/JuliaStats/Distributions.jl/pull/1938#pullrequestreview-2550359523

If you're actually interested in the LKJ (or Wishart or InverseWishart) distribution AND would like variates to be readily (and efficiently!) factorized, IMO a better approach would be to use PDMat variates, i.e., to directly sample the Cholesky factors (as in LKJCholesky) but return the unfactorized matrix (with readily available factorization) as a PDMat.

Jan 16 '25 14:01 devmotion

I like this plan. Since changing the variate-type of LKJ would be breaking, a less satisfactory but still non-breaking way would be to 1) add PDMatVariate 2) add LKJPDMat, and 3) deprecate LKJCholesky. In the next breaking release, LKJPDMat could be renamed to LKJ, and LKJCholesky could be removed.

Jan 16 '25 17:01 sethaxen

This proposal would supersede #1787

Jan 16 '25 17:01 sethaxen

@devmotion what do you think of the non-breaking intermediate proposal in https://github.com/JuliaStats/Distributions.jl/issues/1939#issuecomment-2596335115?

Apr 23 '25 12:04 sethaxen

I think it would be good to make these improvements right away, without the delay and discussions around a breaking release.

Apr 23 '25 12:04 devmotion

Okay, I can put together a PR. A few thoughts:

We need to decide what measure the logpdf is defined wrt to. Typically for PD matrices, the log-density is defined wrt one of the triangles or the strict triangle, while typically for Cholesky factors the log-density is defined wrt one of the triangles or strict triangles of the Cholesky factor. We currently use the former for LKJ and the latter for LKJCholesky. I'm inclined to use the former for LKJPDMat and friends.
While we should support PDMat in the first implementation, it might be useful to later also when possible implement methods for ScalMat, PDiagMat, PDSparseMat, etc.

Apr 23 '25 12:04 sethaxen

Should AbstractPDMatVariate subtype MatrixVariate or be a separate subtype of VariateForm?

Apr 24 '25 16:04 sethaxen