pymc
Improve support for `dims` in `LKJCholeskyCov`
What is this PR about?
There are currently several pain points when using labeled dims with `LKJCholeskyCov`:
- The distribution samples 2d matrices, but passing a pair of dimensions results in an error, because internally the distribution is represented in packed lower-triangular form.
- Any dimensions given to `LKJCholeskyCov` are not propagated to the internally generated deterministics, `{name}_std` and `{name}_corr`. A user who wants these labeled has to pass a long dictionary to `idata_kwargs`.
- After sampling, the 1's on the diagonal of every sample drawn from `{name}_corr` cause an error in `arviz` when computing within-chain variance.
This PR tries to correct all three of these. Here is an example model under this PR:
import numpy as np
import pymc as pm
from string import ascii_uppercase

n = 3
n_obs = 100

# Simulate observations from a multivariate normal with a random covariance
mean = np.zeros(n)
L = np.random.normal(size=(n, n))
cov = L @ L.T
data = np.random.multivariate_normal(mean=mean, cov=cov, size=(n_obs,))

with pm.Model(
    coords={'dim': list(ascii_uppercase[:n]), 'dim_aux': list(ascii_uppercase[:n])},
    coords_mutable={'obs_idx': np.arange(n_obs, dtype='int')},
) as mod:
    sd_dist = pm.Exponential.dist(1)
    # Two dims are passed, matching the 2d matrix the distribution represents
    chol, *_ = pm.LKJCholeskyCov('chol', n=n, sd_dist=sd_dist, eta=1, dims=['dim', 'dim_aux'])
    obs = pm.MvNormal('obs', mu=0, chol=chol, observed=data, dims=['obs_idx', 'dim'])
    idata = pm.sample()
First, I pass two dimensions to `LKJCholeskyCov`, one for the columns and one for the rows. This corresponds to the expectation that I am drawing from a matrix-valued random variable.
Internally, I take the Cartesian product of these two dims and use the lower triangle of the resulting matrix to make and register a new coordinate: `packed_tril_{name}`. This is then set as the dims on `packed_chol`.
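As a rough illustration of that step, the sketch below builds one label per lower-triangular entry from two lists of coordinate labels. It is not the code in this PR: the helper name `packed_tril_labels` and the `"row_col"` label format are assumptions made for the example, and it uses only the standard library. The entries are visited row by row, which is the same order `np.tril_indices(n)` produces.

```python
from itertools import product
from string import ascii_uppercase


def packed_tril_labels(row_labels, col_labels):
    """Return one label per lower-triangular entry (row index >= column index).

    Hypothetical helper for illustration only; the "row_col" label format
    is an assumption, not what the PR registers.
    """
    if len(row_labels) != len(col_labels):
        raise ValueError("both dims must have the same length")
    # Cartesian product of the two dims, iterated row by row
    pairs = product(enumerate(row_labels), enumerate(col_labels))
    # Keep only the entries on or below the diagonal
    return [f"{r}_{c}" for (i, r), (j, c) in pairs if i >= j]


n = 3
labels = packed_tril_labels(ascii_uppercase[:n], ascii_uppercase[:n])
print(labels)  # ['A_A', 'B_A', 'B_B', 'C_A', 'C_B', 'C_C']
```

For an `n x n` matrix this gives `n * (n + 1) / 2` labels, which matches the length of the packed lower-triangular vector.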
Next, only the upper triangle (excluding the diagonal) of the correlation matrix is stored in a deterministic. Another new coordinate is registered: `corr_{name}`.
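The motivation for dropping the diagonal can be seen in a small standalone sketch. It assumes nothing about PyMC or arviz internals: `random_corr` is a hypothetical helper that builds correlation matrices from random factors, standing in for posterior draws. Across draws, the diagonal entries are constant and so have zero variance, while the strict upper triangle carries all the varying information.

```python
import math
import random
import statistics


def random_corr(n, rng):
    """Hypothetical helper: a random n x n correlation matrix from cov = F F^T."""
    factor = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    cov = [
        [sum(factor[i][k] * factor[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    # The diagonal of a correlation matrix is exactly 1 by definition
    return [
        [1.0 if i == j else cov[i][j] / (sd[i] * sd[j]) for j in range(n)]
        for i in range(n)
    ]


rng = random.Random(0)
n, n_draws = 3, 200
draws = [random_corr(n, rng) for _ in range(n_draws)]

# Variance across draws of each diagonal entry: always zero
diag_var = [statistics.pvariance([d[i][i] for d in draws]) for i in range(n)]

# Index pairs of the strict upper triangle (diagonal excluded)
upper = [(i, j) for i in range(n) for j in range(i + 1, n)]
upper_var = [statistics.pvariance([d[i][j] for d in draws]) for i, j in upper]

print(diag_var)    # all zeros
print(len(upper))  # n * (n - 1) / 2 entries
```

Any diagnostic that divides by a within-chain variance is undefined for the zero-variance diagonal entries, so keeping only the `n * (n - 1) / 2` off-diagonal entries avoids feeding it degenerate input.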
Finally, the first dim is used to add a labeled dimension to `{name}_std`.
This results in the following graph:
Here is the result plotted with `az.plot_trace`:
This PR still needs a bit of work, including:
- Unit tests for the new functionality
- Documentation
- The generated dimensions are added in a "hacky" way; I hope this can be improved
- The names on the "packed" dimensions of chol and cov are not great. It would be nice if a MultiIndex could be specified here, but I don't think it's currently possible without some changes (see here, but maybe this is out of date?)
It's also possible that problem (3) belongs on the arviz side of things and should be fixed there instead of here. But in that case, it would still be nice to propagate the matrix dims to the full square correlation matrix.
Because of these points, I'm marking this as a draft PR. But I would still like feedback on the idea of automatically generating coords, or at least on how dim handling can be improved in `LKJCholeskyCov`.
Checklist
- [ ] Explain important implementation details 👆
- [ ] Make sure that the pre-commit linting/style checks pass.
- [ ] Link relevant issues (preferably in nice commit messages)
- [ ] Are the changes covered by tests and docstrings?
- [ ] Fill out the short summary sections 👇
Major / Breaking Changes
- None
New features
- Allow `LKJCholeskyCov` to generate and register new model `coords` corresponding to the distributions it internally registers
Bugfixes
- Allows plotting of the generated correlation matrix in `az.plot_trace`, but that might not be something that should be fixed on the PyMC side. See discussion above.
Documentation
- None
Maintenance
- None
:books: Documentation preview :books:: https://pymc--6828.org.readthedocs.build/en/6828/