Feature request: low-rank automatic differentiation variational inference
Summary:
Continues a discussion with @avehtari from here. Distilled down: full-rank ADVI is constrained by memory, because its covariance factor grows quadratically with the number of parameters, while the mean-field approximation can be problematic for certain models. A low-rank implementation would be a sensible and very helpful intermediate. Ong et al. (2017) described one possible implementation.
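To make the memory argument concrete, here is a rough count of variational parameters for each family. This is only a sketch: the function name `variational_parameter_counts` is made up for illustration, and the counts assume a mean and log-scale vector for mean-field, a mean plus a dense Cholesky factor for full-rank, and `mu`, `d`, and a lower-triangular `n x r` factor `B` for low-rank.

```python
def variational_parameter_counts(n, r):
    """Count free variational parameters for n model parameters and rank r.

    Assumed parameterizations (illustrative, not taken from Stan's source):
      mean-field: mean vector + log-scale vector
      full-rank:  mean vector + lower-triangular Cholesky factor
      low-rank:   mu + d + n x r factor B with zeros above the diagonal
    """
    mean_field = 2 * n
    full_rank = n + n * (n + 1) // 2
    low_rank = 2 * n + n * r - r * (r - 1) // 2
    return {"mean_field": mean_field, "full_rank": full_rank, "low_rank": low_rank}


counts = variational_parameter_counts(n=10_000, r=10)
print(counts)
```

For `n = 10000` and `r = 10` the full-rank count is in the tens of millions, while the low-rank count stays linear in `n`.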
Description:
I'll briefly outline the mathematical approach of Ong et al., and leave the Stan-specific implementation details (most of which were kindly outlined in the preceding discussion) for the pull request. To generate a draw from the approximation: if `n` is the dimension of the parameters and `r` is the desired rank of our approximation, we draw `eta = (z, eps)` from the `r + n` dimensional identity Gaussian, so that `z` is `r`-dimensional and `eps` is `n`-dimensional. Then `zeta` is obtained from `eta` by the reparameterization trick with the formula `zeta = mu + Bz + d * eps`, where `*` is the elementwise product, `mu` and `d` are `n`-dimensional, and `B` is `n x r` and constrained to be lower-triangular (its entries above the diagonal are zero). It follows that `zeta` is distributed according to `N(mu, BB^T + diag(d^2))`. `zeta` is then transformed to the model parameters according to ADVI.
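The draw described above can be sketched in a few lines. This is plain Python rather than Stan's C++, and the function names (`transform`, `implied_covariance`, `draw`) and the example values are illustrative assumptions, not part of Stan or of Ong et al.'s code.

```python
import random


def transform(mu, B, d, z, eps):
    """Reparameterization: zeta = mu + B z + d * eps (d * eps is elementwise)."""
    n, r = len(mu), len(z)
    return [
        mu[i] + sum(B[i][j] * z[j] for j in range(r)) + d[i] * eps[i]
        for i in range(n)
    ]


def implied_covariance(B, d):
    """Covariance of zeta under the approximation: B B^T + diag(d^2)."""
    n, r = len(d), len(B[0])
    return [
        [
            sum(B[i][k] * B[j][k] for k in range(r)) + (d[i] ** 2 if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def draw(mu, B, d, rng):
    """Draw eta = (z, eps) from a standard Gaussian and map it to zeta."""
    n, r = len(mu), len(B[0])
    z = [rng.gauss(0.0, 1.0) for _ in range(r)]
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return transform(mu, B, d, z, eps)


# Example with n = 3, r = 2; B has zeros above the diagonal.
mu = [0.0, 1.0, 2.0]
B = [[1.0, 0.0], [0.5, 2.0], [-1.0, 0.3]]
d = [0.1, 0.2, 0.3]

zeta = draw(mu, B, d, random.Random(0))
cov = implied_covariance(B, d)
```

Only `mu`, `B`, and `d` need to be stored and differentiated; the dense `n x n` covariance computed by `implied_covariance` is shown here for checking and would not be formed in practice.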
Additional info:
I've started working on an implementation and will open a PR now.
Current Version:
v2.19.1