[FR] low rank HMC?
Summary:
It would be nice to have versions of HMC/NUTS that support a low-rank plus diagonal metric a la L-BFGS.
This will require the following.
- low_rank_e_metric and low_rank_e_point, like dense_e_metric.hpp and dense_e_point.hpp, which would use the code from pathfinder for getting the inverse Hessian from here https://github.com/stan-dev/stan/blob/feature/multi-path/src/stan/services/pathfinder/single.hpp#L444
- low_rank_e_nuts and adapt_low_rank_e_nuts classes, like diag_e_nuts.hpp and adapt_diag_e_nuts.hpp
There's more information about this in @bbbales2's thesis, and the idea is refined in a joint arXiv paper with @pourzanj and @avehtari.
Current Version:
v2.29.0
“Low rank” isn’t going to be precise enough for a single signature. In addition to all of the possible ranks between 1 and N there are multiple low-rank patterns that one might consider. Moreover each of these patterns would require an appropriate adaptation routine to match to the global covariance.
On Feb 17, 2022, at 12:12 PM, Steve Bronder @.***> wrote:
Summary:
Making an issue for a thing brought up in the Stan meeting. It would be nice to have a low rank method for NUTS. From the meeting we sorted out this involves adding these parts so far and will update it once we discuss it more.
- low_rank_e_metric and low_rank_e_point like dense_e_metric.hpp https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/hamiltonians/dense_e_metric.hpp and dense_e_point.hpp https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/hamiltonians/dense_e_point.hpp which would use the code from pathfinder for getting the inverse hessian from here https://github.com/stan-dev/stan/blob/feature/multi-path/src/stan/services/pathfinder/single.hpp#L444
- low_rank_e_nuts and adapt_low_rank_e_nuts class like diag_e_nuts.hpp https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/nuts/diag_e_nuts.hpp and adapt_diag_e_nuts.hpp https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/nuts/adapt_diag_e_nuts.hpp
I believe Steve's suggesting the specific factor-based representation of a symmetric positive definite matrix as
Sigma = u' * v + diag(w)
where u and v are K x N matrices and w is an N-vector.
I agree with @betanalpha that this is a hornet's nest of design if we want to make it more general. I can think of at least two other basic structures that might be worth considering,
- block-diagonal, and
- HODLR (hierarchically off-diagonal low-rank).
And then we can add extra irregular sparse structure on top of either of these by addition, perhaps transformed from a sparse symmetric representation using the matrix exponential.