Optimise performance of NUTS
The current implementation of pints.NoUTurnMCMC is at least 10 times slower than pints.HamiltonianMCMC on a problem I tried (a 5-dimensional parameter space with default settings), which probably renders this implementation impractical for most problems.
I believe we can speed NUTS up to almost C-like speed by compiling the internal updating methods (or even the entire helper class) of the sampler with Numba. I am not sure whether this would still allow us to use the "yield" API, but maybe that is worth sacrificing for significant speed-ups?
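To illustrate the idea, here is a minimal sketch of compiling a numerical kernel with Numba. The `leapfrog` function below is a hypothetical stand-in for the sampler's internal update (hard-coded to a standard-Gaussian target, where the gradient of the potential is just `q`), not the actual pints internals. Because the compiled kernel is a pure function, the Python-level generator ("yield") interface could in principle stay as-is and simply call into it:

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fall back to plain Python if Numba is absent, so the sketch still runs.
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda f: f


@njit(cache=True)
def leapfrog(q, p, step_size, n_steps, inv_mass):
    # Leapfrog trajectory for a standard-Gaussian potential U(q) = q.q / 2,
    # so grad U(q) = q (hypothetical stand-in for the model gradient).
    p = p - 0.5 * step_size * q           # initial half momentum step
    for _ in range(n_steps - 1):
        q = q + step_size * inv_mass * p  # full position step
        p = p - step_size * q             # full momentum step
    q = q + step_size * inv_mass * p      # final position step
    p = p - 0.5 * step_size * q           # final half momentum step
    return q, p
```

The first call pays a compilation cost; subsequent calls run as machine code, which is where the speed-up would come from for the tree-building inner loop.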
Hi David. Did you use parameter transforms for NUTS? We’ve found that NUTS runs much faster when using them (which is what Stan does)...
Hi Ben :) Yep I applied the usual log-transforms. Do you remember which transforms you used?
Interesting. I’m not sure what your priors are but if they’re uniform you could try the boundary transformation? Do you know how many steps the NUTS algorithm is taking?
I am using Gaussian priors truncated at 0. The boundary at zero should not be a problem because of the log-transforms, no? I am able to infer the posteriors with both HMC and NUTS; it's just that NUTS is quite slow. Do you think transformations could yield any speed-up? I was thinking this was more because of the more involved proposal computation in NUTS? I mean, even in the example notebook with a 2d Gaussian, NUTS is almost 5 times slower: https://github.com/pints-team/pints/blob/master/examples/sampling/nuts-mcmc.ipynb.
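For reference, the reason the log-transform removes the boundary at zero: sampling in eta = log(theta) puts the chain on all of R^d, and the target picks up a log-Jacobian term sum(eta). A minimal sketch in plain NumPy (the hypothetical `log_posterior` is a placeholder density, not the PINTS transformation API):

```python
import numpy as np

def log_posterior(theta):
    # Hypothetical unnormalised log-posterior on theta > 0:
    # a Gaussian truncated at zero (MCMC only needs density ratios,
    # so the truncated normalising constant is omitted).
    if np.any(theta <= 0):
        return -np.inf
    return float(-0.5 * np.sum((theta - 1.0) ** 2))

def log_posterior_unconstrained(eta):
    # Same density after the change of variables theta = exp(eta):
    # add the log-Jacobian log|d theta / d eta| = sum(eta) so the
    # transformed chain targets the correct distribution on R^d.
    theta = np.exp(eta)
    return log_posterior(theta) + float(np.sum(eta))
```

A sampler run on `log_posterior_unconstrained` never sees the boundary, which is essentially what Stan does for constrained parameters.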
Yep, the boundary at zero should be fine. When you say “slower”, it’d be good to know if this was in terms of effective samples per second and/or ESS per unit time...?
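The point about effective samples matters because NUTS typically spends many gradient evaluations per iteration, so raw iteration counts are misleading. A rough sketch of how one could compare samplers on ESS per second (a simplified autocorrelation-based estimator that truncates at the first non-positive term, not the estimator PINTS itself uses):

```python
import numpy as np

def ess(chain):
    # Rough effective sample size for a 1-D chain: n / (1 + 2 * sum(rho_k)),
    # truncating the autocorrelation sum at the first non-positive term.
    # (A simplification of the initial-positive-sequence estimator.)
    x = np.asarray(chain, dtype=float)
    n = len(x)
    x = x - x.mean()
    acov = np.correlate(x, x, mode='full')[n - 1:] / n  # autocovariances
    rho = acov / acov[0]                                # autocorrelations
    tau = 1.0
    for k in range(1, n):
        if rho[k] <= 0:
            break
        tau += 2.0 * rho[k]
    return n / tau

# ESS per second for a sampler is then ess(chain) / wall_clock_seconds,
# which is the fair basis for an HMC-vs-NUTS comparison.
```

An i.i.d. chain gives ESS close to its length, while a strongly autocorrelated chain gives a much smaller value, so a slower sampler can still win if it mixes much better per iteration.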
I'll try to run NUTS for a little while to give you a better estimate of the ESS, but just to give you an idea: running 2000 iterations of HMC takes about the same time as running 20 iterations of NUTS:

![Screen Shot 2021-03-08 at 11 33 50 AM](https://user-images.githubusercontent.com/20031982/110315770-ffd2d400-8009-11eb-9aa4-6756e07f6f29.png)
Ha, interesting! Yes, that may be fairly condemning. Your example may be a good one for the paper as it illustrates how we can easily switch between samplers in PINTS. Perhaps bring it up tomorrow in the meeting?
Even if NUTS has a good ESS-per-iteration ratio, the adaptation just takes too long (the minimum number of adaptation steps is currently 150, which on 8 cores in parallel translates to roughly 20 minutes for just the warm-up). That's crazy for 5 dimensions :D
Exactly! The same problem runs to full convergence with ACMC in 2 minutes!
@DavAug @ben18785 Do you feel this is an issue with PINTS that needs resolving, or can we close/move this discussion?
Yes, happy to close this!