AdvancedMH.jl
AdvancedMH.jl copied to clipboard
Adaptive Metropolis algorithm
I've implemented a simple version of the standard AM algorithm as a small extension to #39 (this PR is independent of #39). The pull request features a multivariate Gaussian proposal which uses a rescaled version of the posterior covariance (Haario et al.'s famous 2.38 rule) + a small epsilon term for positive-definiteness and initial exploration. In order to update the proposal I had to create a new sampler class, AMSampler
, which is basically MetropolisHastings
but updates the proposal at each sampling step via the trackstep
function. The default implementation of trackstep
does nothing.
The update mechanism is slightly different from that in #39: this one performs adaptation in AbstractMCMC.step
whereas #39 does it in propose
. I think the new AMSampler
class should be able to handle both, and since it consists of minimal modifications to the original MetropolisHastings
it could provide a general interface for adaptive MH algorithms.
Notes:
- Proposals are shared between threads - how can this be changed?
- I haven't implemented all overloads of
propose
forAMSampler
but this should not be an issue. - This PR would benefit from #38 to make the
trackstep
interface more uniform. - The test example is minimal, I would welcome any suggestions!
Is it possible to just add a new proposal and extend the existing sampler, if needed, but not add a new sampler? I think this would simplify the code and also be easier if one wants to combine different proposals.
I am also wondering if trackstep
is needed at all - maybe one could just update the proposal right before proposing a new sample in the propose
step?
Is it possible to just add a new proposal and extend the existing sampler, if needed, but not add a new sampler? I think this would simplify the code and also be easier if one wants to combine different proposals.
I didn't want to modify existing code in order to not break something but the changes could be trivially implemented in MetropolisHastings! I can move things there if that'd be better.
I am also wondering if
trackstep
is needed at all - maybe one could just update the proposal right before proposing a new sample in thepropose
step?
I wanted to leave users the option of querying the sampler freely for new proposals without changing the state, hence why I thought updating the proposal in step
would be a good idea. Furthermore the accept/reject decision is taken in step
, so the proposed samples might not end up in the chain - the AM algorithm only uses samples in the chain to do its adaptation.
Furthermore the accept/reject decision is taken in step, so the proposed samples might not end up in the chain - the AM algorithm only uses samples in the chain to do its adaptation.
My suggestion was not to update the proposal based on the proposed sample but on the previous sample - this should always be the accepted sample. And since the previous sample is available in the propose
step, I thought this could be sufficient.
I see, so like the approach in #39 ! I personally think having a separate callback interface for adaptation would be cleaner, and being able to track acceptances/rejections might be useful for some algorithms. And compared to the accepted!
callback in #39 this should be a bit more consistent/flexible.
I'm not strictly against adding such a function but there is one major problem: transitions contain all parameters but only for a subset of them (e.g. just one parameter) one might want to use the adaptive proposal. Thus in general it is not correct to call trackstep
with the AMProposal
and a Transition
, it should be called only with the corresponding sample.
One way to achieve this is to handle it in propose
. Another one would be to iterate through samples and proposals.
In particular due to these mixed cases, I think it would be better to only add a new proposal but not a new sampler.
Another advantage of handling it in the propose
step is that you don't have to construct the MvNormal
proposal, which probably is not very efficient.
Slightly unrelated thought:
being able to track acceptances/rejections might be useful
I think probably one would not want to keep track of these statistics in a separate trackstep
function - otherwise every proposal would have to reimplement the tracking functionality. Probably it would be better to hardcode it in the Transition
object and the implementation of step
, and then possibly forward it to the proposals if needed.
Slightly unrelated thought:
being able to track acceptances/rejections might be useful
I think probably one would not want to keep track of these statistics in a separate
trackstep
function - otherwise every proposal would have to reimplement the tracking functionality. Probably it would be better to hardcode it in theTransition
object and the implementation ofstep
, and then possibly forward it to the proposals if needed.
I agree, which is why I think implementing #38 would be useful! It shouldn't be a big change but probably belongs in another pull request
I'm not strictly against adding such a function but there is one major problem: transitions contain all parameters but only for a subset of them (e.g. just one parameter) one might want to use the adaptive proposal. Thus in general it is not correct to call trackstep with the AMProposal and a Transition, it should be called only with the corresponding sample.
One way to achieve this is to handle it in propose. Another one would be to iterate through samples and proposals.
I have just merged AMSampler
into MetropolisHastings
. I haven't seen multiple independent adaptive proposals in the wild, but this implementation just calls trackstep for every proposal in case of AbstractArray
s or NamedTuple
s. Note that acceptances/rejections are not determined by a single proposal, but by a combination of all of them so having more than one adaptive proposal might not work very well.
Any way we could reuse the Welford implementation from AdvancedHMC?
By the way, it would be easy to combine #39 with this pull request, but we should maybe fix nomenclature for the methods as the names almost overlap - any suggestions?
Any way we could reuse the Welford implementation from AdvancedHMC?
Only if it's moved to a separate package I assume.
By the way - and this is embarassing - #39 already implemented pretty much the same adaptive algorithm. I didn't realise this until now, so this PR is a bit of a duplicate really. Apologies to @arzwa.
There's a subpackage called Adaptation
in AdvancedHMC
- for now we could probably borrow code from there (this should be permitted by the BSD-3 and MIT Licences). So we could merge that computation of the sample covariance matrix with this and #39.
One idea would be basically copying the interface in AdvancedHMC
, ie. having an abstract Adaptor
struct with initialize!(adaptor, proposal, initial_sample, model)
and update!(adaptor, proposal, sample_prev, sample, model, accepted)
. That way we could avoid creating new Proposal
structs and cleanly separate the adaptation logic.
How about moving the AdvancedHMC mass matrix adaption code into AbstractMCMC.Adaption
submodule?
How about moving the AdvancedHMC mass matrix adaption code into
AbstractMCMC.Adaption
submodule?
It might make sense to define an abstract Adaptation
interface in AbstractMCMC
. As far as I know AdvancedHMC
does not use AbstractMCMC
though.
There is a PR in progress - see https://github.com/TuringLang/AdvancedHMC.jl/pull/259
By the way - and this is embarassing - #39 already implemented pretty much the same adaptive algorithm. I didn't realise this until now, so this PR is a bit of a duplicate really. Apologies to @arzwa.
No problem at all, that PR got stalled due to myself not really finding the time to finish it off decently (it worked for what I needed, and I got sidetracked).
I created a new branch adaptive
(available on my GitHub) which implements a more generic interface (I don't want to spam PRs).
Main changes:
-
MetropolisHastings
has a second field namedadaptor
-
AbstractAdaptor{ProposalType}
as an abstract type for adaptors with methodsinitialize!
andupdate!
-
NoAdaptor{PT} <: AbstractAdaptor{PT}
as a default no-op adaptor - Adaptors can be combined for arrays/named tuples of proposals (yielding an array/named tuple of adaptors)
-
AMAdaptor
for symmetric MvNormal random walk proposals
Not sure if the type interface is the cleanest, I would appreciate any feedback!
Note: I'm not sure if we really want to support multiple adaptives for arrays/tuples of proposals, this feature could be removed...
Intuitively I would have thought adaptor
should be part of an AdaptiveProposal
but not a general property of MetropolisHastings
. That seems a bit cleaner but maybe you noticed some advantages if it is part of the sampler?
Not really to be honest! I tried to see if it would be cleaner but at least my new "solution" is more complicated than what we have already. Unless you or @cpfiffer see any advantages in separating out the adaptor logic (I don't really) we can just keep it as it is.
Unless you or @cpfiffer see any advantages in separating out the adaptor logic (I don't really) we can just keep it as it is.
For right now, I think it's fine to just go as-is. We can add in more general adaptation interfaces later on.
I have uploaded a new version that uses Welford's algorithm. The code now more or less covers the AdaptiveMvNormal
features in #39 . The current adaptation interface is via trackstep!
(added a bang).
Cool. I like it. I've turned on the test suite so we can see how the tests run.
Cool. I like it. I've turned on the test suite so we can see how the tests run.
There was an error due to my using a different version of LinearAlgebra.jl, the AdaptiveMH
test should now run correctly (hopefully)
I'll approve for now. Could you add a little bit of usage info to the README file?
@arzwa We should also incorporate your univariate adaptive proposal. Any suggestions for a name? We can also change AMProposal
.
@kaandocal Are you still interested in getting this PR merged?
We should probably get this PR out of the way. I did not pursue this much further because I had trouble getting it to work on my sampling problems - this approach needs a lot of samples to get a decent estimate of the covariance matrix and seems quite fickle in practice. Due to the nature of MH it just takes a while to explore the target distribution even if you have a decent covariance matrix... AdvancedHMC
does a fantastic job at adaptation in my experience, and this is nowhere near that. While it is quite different @ParadaCarleton I do recommend having a look at Emcee in the meanwhile.
We should probably get this PR out of the way. I did not pursue this much further because I had trouble getting it to work on my sampling problems - this approach needs a lot of samples to get a decent estimate of the covariance matrix and seems quite fickle in practice. Due to the nature of MH it just takes a while to explore the target distribution even if you have a decent covariance matrix...
AdvancedHMC
does a fantastic job at adaptation in my experience, and this is nowhere near that. While it is quite different @ParadaCarleton I do recommend having a look at Emcee in the meanwhile.
Yep, I totally get that.
The reason an AMH sampler would be useful is less to do with me wanting to use AMH itself, and more to do with AMH being a useful building block in other algorithms. I think @fipelle mentioned an interest in this earlier?