
add Barker proposal sampler


WIP. For discussion with @danielturek @perrydv before merging.

This adds the Barker proposal sampler as discussed extensively in this NCT issue.

We will need to consider whether to simply add this as another sampler or to replace RW_block as our default block sampler.

See the NCT issue for todo items, primarily testing and documentation.
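
For anyone trying this out, here is a minimal sketch of assigning the new sampler explicitly through the usual configureMCMC interface. The sampler type name 'barker' is an assumption for illustration; the registered name in this PR may differ.

    ## Sketch only: assumes the sampler is registered under the type name 'barker'.
    library(nimble)

    code <- nimbleCode({
        for(i in 1:2) beta[i] ~ dnorm(0, sd = 10)
        for(j in 1:N) y[j] ~ dnorm(beta[1] + beta[2] * x[j], sd = 1)
    })
    model <- nimbleModel(code, constants = list(N = 10, x = rnorm(10)),
                         data = list(y = rnorm(10)), buildDerivs = TRUE)

    conf <- configureMCMC(model)
    conf$removeSamplers('beta')                        ## drop the default samplers on beta
    conf$addSampler(target = 'beta', type = 'barker')  ## assign the Barker proposal sampler
    mcmc <- buildMCMC(conf)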

paciorek avatar Oct 08 '24 15:10 paciorek

I cancelled the workflow since there are no tests to run yet.

paciorek avatar Oct 08 '24 15:10 paciorek

One additional thing to consider is whether, when no parameter transform is needed, to use the simpler code that avoids logDetJacobian and the additional nesting of the gradient-related functions. That code is still in the sampler, as it's how I originally wrote it. It seems to be somewhat faster in the cases I compared (perhaps 10-20%). That said, I suppose the same question could be raised for HMC too.
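
To make the comparison concrete, here is a rough sketch (not the sampler's actual code) of the simpler, no-transform gradient path via nimDerivs, with the transformed path indicated only in comments; 'model' and 'beta' refer to the toy example sketched earlier in this thread.

    ## Rough sketch only, not the code in this PR ('model'/'beta' from the earlier sketch).
    ## Simpler path (no parameter transform): gradient of the log density taken directly.
    calcNodes <- model$getDependencies('beta')
    d <- nimDerivs(model$calculate(calcNodes), wrt = 'beta', order = 1)
    grad <- d$jacobian[1, ]   ## 1 x p jacobian of a scalar log density, i.e. the gradient

    ## Transformed path (what the sampler currently does), conceptually:
    ##   work on an unconstrained scale y = g(x), so the target becomes
    ##   log pi(g^{-1}(y)) + logDetJacobian(y), with the gradient call nested
    ##   inside the transform machinery; that nesting is the extra overhead noted above.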

paciorek avatar Oct 25 '24 18:10 paciorek

I have not implemented the scale/propCov history stuff that is in RW_block. I have implemented setScale and setPropCov.
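
For reference, the RW_block history facilities being referred to look roughly like this (method names as in the RW_block documentation; shown only to clarify what is and isn't implemented here):

    ## Sketch: RW_block adaptation-history facilities, for comparison.
    nimbleOptions(MCMCsaveHistory = TRUE)   ## must be set before building the MCMC
    ## After building and running an MCMC whose first sampler is an RW_block:
    ## mcmc$samplerFunctions[[1]]$getScaleHistory()     ## history of the scale
    ## mcmc$samplerFunctions[[1]]$getPropCovHistory()   ## history of the proposal covariance
    ## The Barker sampler here provides setScale() and setPropCov() but not these history methods.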

paciorek avatar Oct 25 '24 19:10 paciorek

@paciorek

I like the use of the nimble option to control the default multivariate sampler assignment, and also the choice of name: MCMCuseBarkerAsDefaultMV.
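
For the record, using the option is a one-liner before configuring the MCMC (assuming it takes a logical value):

    ## Sketch: opt in to the Barker sampler as the default multivariate sampler.
    nimbleOptions(MCMCuseBarkerAsDefaultMV = TRUE)
    conf <- configureMCMC(model)   ## multivariate targets now get the Barker sampler by default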

If I'm not mistaken, the placement of some braces { and } is off in this PR. Is that the case? If so, I'm going to push some minor changes to the barker branch.

It's definitely fine not to have the facilities for recording the scale and propCov history; that could always be added later if there's a need.

@paciorek where is the "simpler code" you mentioned, avoiding the additional nestedness of the gradient-related functions? I didn't immediately see what you were referring to. If it really does provide a 10-20% speedup when no parameter transforms are used, that's maybe worth considering.

danielturek avatar Oct 28 '24 16:10 danielturek

@danielturek The "simpler" code just uses the jacobian method rather than the gradient method.

I've also pushed testing, so this PR is now essentially complete, pending discussion.

paciorek avatar Dec 09 '24 20:12 paciorek

The test failure is something weird with scipen in test-optim.R, only on R-devel. Not sure if it's some change in R that we'll have to deal with.

In any event @danielturek I think this PR is complete and ready to merge in; let me know of any final comments.

paciorek avatar Dec 11 '24 16:12 paciorek

@paciorek I read through this. I did not sweat the algorithm details per se, and it generally LGTM. Some minor comments:

  • A possibly cleaner alternative to !inherits(propCov[1,1], 'numeric') could be storage.mode(propCov) != "double" (see the small comparison after this list).
  • In the adaptation, it would be computationally appealing if at some point the adaptation simply stopped, so that we don't keep internally recording empirSamp over and over (an issue that applies to other adaptive samplers too).
  • I realize you are operating in a transformed parameter space so shouldn't hit boundaries, but would it still be useful to check if gradCurrent has any invalid values and bail out or error-trap if so?
  • In a similar vein, if any of gradProposal is invalid, then the lpD steps don't need to happen. Would there be any benefit to doing the model$calculateDiff before the gradient(proposal) call, so that if model$calculateDiff yields a rejection due to invalid calculations, the gradient step doesn't need to be done?
  • Also, the split we do in other samplers between the prior and the rest of the calculations could be used here, so that if the prior is invalid we can reject immediately. I realize that "shouldn't" happen in a transformed parameter space, but it would be more failsafe.
  • Also for model$calculateDiff, have you confirmed that gradient(proposal) does not change the cached logProb values in the model (which calculateDiff relies on)? I guess so because otherwise the sampler would probably not be working. In thinking about this I recall that we put care into this aspect of AD behavior but it may not be clearly documented or tested. I am going to make a separate note about that.
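
To make the first bullet concrete, a plain-R comparison of the two checks (example matrices only, independent of the sampler code):

    ## The two type checks from the first bullet:
    m_int <- matrix(1:4, 2, 2)                ## integer storage
    m_dbl <- matrix(c(1, 2, 3, 4), 2, 2)      ## double storage

    !inherits(m_int[1, 1], 'numeric')         ## TRUE  (current style of check)
    storage.mode(m_int) != "double"           ## TRUE  (suggested alternative)

    !inherits(m_dbl[1, 1], 'numeric')         ## FALSE
    storage.mode(m_dbl) != "double"           ## FALSE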

perrydv avatar Dec 12 '24 00:12 perrydv

Thanks @perrydv. I am reluctant to add checks and then bail out of later calculations, because we don't expect failures, so I don't think there would be efficiency gains.

I am adding use of checkLogProb to the sampler once I merge that functionality here, so that should address the questions about invalid gradient values.

As for the cached logProb, I believe I had checked that before, but I just checked again on the carnivores example, and running the gradient never changed the cached logProb values of calcNodes.
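
For completeness, the kind of check involved can be sketched as follows (generic node names from the toy example earlier in the thread, not the carnivores code itself):

    ## Sketch: confirm the AD call leaves the cached logProb values untouched.
    calcNodes <- model$getDependencies('beta')   ## nodes whose logProbs calculateDiff relies on
    lp_before <- model$getLogProb(calcNodes)
    d <- nimDerivs(model$calculate(calcNodes), wrt = 'beta', order = 1)   ## the gradient step
    lp_after  <- model$getLogProb(calcNodes)
    identical(lp_before, lp_after)   ## expected TRUE if caching is undisturbed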

paciorek avatar Dec 13 '24 17:12 paciorek