BayesianTools icon indicating copy to clipboard operation
BayesianTools copied to clipboard

Adaptive MCMC with parallel tempering

Open AntonBiryukovUofC opened this issue 6 years ago • 9 comments

Hello Florian,

Thank you for open-sourcing such a great library that I am hoping to use for my research in theoretical seismology. I was wondering if i could ask a potentially dumb question here without the minimum example.

I am trying to solve the problem of seismic velocity tomography using MCMC. I have a fairly costly likelihood function that I developed in Fortran, and now it is not so costly anymore. I have also a wrapper function for it to call from R and get the vector of predicted data. So far I have used R package adaptMCMC to sample with the adaptive MH, and it has been successful to a certain degree of model complexity.

In certain situations I have a multimodal posterior I want to sample from, and the adaptive MH tends to "under"-explore the second / third mode. I am curious if I could use your package to carry out adaptive parallel tempered MCMC. It seems like i need to take this:

settings <- list(iterations = iter, adapt = T, DRlevels = 1, gibbsProbabilities = NULL, temperingFunction = NULL, optimize = T, message = FALSE) out <- runMCMC(bayesianSetup = bayesianSetup, sampler = "Metropolis", settings = settings)

and pass it some sort of tempering function, like the one from your example:

temperingFunction <- function(x) 5 * exp(-0.01*x) + 1

I would sincerely appreciate if you let me know whether that's a meaningful thing to do within BT, or whether there are some alternative combinations of settings / samplers that could perhaps perform as well or better (smc maybe?).

Thanks in advance,

Anton.

AntonBiryukovUofC avatar Jul 27 '17 04:07 AntonBiryukovUofC

Hi Anton,

especially with a bimodal target, I would recommend using the DEzs or DREAMzs samplers. The tempering as it is implemented at the moment won't help much, as you will eventually still get stuck in one of the modes. Note that if your fortran model is loaded as a library, you should turn on the parallelization for these samplers, this will speed up the runs.

Feel free to let me know if that improves the results.

Best, F

florianhartig avatar Jul 27 '17 21:07 florianhartig

Thanks Florian for such a quick response ! Will try in the next couple days and report from the field :+1: The fortran model is called via .Fortran(myfunction, my args) wrapped in an R function that is called in the likelihood; and would definitely prefer parallelization of the samplers rather than the Fortran forward model (at least for now).

AntonBiryukovUofC avatar Jul 27 '17 21:07 AntonBiryukovUofC

I was wondering if you could comment a little bit on the output. For example, I passed a N-by-Nparam matrix with the initial values to the sampler, and iter = 50k: iter=50000

init.val.dr <- rep(init.val,4) dim(init.val.dr) <- c(4,4) init.val.dr <- t(init.val.dr) settings = list(iterations = iter, message = TRUE,startValue = init.val.dr, consoleUpdates=100,burnin=1000) priorMCMC <- createPrior(density = priorTest,sampler = priorTestSampler)

bayesianSetup = createBayesianSetup(likelihood = likelihood, prior = priorMCMC,parallel = T) out <- runMCMC(bayesianSetup = bayesianSetup, sampler = "DREAMzs", settings = settings)

I noticed, that in this case I get a list of Nparam in the out$chain, and the length of each matrix (element of the list) is my (iter-burnin)/4, as I passed 4-row matrix. There are also two other elements in the output (out$X and out$Z), which I am not sure about. I would be really glad if you could clarify what those two are, and whether I can combine the 4 chains together to plot my marginal distributions (and if that's even allowed or not).

Also, apologies if this question is a little bit vague.

PS. Tried the adaptive non-tempered sampler on the simple test problem - -worked as a charm. Regards,

Anton.

AntonBiryukovUofC avatar Jul 30 '17 04:07 AntonBiryukovUofC

Hi Anton,

did you read sampler descriptions in the vignette? They also have references to the original publications, which will provide much more detailed info than I can provide here.

Nevertheless, the short answer is that DE / DREAM run several chains in parallel. For the zs version, the default is 3, but if you provide 4 initial values 4 chains will be created. The way we program this, the number of iterations will then be divided by 4 for each chain, so that the total number of iterations that the user specifies corresponds to the number of model evaluations.

The z matrix is the "memory" of the sampler that is used to create new proposals. You can also set it but if not provided it will be initialized from the prior.

florianhartig avatar Jul 30 '17 14:07 florianhartig

In general, I would start running the samplers with default settings. We have tried our best to make default settings sensible. Unless your likelihood is extremely different from the prior, default settings should work for all samplers.

florianhartig avatar Jul 30 '17 14:07 florianhartig

I have just implemented parallel calibration of a Fortran model using BayesianTools. I followed the instructions in the following link to compile the Fortran subroutine into an R package. An added bonus is that is now runs in 64-bit R.

The Need for Speed Part 1: Building an R Package with Fortran (or C)

Parallelisation works using the library(parallel) method where you create an R cluster and run the independent chains in separate sessions and then combine the results. Internal BT parallelisation also seems to work but doesn't use many cores for my model. It would be useful if BT provided some parallelisation diagnostics. Thanks!

woodwards avatar Apr 08 '19 00:04 woodwards

Huh, for some reason the library(parallel) method is no faster than not running it non-parallel, even thought it uses 48% of the CPU whereas non-parallel uses only 18%. My Fortran model is compiled in the BASGRA package.

n_cluster <- 3
bt_cluster <- makeCluster(n_cluster)
clusterEvalQ(bt_cluster, library(BayesianTools))
clusterEvalQ(bt_cluster, library(BASGRA))
clusterExport(bt_cluster, globals)
    bt_setup <- createBayesianSetup(likelihood=bt_likelihood, 
                                prior=bt_prior, 
                                parallel=FALSE,
                                parallelOptions=paropts,
                                names=bt_names)
    bt_settings <- list(iterations=30000, 
                    nrChains=1, # use external chains
                    startValue=3, 
                    burnin=nBurnin/nChains+nChains, # to give correct number of samples
                    parallel=FALSE, 
                    consoleUpdates=1000,
                    message=TRUE)
    bt_out <- parLapply(cl=bt_cluster,
                        X=1:n_cluster,
                        fun = function(X, bt_setup, bt_settings)
                          runMCMC(bt_setup,
                                  bt_settings,
                                  sampler="DREAMzs"),
                        bt_setup, bt_settings)
    bt_out <- createMcmcSamplerList(bt_out)

woodwards avatar Apr 09 '19 03:04 woodwards

Hi Simon,

many thanks for your hints and comments. I have created a new issue for working on the parallelisation. I realize it's a bit tricky sometimes, but in a sense the BT shares this property with most parallel implementations in R, most of the problems users have reported are general problems (such as exporting to the slaves) that would occur with other parallel functions in R as well. Nevertheless, I think it would indeed be useful to give a bit more feedback, and I think diagnostics are also a good idea (we have to think about how we can make this save / general).

About the speedup: there is a parallelisation overhead, and that might be considerable for fast models. Moreover, it depends on the OS, and your hardware. As we note in the help, for fast models, it's probably better to only parallelise the chains. This you have to do by hand (there is an example somewhere in the vignette or the help), or just call 3 instances of R.

Best, F

florianhartig avatar Apr 09 '19 09:04 florianhartig

Thank Florian

Yes I am putting an entire chain on each of 3 nodes (2 minutes run) but the total runtime is still 6 minutes. Each chain has 3 internal chains in DREAMzs. Possibly I have not set them up right and they are sharing some bottleneck resources somewhere.

S

Edit: I solved the problem! Turns it was actually working correctly, but I was timing it incorrectly, by adding the times on each node instead of taking the max!!! :,D

woodwards avatar Apr 09 '19 20:04 woodwards