example-models icon indicating copy to clipboard operation
example-models copied to clipboard

Soil model case study: divergences, poor mixing Rhat >> 1.

Open jonathan-g opened this issue 8 years ago • 3 comments

I noticed that the version of the soil model at the mc-stan Case Studies page shows some inconsistencies: For the measurement error model, the stan call and the output show 2 chains with 100 samples each (50 warmup, 50 post-warmup) but the output from print(fit_me) says that the sampling was done with 4 chains and 500 post-warmup samples each.

More seriously, the traceplots show very poor mixing, but print(fit_me) shows Rhat <= 1.01, which is inconsistent with the traceplots.

So I cloned it from github and ran it, and indeed, I see lots of divergences in the pairs plot and most of the Rhat numbers for fit_me are >> 1 (many > 10).

If I increase adapt_delta to 0.95, decrease stepsize to 0.002, and allow 1000 warmup iterations before the 200 sampling iterations, the Rhat numbers reduce significantly (<= 1.4 for the first model and <= 1.1 for the second) but there are still over 30 divergent transitions in both models, and the traceplots don't look like they're mixing very well.

The pairs plot for the first model looks like there is some serious funnel action happening, most obviously between k1 and alpha21, but also between other pairs of parameters.

Maybe this is well known to you. I understand that the example-models repository is a work in progress, so please don't think I'm complaining. Just wanting to let you know in case you hadn't realized this.

> sessionInfo()
R version 3.2.4 (2016-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] SoilR_1.1-23  RUnit_0.4.31  deSolve_1.13  rstan_2.9.0-3 ggplot2_2.1.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.4       grid_3.2.4        plyr_1.8.3        gtable_0.2.0      stats4_3.2.4     
 [6] scales_0.4.0      tools_3.2.4       munsell_0.4.3     rsconnect_0.4.2.1 inline_0.3.14    
[11] colorspace_1.2-6  gridExtra_2.2.1  

jonathan-g avatar Apr 16 '16 06:04 jonathan-g

Thanks for pointing it out. I can try it again. I don't remember having trouble fitting it before.

I don't know how I managed to get inconsistencies in knitr output, but I'll check it out. Maybe something to do with caching?

These models should all fit much much better in Stan 2.10 when we bring in the CVODES ODE solvers. I plan to go and do a much more thorough job of these models and implement some of the standard multi-compartment models.

  • Bob

On Apr 16, 2016, at 2:06 AM, Jonathan Gilligan [email protected] wrote:

I noticed that the version of the soil model at the mc-stan Case Studies page shows some inconsistencies: For the measurement error model, the stan call and the output show 2 chains with 100 samples each (50 warmup, 50 post-warmup) but the output from print(fit_me) says that the sampling was done with 4 chains and 500 post-warmup samples each.

More seriously, the traceplots show very poor mixing, but print(fit_me) shows Rhat <= 1.01, which is inconsistent with the traceplots.

So I cloned it from github and ran it, and indeed, I see lots of divergences in the pairs plot and most of the Rhat numbers for fit_me are >> 1 (many > 10).

If I increase adapt_delta to 0.95, decrease stepsize to 0.002, and allow 1000 warmup iterations before the 200 sampling iterations, the Rhat numbers reduce significantly (<= 1.4 for the first model and <= 1.1 for the second) but there are still over 30 divergent transitions in both models, and the traceplots don't look like they're mixing very well.

The pairs plot for the first model looks like there is some serious funnel action happening, most obviously between k1 and alpha21, but also between other pairs of parameters.

Maybe this is well known to you. I understand that the example-models repository is a work in progress, so please don't think I'm complaining. Just wanting to let you know in case you hadn't realized this.

sessionInfo() R version 3.2.4 (2016-03-10) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8 x64 (build 9200)

locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages: [1] parallel stats graphics grDevices utils datasets methods base

other attached packages: [1] SoilR_1.1-23 RUnit_0.4.31 deSolve_1.13 rstan_2.9.0-3 ggplot2_2.1.0

loaded via a namespace (and not attached): [1] Rcpp_0.12.4 grid_3.2.4 plyr_1.8.3 gtable_0.2.0 stats4_3.2.4
[6] scales_0.4.0 tools_3.2.4 munsell_0.4.3 rsconnect_0.4.2.1 inline_0.3.14
[11] colorspace_1.2-6 gridExtra_2.2.1

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub

bob-carpenter avatar Apr 16 '16 17:04 bob-carpenter

I'm thinking that the inconsistency is almost certainly a caching issue. Always a good thing to do the final run of a knitr document either setting cache=FALSE for all the chunks or else after deleting the cache folder.

Knitr does not automatically determine dependencies between chunks, so to get the cache mechanism to work, you need to manually specify dependencies in the chunk options.

jonathan-g avatar Apr 16 '16 18:04 jonathan-g

BTW, I really appreciate these case studies. The reason I caught this thing is that I'm reading through them very carefully to try to learn best practices. Your baseball example is super-helpful about how to recognize and deal with problems of divergence and regions of high curvature in parameter space.

jonathan-g avatar Apr 16 '16 18:04 jonathan-g