cmdstanr
Use cmdstan for parallel chains
As recently flagged in #888, it can be confusing that we use different approaches for parallel chains (multiple processx calls) than for parallel pathfinders (cmdstan TBB).
Now that all MCMC methods in Cmd/Stan support parallel chains via TBB, we should update
Yes, good idea!
I just tried hacking this and there's quite a few big changes. I'm doc'ing them here because I think some changes in cmdstan and cmdstanr would make this a lot easier.
- Let cmdstan take comma-separated lists for file names of inits, outputs, and chain ids
- The current calling methods in cmdstan for multiple chains were meant to be shorthand for someone calling cmdstan from the command line themselves. But those shorthands turn into kind of weird logic when calling things from a higher level language. Letting the arguments be lists would mean we don't need to code around that logic
- cmdstan needs to allow generated quantities to use parallel chains. There's a PR for this in https://github.com/stan-dev/cmdstan/pull/1256
- Turning on threading by default is good for everyone but terrible for Windows users without WSL. On Windows machines there is a very big known performance bug with mingw's thread_local implementation in C++. Note this is true for any model that uses threads, i.e. reduce_sum and map_rect models as well. For Mac, WSL, and Linux we can have threads on by default, but for Windows we either need to stop using mingw, keep the ability to have multiple processes running a single chain each, or make Windows users take that performance penalty. I do wonder how many Windows users cannot get WSL on their computers.
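To make the first item in the list above concrete, here is a minimal Python sketch of the bookkeeping a wrapper has to do with a shorthand, compared with what explicit lists would allow. The `_<chain id>` suffix rule and the comma-separated `file=` form are assumptions for illustration only (the latter is the proposal, not an existing cmdstan feature), and both helper names are hypothetical.

```python
from __future__ import annotations

from pathlib import Path


def predicted_outputs(base: str, first_id: int, chains: int) -> list[str]:
    """Guess the per-chain files produced from a single base name.

    Assumes the shorthand inserts `_<chain id>` before the extension,
    so the wrapper has to mirror that rule to find its own results.
    """
    p = Path(base)
    return [
        str(p.with_name(f"{p.stem}_{first_id + i}{p.suffix}"))
        for i in range(chains)
    ]


def list_argument(paths: list[str]) -> str:
    """Hypothetical proposed form: the caller names every file itself."""
    return "file=" + ",".join(paths)
```

With explicit lists, the wrapper would choose the paths and pass them through unchanged, so it would not need to replicate any naming rule.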
With all that said, honestly I wonder whether doing this in cmdstanr is a good idea, or whether it would be better to work on what @WardBrian put together, which allows an FFI interface to Stan models directly from R and Python.
Thanks @SteveBronder. That does sound like perhaps it's more trouble than it's worth. Does CmdStanPy have some way around these problems? I think they switched over to using cmdstan for multiple chains, so I'm wondering how they got around these issues or if they just bit the bullet. @rok-cesnovar or @andrjohns, any thoughts on the Windows aspect of this (#3 in Steve's list)?
We support both options still. If the model was compiled with STAN_THREADS we opt in to using the built-in multichain, otherwise we use the older style of spawning multiple processes.
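A rough Python sketch of that dispatch, not CmdStanPy's actual code: the function name and the `compiled_with_threads` flag are hypothetical, and the argument spellings (`num_chains`, `num_threads`, `id`) reflect my reading of the cmdstan command line and should be checked against its documentation.

```python
from __future__ import annotations


def plan_invocations(
    model_exe: str, chains: int, compiled_with_threads: bool
) -> list[list[str]]:
    """Return one argument list per process that should be launched."""
    if compiled_with_threads:
        # One process; cmdstan runs all chains itself on its thread pool.
        return [[
            model_exe, "sample",
            f"num_chains={chains}",
            f"num_threads={chains}",
        ]]
    # Older style: one single-chain process per chain id.
    return [
        [model_exe, "sample", f"id={chain_id}"]
        for chain_id in range(1, chains + 1)
    ]
```

Keeping both branches means a model built without STAN_THREADS, for example on Windows with mingw, still runs its chains in parallel as separate processes.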
I am +1 on what cmdstanpy does.