Don't allow the same seed for multiple chains in `runMCMC`?
Based on a user question, we've seen that using different initial values but the same seed for every chain, in a model that is fully conjugate, results in identical samples (after a very short phase of alignment in the initial samples).
Perhaps in `runMCMC` we should not allow use of a single seed (or we should use that seed to generate multiple seeds to use in starting the chains).
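For illustration, here is a minimal base-R sketch of the effect, not NIMBLE code. It assumes a toy one-parameter conjugate model; `toy_chain`, the data `y`, and the seed-derivation step are all hypothetical. In this toy the conjugate draw ignores the current state entirely, so the chains match from the first iteration and there is no alignment phase at all. The last lines show one possible way to derive distinct per-chain seeds from a single seed.

```r
# Toy fully conjugate model: y[i] ~ N(mu, 1), mu ~ N(0, 10^2).
# The conjugate update draws mu straight from its posterior, so the
# current value of mu (and hence the initial value) never enters the draw.
y <- c(1.2, 0.7, 1.9, 1.4, 0.3)   # fixed illustrative data
post_prec <- 1 / 10^2 + length(y) / 1
post_mean <- (0 / 10^2 + sum(y) / 1) / post_prec

# Hypothetical stand-in for one chain; not NIMBLE code.
toy_chain <- function(init, niter, seed) {
  set.seed(seed)
  mu <- init                      # stored, but unused by the conjugate draw
  samples <- numeric(niter)
  for (i in seq_len(niter)) {
    mu <- rnorm(1, post_mean, sqrt(1 / post_prec))
    samples[i] <- mu
  }
  samples
}

inits <- c(-50, 0, 50)

# Same seed for every chain: identical samples despite different inits.
same_seed <- lapply(inits, toy_chain, niter = 100, seed = 123)

# One way (an assumption, not a proposal from this thread) to derive
# distinct per-chain seeds from a single master seed.
set.seed(123)
chain_seeds <- sample.int(.Machine$integer.max, length(inits))
derived <- mapply(toy_chain, inits, seed = chain_seeds,
                  MoreArgs = list(niter = 100), SIMPLIFY = FALSE)
```

Comparing `same_seed[[1]]` with `same_seed[[2]]` via `identical()` shows the duplication, while the chains in `derived` differ because each starts from its own seed.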
@paciorek @perrydv Of course I welcome discussion on this. But I feel somewhat strongly that the currently provided behaviour should remain available: provide `setSeed = NUMBER` to `runMCMC`, and the NUMBER provided is used as the starting seed for each chain. This is useful and important. I would not want to modify this such that `runMCMC` starts creating (in some opaque manner) its own random seeds for each chain. The current behaviour also supports users choosing a unique seed for each chain, via the usage `setSeed = c(NUM1, NUM2, ...)`.
A few alternatives I can imagine:
- Disallow providing a single number for the `setSeed` argument, and if a user truly wants the same starting seed for each chain, they would then have to specify `setSeed = rep(NUM, nchains)`, or
- In the usage of providing a single number for the `setSeed` argument, as `setSeed = NUMBER`, we then have `runMCMC` issue a Note that in the case of fully conjugate models, they'll get the same samples, or, even more sophisticated,
- In the usage of providing a single number for the `setSeed` argument, as `setSeed = NUMBER`, `runMCMC` inspects the `mcmc` object, and if all samplers are `conjugate`, then issues a Warning.
Thoughts?
Option 1 seems somewhat reasonable.
But I wonder if maybe what we should do is this: if a user provides a single number, that number is the seed for the first chain, and the other chains then use the random number sequence as it is, without setting a seed at all.
@paciorek I can understand the idea of seeding only the first chain from a single number and letting the other chains continue the random number sequence without setting a seed at all. The problem is the fundamental inconsistency this would introduce with the `inits` argument of `runMCMC`: if a single list of initial values is provided for `inits`, that list is used for each chain. If the behaviour of `inits` were made consistent with your suggestion, the single list would initialize the first chain only, and each subsequent chain would begin where the preceding chain ended. However I try to rationalize it, I don't like the idea of introducing this glaring inconsistency between the behaviour of the `inits` and `setSeed` arguments.
(Suggestion 1 again.) Maybe just disallow the use of a single number for `setSeed`, as `setSeed = NUMBER`? And if a user wants a seed to be set at the beginning of the first chain only (and not set thereafter), they can use:
set.seed(NUMBER)
runMCMC(...)
That said, I personally still think it's a useful case to provide a single numeric seed, which is used as the starting seed for each chain. Admittedly, this case can also be covered via `setSeed = rep(NUMBER, nchains)`. So maybe suggestion (1) is the way to go?
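To make the two seeding patterns concrete, here is a small base-R sketch. It assumes a hypothetical `run_chain` stand-in that only draws from R's generator; it is not `runMCMC` itself.

```r
# Hypothetical stand-in for one chain: draws niter values from R's generator.
run_chain <- function(niter) rnorm(niter)

# Seed once, before the first chain only; later chains continue the stream.
set.seed(1)
seeded_once <- lapply(1:3, function(i) run_chain(5))

# Reseed at the start of every chain (the single-number behaviour
# described in this thread).
reseeded <- lapply(1:3, function(i) {
  set.seed(1)
  run_chain(5)
})
```

In the first pattern only the first chain is tied to the seed and the later chains differ from it. In the second, every chain consumes the same random number stream.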
Good point -- I hadn't thought about that inconsistency. I think we might treat the seed differently from `inits`, since people often run code without setting a seed and just let the generator pick up where it left off. But I agree it feels a bit odd.
So I'm happy with the idea of requiring as many seeds as chains if the user is providing a numeric value.
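As a rough sketch of what such a check could look like (a hypothetical helper written for illustration, not the code in NIMBLE or in any PR), assuming numeric seeds are validated against `nchains` and non-numeric values are left alone:

```r
# Hypothetical helper, not part of NIMBLE: validates a numeric `setSeed`
# value against the number of chains, as in suggestion (1).
# Non-numeric values are passed through unchecked.
check_set_seed <- function(setSeed, nchains) {
  if (is.numeric(setSeed) && length(setSeed) != nchains) {
    stop("setSeed must supply one seed per chain: got ", length(setSeed),
         " value(s) for ", nchains, " chain(s).")
  }
  invisible(setSeed)
}

check_set_seed(c(1, 2, 3), nchains = 3)   # accepted: one seed per chain
check_set_seed(rep(1, 3), nchains = 3)    # accepted: same seed, stated explicitly

# A single number with several chains is rejected; capture the message.
result <- tryCatch(check_set_seed(1, nchains = 3),
                   error = function(e) conditionMessage(e))
```

With this shape, a user who really wants identical seeds has to say so via `rep(NUMBER, nchains)`, and a single number remains valid when `nchains` is 1.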
@paciorek Quick attempt at a fix is in PR #1495.
Let's see what testing turns up.