ergm
Have `ergm()` and others automatically query the number of CPUs/cores on the system?
Say, if `control.ergm(parallel=TRUE)` (or perhaps even `options(ergm.parallel=TRUE)`), `ergm()` and others will run `getOption("Ncpus")` and/or `getOption("mc.cores")` and/or `detectCores()` to autodetect the number of CPUs/cores.
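The lookup order proposed above could be sketched roughly as follows. This is only an illustration: `resolve_ncores()` is a hypothetical helper, not an existing ergm function, and the fallback to a single core when nothing usable is found is an assumption.

```r
# Hypothetical helper (not part of ergm): try each source in the proposed
# order and return the first usable value.
resolve_ncores <- function() {
  candidates <- list(
    getOption("Ncpus"),       # NULL when the option is unset
    getOption("mc.cores"),    # NULL when the option is unset
    parallel::detectCores()   # can be NA when the count is unknown
  )
  for (n in candidates) {
    if (is.numeric(n) && length(n) == 1L && !is.na(n) && n >= 1) {
      return(as.integer(n))
    }
  }
  1L  # assumed fallback: nothing usable found, so use a single core
}
```

With `options(mc.cores = 3L)` set and `Ncpus` unset, the helper returns 3 without consulting `detectCores()`.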
Seems reasonable to me. I'm not sure what the best default behavior would be. Having `parallel=TRUE` start an estimation on all available cores is rather unsafe (we have no idea what the other cores are doing). It seems the default behavior of packages relying on the parallel package is to use just 2 cores. The `?options` help page says:
Options set in package parallel
These will be set when package parallel (or its namespace) is loaded if not already set.
`mc.cores`: a integer giving the maximum allowed number of additional R processes allowed to be run in parallel to the current R process. Defaults to the setting of the environment variable `MC_CORES` if set. Most applications which use this assume a limit of 2 if it is unset.
Also note that these are additional R processes, so I guess the sensible maximum is `detectCores() - 1` (not to mention one core should perhaps be left for the OS...).
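As a rough sketch of that cap, a requested number of additional R processes could be limited to one fewer than the detected core count. `cap_workers()` and its `detected` argument are hypothetical names used only for illustration. Treating an unknown core count (`NA`) as "no additional processes" is an assumption.

```r
# Hypothetical helper: limit the number of *additional* R processes to
# detectCores() - 1, leaving one core for the current R process.
cap_workers <- function(requested, detected = parallel::detectCores()) {
  if (is.na(detected)) {
    return(0L)  # assumed: unknown core count means no extra processes
  }
  limit <- max(0L, as.integer(detected) - 1L)
  min(as.integer(requested), limit)
}
```

For example, `cap_workers(8, detected = 4)` gives 3, while a request of 2 on the same machine is left at 2.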
After some poking around, `detectCores()` is dangerous on clusters, because it'll report the total number of cores on the machine, not the number allocated to the job. My current thinking is that if `parallel=TRUE`, it should look up `mc.cores`, and if that is not set, issue a warning and continue without parallel processing.
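A minimal sketch of this behavior, assuming a hypothetical helper `workers_from_option()` (not ergm's actual code) that returns 0 to mean "run serially":

```r
# Hypothetical helper: when parallel = TRUE, rely only on the user-set
# 'mc.cores' option and never call detectCores().
workers_from_option <- function(parallel = TRUE) {
  if (!isTRUE(parallel)) {
    return(0L)  # parallelism not requested
  }
  n <- getOption("mc.cores")
  if (!is.numeric(n) || length(n) != 1L || is.na(n) || n < 1) {
    warning("parallel=TRUE but option 'mc.cores' is not set; running serially.")
    return(0L)
  }
  as.integer(n)
}
```

On a cluster, the job script would then be responsible for setting `options(mc.cores = ...)` (or the `MC_CORES` environment variable before the parallel package is loaded) to the number of cores actually allocated.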
Oh, good to know. I now recall that I had some issues with `detectCores()` when the job ran on more than one (multicore) computing node. I ended up dodging the problem by running more models, but only one per node...