Longer elapsed time with parlmice compared to mice?

Open narmerguy opened this issue 4 years ago • 2 comments

I am running mice on a data frame with about 11,000 rows and 42 columns, about 12 of which have missing values in up to 10% of the observations. Across multiple tests, parlmice() with 5 cores consistently takes longer than plain mice(), and it is not clear to me why. Is this known behavior? I understand that parallelization can slow down the overall computation when the job is small and computation time is low, but in this case, even with cl.type = 'FORK', the speed is about half when I use multiple cores.

narmerguy avatar Dec 12 '19 07:12 narmerguy

I am getting the same problem on Windows 10 with R 3.5.x, 3.6.x and later. parlmice is 3 times slower or more than the single-core mice function. Watching the parlmice processes (Rscript.exe) in the Windows Task Manager, I saw their working memory increase one after another rather than in parallel.

norihikorihiko avatar Apr 16 '20 09:04 norihikorihiko

I cannot reproduce on Unix with 7 logical cores. I am working on a new version of parlmice. Identifying this behaviour would be useful, but I'd need a reprex in order to dive into the why of your experiences.

@narmerguy Are you on Windows? Cluster type FORK on Windows is merely a stub.
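
A reprex along the following lines would make the timings comparable. This is only a sketch: the data are simulated rather than the reporter's, the object names (`dat`, `n_core`, `cl_type`) are placeholders, and the sizes are kept small so it runs quickly. It picks PSOCK on Windows, where FORK is not usable, and FORK elsewhere.

```r
library(mice)

# Simulated stand-in data (an assumption, not the data from this issue).
set.seed(123)
n <- 2000
p <- 10
dat <- as.data.frame(matrix(rnorm(n * p), nrow = n))
# Make about 10% of the values in the first three columns missing.
for (j in 1:3) dat[sample(n, n / 10), j] <- NA

# Never ask for more cores than the machine reports.
n_core <- min(2, parallel::detectCores())
# FORK clusters are not available on Windows, so fall back to PSOCK there.
cl_type <- if (.Platform$OS.type == "windows") "PSOCK" else "FORK"

# Serial run.
t_serial <- system.time(
  imp_serial <- mice(dat, m = 4, printFlag = FALSE, seed = 1)
)

# Parallel run: n.core workers, each producing n.imp.core imputations.
t_parallel <- system.time(
  imp_parallel <- parlmice(dat, n.core = n_core, n.imp.core = 2,
                           cl.type = cl_type, cluster.seed = 1)
)

print(rbind(serial = t_serial, parallel = t_parallel))
sessionInfo()
```

Posting the printed timings together with the `sessionInfo()` output shows the operating system, R version and mice version in one place.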

About the expected behaviour: in many situations where only a few imputed data sets are generated (i.e. m < 25), the overhead of setting up the parallel computing cluster outweighs the time saved by running in parallel. See the parlmice vignette.
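
To get a feel for that overhead, the sketch below uses only base R's parallel package and does not involve mice at all. It times how long it takes merely to start and stop a two-worker PSOCK cluster and sets that next to a small serial task. The task itself is an arbitrary placeholder, and the numbers will differ by machine.

```r
library(parallel)

# Cost of starting and stopping a two-worker PSOCK cluster, with no work done.
t_cluster <- system.time({
  cl <- makeCluster(2, type = "PSOCK")
  stopCluster(cl)
})

# A small placeholder task, run serially.
t_task <- system.time(
  res <- lapply(1:5, function(i) mean(rnorm(1e4)))
)

cat("cluster start/stop (s):", t_cluster[["elapsed"]], "\n")
cat("small serial task (s): ", t_task[["elapsed"]], "\n")
```

If the cluster start-up time is of the same order as the whole serial job, running that job in parallel cannot be expected to pay off.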

All the best,

Gerko

gerkovink avatar May 04 '20 19:05 gerkovink

We are going to retire parlmice() in favour of futuremice(), which is available in mice 3.14.12.
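
A minimal sketch of the switch, assuming a mice version that provides futuremice(), that the future and furrr packages it relies on are installed, and that the machine has at least two cores. The nhanes data set ships with mice; the seed value is arbitrary.

```r
library(mice)

# Four imputations spread over two workers; parallelseed makes the run reproducible.
imp <- futuremice(nhanes, m = 4, n.core = 2, parallelseed = 123)

# The result is a regular mids object, so the usual tools apply.
print(class(imp))
head(complete(imp, 1))
```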

stefvanbuuren avatar Nov 14 '22 15:11 stefvanbuuren