mice
parlmice from within a function
It seems some strange things happen with environments when parlmice is wrapped in a function, e.g.
library(mice)
test.parlmice <- function() {
  dat <- nhanes
  parlmice(dat, cl.type = 'FORK', maxit = 5, n.core = 2, n.imp.core = 2)
}
Calling test.parlmice() results in:
Error in get(name, envir = envir) : object 'dat' not found
The environment is still broken, even after the parlmice call, e.g.
test.parlmice2 <- function(someVal = TRUE) {
  parlmice(nhanes, cl.type = 'FORK', maxit = 5, n.core = 2, n.imp.core = 2)
  print(someVal)
}
Result:
Error in get(name, envir = envir) : object 'someVal' not found
Session info:
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
locale:
[1] LC_CTYPE=en_NZ.UTF-8 LC_NUMERIC=C LC_TIME=en_NZ.UTF-8 LC_COLLATE=en_NZ.UTF-8 LC_MONETARY=en_NZ.UTF-8
[6] LC_MESSAGES=en_NZ.UTF-8 LC_PAPER=en_NZ.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mice_3.6.0 lattice_0.20-38
loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 rstudioapi_0.10 magrittr_1.5 splines_3.6.1 MASS_7.3-51.4 tidyselect_0.2.5 R6_2.4.0 rlang_0.4.0
[9] jomo_2.6-9 minqa_1.2.4 dplyr_0.8.3 tools_3.6.1 parallel_3.6.1 nnet_7.3-12 grid_3.6.1 mitml_0.3-7
[17] broom_0.5.2 nlme_3.1-141 pan_1.6 survival_2.44-1.1 yaml_2.2.0 lme4_1.1-21 assertthat_0.2.1 tibble_2.1.3
[25] crayon_1.3.4 Matrix_1.2-17 nloptr_1.2.1 purrr_0.3.2 tidyr_0.8.3 rpart_4.1-15 glue_1.3.1 compiler_3.6.1
[33] pillar_1.4.2 generics_0.0.2 backports_1.1.4 boot_1.3-23 pkgconfig_2.0.2
To solve this, I've had to modify the parlmice function:
envir <- environment()
envir %>% appendEnv(parent_envir)
cl <- parallel::makeCluster(n.core, type = cl.type)
parallel::clusterExport(cl,
                        varlist = names(ls(envir)),
                        envir = envir)
parallel::clusterExport(cl,
                        varlist = "do.call")
parallel::clusterEvalQ(cl, library(mice))
if (!is.na(cluster.seed)) {
  parallel::clusterSetRNGStream(cl, cluster.seed)
}
imps <- parallel::parLapply(cl = cl, X = 1:n.core, function(x) do.call(mice, as.list(args), envir = envir))
with
appendEnv <- function(e1, e2) {
  listE1 <- ls(e1)
  listE2 <- ls(e2)
  for (v in listE2) {
    if (v %in% listE1) warning(sprintf("Variable %s is in e1, too!", v))
    e1[[v]] <- e2[[v]]
  }
}
and parent_envir being my parent environment.
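The idea behind the workaround above, copying the caller's variables into the environment that is later exported, can be tried out without a cluster. The following is only a sketch in base R: `merge_env()`, `worker_env`, and `caller_env` are hypothetical names introduced for illustration, and `list2env()` is swapped in for the explicit loop.

```r
# Hypothetical helper (not part of mice): copy every variable from `from`
# into `into`, warning when a name already exists in `into`.
merge_env <- function(into, from) {
  clashes <- intersect(ls(into), ls(from))
  if (length(clashes) > 0) {
    warning("Overwriting in target: ", paste(clashes, collapse = ", "))
  }
  list2env(as.list(from), envir = into)
  invisible(into)
}

# Stand-ins for the function's own environment and the caller's environment.
worker_env <- new.env()
caller_env <- new.env()
assign("n.core", 2, envir = worker_env)
assign("dat", data.frame(x = c(1, NA, 3)), envir = caller_env)

merge_env(worker_env, caller_env)
print(ls(worker_env))
```

After the merge, every name listed from the caller can also be found in the environment that is handed to the export step.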
I'll look into it. Thanks.
All the best,
Gerko
I ran into this problem in 2021. Any advances in this direction?
I just ran into this problem as well.
@vwrobel I too have run into this problem. I am trying to call parlmice from within another function.
Just to clarify, when you say that parent_envir is your parent environment, are you referring to parent_envir <- parent.frame()?
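For reference, this is what `parent.frame()` returns when it is evaluated inside a function: the environment of the calling function. The thread does not confirm that this is what @vwrobel used, so treat the sketch as an illustration of that reading only; `callee()` and `caller()` are hypothetical names.

```r
# callee() inspects the environment of whichever function called it.
callee <- function() {
  parent_envir <- parent.frame()
  ls(parent_envir)
}

# caller() has two local variables, mirroring the names from the report.
caller <- function() {
  dat <- 1:3
  someVal <- TRUE
  callee()
}

visible_names <- caller()
print(visible_names)
```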
I ran into the same problem today under R 4.1.2 on Windows 10 and Ubuntu 20.04.3 LTS. I found a "hack" to get around the error reported by @sam-crawley: implement a wrapper function like this, which uses the same argument names as parlmice() itself. parlmice_wrapper() can then be called from any scope of an R script.
parlmice_wrapper <- function(data, m, cluster.seed, n.core, n.imp.core) {
  result <- parlmice(data = data, m = m, cluster.seed = cluster.seed,
                     n.core = n.core, n.imp.core = n.imp.core, maxit = 5)
  return(result)
}
So, in parlmice all arguments are passed correctly to the cluster, since they have the same names in the parent scope as in parlmice's own argument list. The problematic part in the current source is
# make computing cluster
cl <- parallel::makeCluster(n.core, type = cl.type)
parallel::clusterExport(cl,
  varlist = c(
    "data", "m", "seed", "cluster.seed",
    "n.core", "n.imp.core", "cl.type",
    ls(parent.frame())
  ),
  envir = environment()
)
parallel::clusterExport(cl,
  varlist = "do.call"
)
parallel::clusterEvalQ(cl, library(mice))
if (!is.na(cluster.seed)) {
  parallel::clusterSetRNGStream(cl, cluster.seed)
}
# generate imputations
imps <- parallel::parLapply(cl = cl, X = 1:n.core, function(x) do.call(mice, as.list(args), envir = environment()))
as suggested by @vwrobel. I think his solution could do the trick, so that you don't need a hack like mine.
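The mismatch in the snippet above (names listed from `parent.frame()`, but looked up starting from `environment()`) can be reproduced without a cluster or mice. This is a minimal sketch under stated assumptions: `export_lookup()` is a hypothetical stand-in for the export step, and it relies on `parallel::clusterExport()` resolving each name with `get(name, envir = envir)`, which matches the error message reported at the top of the thread.

```r
# Hypothetical stand-in for the export step: names come from the caller's
# frame, but each lookup starts in this function's own environment.
export_lookup <- function(data = NULL, m = NULL) {
  varlist <- ls(parent.frame())
  envir <- environment()
  vapply(varlist, function(name) {
    tryCatch({
      get(name, envir = envir)
      "found"
    }, error = function(e) conditionMessage(e))
  }, character(1))
}

# Caller with a local name that export_lookup() does not have: lookup fails.
failing_caller <- function() {
  dat <- 1:3
  export_lookup()
}

# Caller whose variable names match export_lookup()'s arguments: lookup works.
matching_caller <- function(data, m) {
  export_lookup(data = data, m = m)
}

failing <- failing_caller()
matching <- matching_caller(data = 1:3, m = 5)
print(failing)
print(matching)
```

The second caller mirrors the wrapper hack: it only succeeds because every name in its frame also exists in the callee's frame.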
We are going to retire parlmice() in favour of futuremice() available in mice 3.14.12.
Please reopen if this problem persists in futuremice().
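For anyone landing here, a sketch of the original use case (imputing from inside a function) rewritten for futuremice() might look like the following. It assumes a mice version that ships futuremice(), plus its parallel backend packages, and that the arguments `m`, `n.core`, and `parallelseed` behave as described in `?futuremice`; the wrapper name `impute_in_function()` and its defaults are hypothetical, so please check the help page before relying on it.

```r
# Hypothetical wrapper: run the imputation from inside a function, which is
# the situation that failed with parlmice() in this thread.
impute_in_function <- function(dat, n_imp = 4, n_cores = 2) {
  if (!requireNamespace("mice", quietly = TRUE)) {
    stop("package 'mice' is required for this sketch")
  }
  # maxit is passed on to mice() through '...'
  mice::futuremice(dat, m = n_imp, n.core = n_cores,
                   parallelseed = 123, maxit = 5)
}

# Example call (not run here, needs mice and its parallel dependencies):
# imp <- impute_in_function(mice::nhanes)
```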