Multilevel imputation does not accept a character or factor variable as the cluster variable; it must be an integer

Open isaactpetersen opened this issue 6 months ago • 1 comments

Multilevel imputation does not appear to accept a character or factor variable as the cluster variable; the cluster variable apparently must be an integer. Note that with 2l.pmm (from miceadds), I receive the same error as documented in the MICE discussion here, so the reproducible example below may explain why those users ran into the issue.

Here is a reprex (adapted from the MICE vignette here):

library("mice")
#> 
#> Attaching package: 'mice'
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following objects are masked from 'package:base':
#> 
#>     cbind, rbind
library("miceadds")
#> * miceadds 3.17-44 (2024-01-08 19:08:24)

# Data
con <- url("https://www.gerkovink.com/mimp/popular.RData")
load(con)

dataToImpute <- popNCR2

# Specify variables to impute
Y <- "popular"

# Imputation method
meth <- make.method(dataToImpute)
meth[1:length(meth)] <- ""

# Specify predictor matrix
pred <- make.predictorMatrix(dataToImpute)
pred[1:nrow(pred), 1:ncol(pred)] <- 0
pred[Y, "class"] <- (-2) # cluster variable
pred[Y, "extrav"] <- 1 # fixed effect predictor
diag(pred) <- 0

pred
#>          pupil class extrav sex texp popular popteach
#> pupil        0     0      0   0    0       0        0
#> class        0     0      0   0    0       0        0
#> extrav       0     0      0   0    0       0        0
#> sex          0     0      0   0    0       0        0
#> texp         0     0      0   0    0       0        0
#> popular      0    -2      1   0    0       0        0
#> popteach     0     0      0   0    0       0        0

# Character
dataToImpute$class <- as.character(dataToImpute$class)

meth[Y] <- "2l.norm"
imp1 <- mice(dataToImpute, pred = pred, meth = meth, maxit = 5, print = FALSE)
#> Error in mice.impute.2l.norm(y = c(6.3, 4.9, 5.3, 4.7, 4.5, 4.7, 5.9, : No class variable

meth[Y] <- "2l.pmm"
imp2 <- mice(dataToImpute, pred = pred, meth = meth, maxit = 5, print = FALSE)
#> Error in str2lang(x): <text>:1:24: unexpected ')'
#> 1: dv._lmer ~ 1+extrav+(1|)
#>                            ^

# Factor
dataToImpute$class <- as.factor(dataToImpute$class)

meth[Y] <- "2l.norm"
imp3 <- mice(dataToImpute, pred = pred, meth = meth, maxit = 5, print = FALSE)
#> Error in check.cluster(data, predictorMatrix): Convert cluster variable class to integer by as.integer()

meth[Y] <- "2l.pmm"
imp4 <- mice(dataToImpute, pred = pred, meth = meth, maxit = 5, print = FALSE)
#> Error in check.cluster(data, predictorMatrix): Convert cluster variable class to integer by as.integer()

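# Integer (as.integer() on a factor returns its level codes: valid cluster IDs,
# though not necessarily equal to the original class numbers)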
dataToImpute$class <- as.integer(dataToImpute$class)

meth[Y] <- "2l.norm"
imp5 <- mice(dataToImpute, pred = pred, meth = meth, maxit = 5, print = FALSE)

meth[Y] <- "2l.pmm"
imp6 <- mice(dataToImpute, pred = pred, meth = meth, maxit = 5, print = FALSE)

sessionInfo()
#> R version 4.3.1 (2023-06-16 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 11 x64 (build 22631)
#> 
#> Matrix products: default
#> 
#> 
#> Random number generation:
#>  RNG:     Mersenne-Twister 
#>  Normal:  Inversion 
#>  Sample:  Rounding 
#>  
#> locale:
#> [1] LC_COLLATE=English_United States.utf8 
#> [2] LC_CTYPE=English_United States.utf8   
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.utf8    
#> 
#> time zone: America/Chicago
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] miceadds_3.17-44 mice_3.16.0     
#> 
#> loaded via a namespace (and not attached):
#>  [1] utf8_1.2.4        generics_0.1.3    tidyr_1.3.1       shape_1.4.6.1    
#>  [5] lattice_0.22-6    lme4_1.1-35.5     digest_0.6.36     magrittr_2.0.3   
#>  [9] mitml_0.4-5       evaluate_0.24.0   grid_4.3.1        iterators_1.0.14 
#> [13] fastmap_1.2.0     foreach_1.5.2     jomo_2.7-6        glmnet_4.1-8     
#> [17] Matrix_1.6-5      nnet_7.3-19       backports_1.5.0   DBI_1.2.3        
#> [21] survival_3.7-0    purrr_1.0.2       fansi_1.0.6       codetools_0.2-20 
#> [25] cli_3.6.3         mitools_2.4       rlang_1.1.4       splines_4.3.1    
#> [29] reprex_2.1.1      withr_3.0.0       yaml_2.3.10       pan_1.9          
#> [33] tools_4.3.1       nloptr_2.1.1      minqa_1.2.7       dplyr_1.1.4      
#> [37] boot_1.3-30       broom_1.0.6       vctrs_0.6.5       R6_2.5.1         
#> [41] rpart_4.1.23      lifecycle_1.0.4   fs_1.6.4          MASS_7.3-60.0.1  
#> [45] pkgconfig_2.0.3   pillar_1.9.0      glue_1.7.0        Rcpp_1.0.13      
#> [49] xfun_0.46         tibble_3.2.1      tidyselect_1.2.1  rstudioapi_0.16.0
#> [53] knitr_1.48        htmltools_0.5.8.1 nlme_3.1-165      rmarkdown_2.27   
#> [57] compiler_4.3.1

Created on 2024-07-31 with reprex v2.1.1
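
Until character and factor cluster variables are handled, one possible workaround is to recode the cluster variable to integer codes before calling `mice()` and keep a lookup so the original labels can be restored afterwards. The following is only a minimal base-R sketch under stated assumptions: `to_integer_cluster()` is a hypothetical helper (it is not part of mice or miceadds), and the toy vector stands in for a column such as `dataToImpute$class`.

```r
# Hypothetical helper, not a mice/miceadds function:
# recode any cluster variable (character, factor, or numeric) to integer codes.
to_integer_cluster <- function(x) {
  f <- factor(x)            # also drops unused levels of an existing factor
  list(
    id     = as.integer(f), # integer codes 1..K, one per distinct cluster
    labels = levels(f)      # original labels, indexed by the integer code
  )
}

# Toy character cluster IDs standing in for the real cluster column
cluster_chr <- c("b", "a", "b", "c", "a")

conv <- to_integer_cluster(cluster_chr)
conv$id      # integer vector suitable as a cluster variable
conv$labels  # lookup table for the original labels

# Map the integer codes back to the original labels after imputation
restored <- conv$labels[conv$id]
stopifnot(identical(restored, cluster_chr))
```

In the reprex above, this would correspond to assigning `conv$id` to `dataToImpute$class` before running `mice()`. Going through `factor()` rather than calling `as.integer()` directly on a character vector avoids `NA`s when the cluster labels are not numeric strings.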

isaactpetersen, Jul 31 '24 16:07