mice
Conditional PMM routine that excludes (a vector of) observed values from the donor pool
Might be interesting to include since it comes up as a request quite often.
What
mice.impute.pmm.exclude excludes a single observed value, or a vector of observed values, from the donor pool used for matching. These values are therefore never imputed, but they still have a role in the imputation.
Why
Sometimes users want to keep certain observed values out of the imputations without excluding them from the imputation procedure altogether. With mice.impute.pmm.exclude, these observed values can still serve as predictor values.
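To make the idea concrete, here is a minimal base R sketch of predictive mean matching with an exclusion list. It is not the code in this PR: pmm_exclude_sketch() and its arguments are hypothetical names, it uses a plain lm() fit instead of the Bayesian parameter draws that mice uses, and it assumes that x is complete and that excluded cases still contribute to the fitted model.

```r
# Simplified PMM with an exclusion list (illustrative only, not the mice code).
# pmm_exclude_sketch() and its arguments are hypothetical names.
pmm_exclude_sketch <- function(y, x, exclude = NULL, donors = 5L) {
  ry <- !is.na(y)
  dat <- data.frame(y = y, x)
  # Assumption: the model is fitted on all observed cases, excluded values
  # included, so those cases still inform the predictions.
  fit <- stats::lm(y ~ ., data = dat[ry, , drop = FALSE])
  yhat <- stats::predict(fit, newdata = dat)
  # The donor pool keeps only observed cases whose value is not excluded.
  pool <- which(ry & !(y %in% exclude))
  stopifnot(length(pool) > 0L)
  k <- min(donors, length(pool))
  vapply(which(!ry), function(i) {
    # Take the k donors whose predictions are closest to case i,
    # then draw one of them at random and return its observed value.
    nearest <- pool[order(abs(yhat[pool] - yhat[i]))[seq_len(k)]]
    y[nearest[sample.int(k, 1L)]]
  }, numeric(1))
}

# Usage on synthetic data: the values 10 and 12 may not be imputed.
set.seed(1)
x <- data.frame(x1 = rnorm(50))
y <- round(10 + 2 * x$x1 + rnorm(50))
y[1:10] <- NA
imp <- pmm_exclude_sketch(y, x, exclude = c(10, 12))
any(imp %in% c(10, 12))
```

Because only the donor pool is filtered, every imputed value is still an observed value of y, just never one of the excluded ones.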
Some tests
# to install this
# devtools::install_github(repo = "gerkovink/mice@pmm999")
library(mice)
# TEST 1
# impute without exclude
imp <- mice(nhanes,
seed = 123,
printFlag = FALSE)
A <- imp$imp$chl
# impute with exclude
meth <- make.method(nhanes)
meth["chl"] <- "pmm.exclude"
imp <- mice(nhanes, meth = meth, exclude = c(218, 187),
seed = 123,
printFlag = FALSE)
B <- imp$imp$chl
any(A == 187 | A == 218) # May be TRUE
#> [1] TRUE
any(B == 187 | B == 218) # Must be FALSE
#> [1] FALSE
# TEST 2 - copied from mice.impute.pmm
set.seed(53177)
xname <- c("age", "hgt", "wgt")
r <- stats::complete.cases(boys[, xname])
x <- boys[r, xname]
y <- boys[r, "tv"]
ry <- !is.na(y)
# Impute missing tv data with original pmm
set.seed(123); yimp.pmm <- mice.impute.pmm(y, ry, x)
set.seed(123); yimp <- mice.impute.pmm.exclude(y, ry, x)
identical(yimp, yimp.pmm) # should be TRUE
#> [1] TRUE
set.seed(123); yimp.pmm <- mice.impute.pmm(y, ry, x)
set.seed(123); yimp <- mice.impute.pmm.exclude(y, ry, x, exclude = c(20, 25))
identical(yimp, yimp.pmm) # should be FALSE
#> [1] FALSE
c(20, 25) %in% yimp # should be FALSE twice
#> [1] FALSE FALSE
Created on 2021-05-10 by the reprex package (v1.0.0)
R CMD check
── R CMD check results ────────────────────────────────── mice 3.13.7 ────
Duration: 3m 4.3s
0 errors ✓ | 0 warnings ✓ | 0 notes ✓
R CMD check succeeded
This is a useful addition. Two suggestions:
- Preferably implement this in the standard mice.impute.pmm(..., exclude = c(...)) function to avoid code duplication;
- In the likely case that you want different exclusions for different variables, use the blots parameter to pass down different exclude vectors.
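The second suggestion could look roughly like the sketch below. It assumes a mice version in which mice.impute.pmm() accepts an exclude argument, as proposed in the first suggestion; the excluded values are picked only for illustration. make.blots() creates one empty argument list per variable, and each list is passed down to that variable's imputation function.

```r
library(mice)

# Assumption: the installed mice version has an `exclude` argument in
# mice.impute.pmm(); with the pmm999 branch you would set the method to
# "pmm.exclude" for these variables instead.
meth <- make.method(nhanes)

# One (initially empty) list of extra arguments per variable.
blots <- make.blots(nhanes)
blots$chl$exclude <- c(187, 218)  # donors to exclude when imputing chl
blots$bmi$exclude <- 35.3         # a different exclusion for bmi

imp <- mice(nhanes, method = meth, blots = blots,
            seed = 123, printFlag = FALSE)

# Neither check should find an excluded value among the imputations.
any(unlist(imp$imp$chl) %in% c(187, 218))
any(unlist(imp$imp$bmi) %in% 35.3)
```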
Moving this over to another branch. Closed by #519.