Environment variable `state` breaks mice()

Open • mculbert opened this issue 1 year ago • 1 comment

A somewhat obscure error. If:

  1. The data frame passed to mice() contains a column named state that is either (a) a character vector (whose values may differ) or (b) a single repeated value of any type, AND
  2. A variable named state is visible in the environment (either in the global environment or via an attached data frame),

then mice() throws the error:

Error in s$it : $ operator is invalid for atomic vectors

Examples:

library(mice) # version 3.15.0
mynhanes <- mice::nhanes
state <- "zen"

mynhanes$state <- rnorm(25)
imp <- mice(mynhanes)  # No error

mynhanes$state <- sample(c("WA", "OR", "CA"), 25, replace = TRUE)
imp <- mice(mynhanes)  # Error

mynhanes$state <- 3.1415
imp <- mice(mynhanes)  # Error

rm(state)
imp <- mice(mynhanes)  # No error (warning about logged events)

attach(mynhanes)  # makes the column `state` visible on the search path
imp <- mice(mynhanes)  # Error

The error is raised here: https://github.com/amices/mice/blob/3e3e3ca0fa53f1b90fb7142bedf36375d5282e90/R/internal.R#L107. The call to ma_exists("state", ...) on line 100 or 103 apparently picks up the wrong variable, because it does some kind of iterated search through parent environments here: https://github.com/amices/mice/blob/3e3e3ca0fa53f1b90fb7142bedf36375d5282e90/R/internal.R#L140
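To illustrate the suspected mechanism without touching mice itself, here is a minimal base-R sketch. The helpers `lookup()` and `sampler_like()` are hypothetical stand-ins, not the actual mice source; the only assumption is that the real lookup inherits through enclosing environments the way `exists()` and `get()` do by default.

```r
# Hypothetical helper: look up `name` starting from the caller's frame.
# With inherits = TRUE, R keeps searching the enclosing environments
# (and eventually the search path) when the name is not found locally.
lookup <- function(name, env = parent.frame()) {
  if (exists(name, envir = env, inherits = TRUE)) {
    get(name, envir = env, inherits = TRUE)
  } else {
    NULL
  }
}

# Hypothetical stand-in for an internal function that expects `state`
# to be a list with an element `it`.
sampler_like <- function() {
  # No local `state` exists here, so the lookup falls through to
  # whatever `state` the user happens to have defined.
  lookup("state")
}

state <- "zen"          # user variable, as in the report
s <- sampler_like()     # picks up the user's character vector

# `$` on an atomic vector fails; capture the message instead of stopping.
msg <- tryCatch(s$it, error = function(e) conditionMessage(e))
print(msg)
```

If the search is restricted with `inherits = FALSE`, the same lookup no longer sees the user's variable, which is one reason an explicit, scoped lookup may be safer.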

The intended state variable (wherever it comes from) should perhaps be encapsulated more explicitly in a mice-specific data structure, rather than found through an open search of the environment. But, as I'm not familiar with mice()'s innards, I'm not sure what the best fix would be. Maybe it's as simple as renaming state to something less generic, like mice_internal_state_, so that a conflict with user variable names is less likely.
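As a rough sketch of what such encapsulation could look like (an assumption on my part, not how mice is actually structured), the state could live in a dedicated environment that is queried with `inherits = FALSE`. The names `.sketch_store`, `set_internal_state()` and `get_internal_state()` are hypothetical, and the list fields are only illustrative.

```r
# Hypothetical private store; its parent is the empty environment,
# so nothing outside it can leak into a lookup.
.sketch_store <- new.env(parent = emptyenv())

set_internal_state <- function(value) {
  assign("state", value, envir = .sketch_store)
}

get_internal_state <- function() {
  # inherits = FALSE restricts the lookup to the private store only.
  if (exists("state", envir = .sketch_store, inherits = FALSE)) {
    get("state", envir = .sketch_store, inherits = FALSE)
  } else {
    NULL
  }
}

state <- "zen"                             # user variable with the same name
set_internal_state(list(it = 1, im = 1))   # illustrative fields
s_internal <- get_internal_state()
print(s_internal$it)                       # reads the list, not the user's `state`
```

With this layout the user's `state` and the internal one never share a lookup path, so renaming would not be needed.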

mculbert • Dec 16 '22 18:12