mice
ampute.discrete failing when input data set contains character/categorical variables
Describe the bug
This is not a bug, but I feel it is quite an important enhancement.
After some fairly deep debugging, this is what I found.
In the ampute() function, the following line of code takes care of converting the variables to numeric:
data <- as.data.frame(sapply(data, as.numeric))
If a variable is a character (e.g., treatment arm), its values are converted to NA; a factor is converted to its integer level codes.
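To illustrate the conversion described above, here is a minimal base R sketch. The data frame and its column names (age, arm) are made up for the example and are not taken from my real data.

```r
# Toy data: one numeric column and one character column (hypothetical names).
dat <- data.frame(
  age = c(31, 45, 52),
  arm = c("placebo", "active", "placebo"),
  stringsAsFactors = FALSE
)

# Same conversion as the line quoted above; as.numeric() warns about coercion,
# so the warning is suppressed here to keep the output clean.
converted <- suppressWarnings(as.data.frame(sapply(dat, as.numeric)))

# The character column is now entirely NA.
all(is.na(converted$arm))

# A factor is mapped to its integer level codes rather than to NA.
as.numeric(factor(c("placebo", "active", "placebo")))
```

Either way, the converted column no longer carries the information of the original variable.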
While using ampute.discrete (with cont = FALSE, as I wanted to go over the odds parameter), the line
scores <- apply(candidates, 1, function(x) weights[i, ] %*% x)
generates a vector of NAs, even when the weight used for this variable is 0. The following screenshot shows the resulting scores output (length 2, as I had two missing-data patterns).
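The reason a zero weight does not help is that 0 * NA is NA in R. A small sketch with made-up weights and candidates (these objects only mimic the shapes used inside the function; they are assumptions, not the actual internals):

```r
# Hypothetical weights for two patterns over two variables;
# the second variable (the NA column) has weight 0 in both patterns.
weights <- matrix(c(1, 0,
                    1, 0), nrow = 2, byrow = TRUE)

# Candidate rows where the second variable became NA after conversion.
candidates <- data.frame(x1 = c(0.5, 1.2, -0.3), x2 = NA_real_)

i <- 1
scores <- apply(candidates, 1, function(x) weights[i, ] %*% x)

# 0 * NA is NA in R, so the zero weight does not mask the missing value.
all(is.na(scores))
```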
Then ampute.discrete throws the following error, which is hard to interpret unless you debug deep into the mice functions:
Error in if (scores[[i]][[1]] == 0) { : missing value where TRUE/FALSE needed
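The same kind of error can be triggered outside mice with a stand-in for the scores list. This is only a sketch: the list below is invented to mirror the all-NA output described above.

```r
# Stand-in for the per-pattern score list; every score is NA.
scores <- list(c(NA_real_, NA_real_), c(NA_real_, NA_real_))
i <- 1

# if () cannot branch on NA, so the comparison below raises an error;
# tryCatch() captures its message instead of stopping the script.
msg <- tryCatch(
  {
    if (scores[[i]][[1]] == 0) "zero" else "non-zero"
  },
  error = function(e) conditionMessage(e)
)
msg
```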
What about adding an assertion to make sure there are no character or factor variables in the input data set when using ampute.discrete?
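Something along these lines is what I have in mind. check_numeric_columns is a hypothetical helper written for this issue, not an existing mice function, and the error wording is only a suggestion.

```r
# Hypothetical helper, not part of mice: stop early when any column
# is not numeric, naming the offending columns in the error message.
check_numeric_columns <- function(data) {
  bad <- names(data)[!vapply(data, is.numeric, logical(1))]
  if (length(bad) > 0) {
    stop(
      "ampute requires numeric data; non-numeric column(s): ",
      paste(bad, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(data)
}

# Passes silently for an all-numeric data frame.
check_numeric_columns(data.frame(age = c(31, 45), bmi = c(22.1, 27.8)))
```

Note that is.numeric() is FALSE for factors, characters, and logicals, so all three would be rejected by this sketch.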
In ampute.continuous(), the condition
else if (length(unique(scores.temp)) == 1)
evaluates to TRUE because all scores are NA, as shown in the screenshot above. This gives the following warning, but the function does not throw an error:
warning(paste("The weighted sum scores of all candidates in pattern", i, "are the same, they will be amputed with probability", prop), call. = FALSE)
probs <- prop
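The condition is TRUE for an all-NA vector because unique() collapses the NAs into a single value. A short sketch, where scores.temp is a made-up stand-in for the real object:

```r
# Stand-in for the weighted sum scores of one pattern; all values are NA.
scores.temp <- c(NA_real_, NA_real_, NA_real_)

# unique() keeps a single NA, so the "all scores are the same" branch is taken.
length(unique(scores.temp)) == 1

# An explicit missing-value check would tell the two situations apart.
anyNA(scores.temp)
```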
My suggestion would be to prevent the user from passing non-numeric variables to the mice::ampute function.
@stefvanbuuren
Thank you for working on this amazing package!