
ampute.discrete failing when input data set contains character/categorical variables

Open imazubi opened this issue 6 months ago • 0 comments

Describe the bug

This is not a bug, but I feel it is quite an important enhancement.

After some fairly deep debugging, I found the following:

In the `ampute()` function, the line `data <- as.data.frame(sapply(data, as.numeric))` takes care of converting the variables to numeric. If a variable is a character (e.g. treatment arm), its values are converted to NA. A factor is not turned into NA but into its integer level codes, which is also unlikely to be what the user intended.
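The conversion can be checked in isolation. The snippet below is a minimal sketch on a made-up data frame (the column names `age` and `arm` are illustrative and not taken from my data); it applies the same `sapply(data, as.numeric)` call quoted above.

```r
# Hypothetical toy data; `arm` stands in for a treatment-arm variable.
data <- data.frame(
  age = c(34, 51, 47),
  arm = c("placebo", "active", "placebo"),
  stringsAsFactors = FALSE
)

# Same conversion as the line quoted from ampute().
# as.numeric() cannot parse the strings, so it warns and returns NA.
converted <- suppressWarnings(as.data.frame(sapply(data, as.numeric)))
print(converted)

# A factor does not become NA; it is replaced by its integer level codes.
codes <- as.numeric(factor(c("placebo", "active", "placebo")))
print(codes)
```

The warning is suppressed here only to keep the output short; in an interactive session R reports "NAs introduced by coercion".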

When using `ampute.discrete` (with `cont = FALSE`, as I wanted to use the `odds` parameter), the line `scores <- apply(candidates, 1, function(x) weights[i, ] %*% x)` generates a vector of NAs even when the weight for this variable is 0, because `0 * NA` is `NA` in R. The following screenshot shows the resulting `scores` output (of length 2, as I had two missing data patterns).

[Screenshot: `scores` output for the two patterns, all values NA]
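To see why a zero weight does not help, here is a small sketch with made-up numbers. The `weights` and `candidates` objects below are stand-ins for the ones inside mice, not copies of them. Since `0 * NA` evaluates to `NA`, the weighted sum is `NA` as soon as any entry of a row is missing.

```r
# Stand-in objects: one pattern (one row of weights), two variables.
weights <- matrix(c(1, 0), nrow = 1)               # second variable has weight 0
candidates <- matrix(c(34, 51, NA, NA), ncol = 2)  # second column is all NA
i <- 1

# Same expression as the line quoted from ampute.discrete.
scores <- apply(candidates, 1, function(x) weights[i, ] %*% x)
print(scores)
```

Both scores come out as NA even though the first column is fully observed and the second column has weight 0.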

`ampute.discrete` then throws the following error, which is hard to interpret unless you debug deep into the mice functions:

Error in if (scores[[i]][[1]] == 0) { : missing value where TRUE/FALSE needed
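The message itself comes from base R rather than from mice: `if` needs a single TRUE or FALSE, and comparing NA with 0 yields NA. A minimal sketch, with a hypothetical `scores` list standing in for the one I saw while debugging:

```r
# Hypothetical stand-in for the scores list (two patterns, all NA).
scores <- list(c(NA_real_, NA_real_), c(NA_real_, NA_real_))
i <- 1

# NA == 0 is NA, which if() cannot interpret as TRUE or FALSE.
result <- tryCatch(
  {
    if (scores[[i]][[1]] == 0) "zero" else "non-zero"
  },
  error = function(e) conditionMessage(e)
)
print(result)
```

The error is caught here only so that the snippet runs to the end; `result` then holds the error message text.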

What about adding an assertion to make sure there is no character or factor variable in the input data set when trying to use ampute.discrete?

With `ampute.continuous()`, the condition `else if (length(unique(scores.temp)) == 1)` inside the function is TRUE because all scores are NA, as shown in the screenshot above. This gives the following warning, but the function does not throw an error:

warning(paste("The weighted sum scores of all candidates in pattern", i, "are the same, they will be amputed with probability", prop), call. = FALSE)
        probs <- prop

My suggestion would be to prevent the user from passing non-numeric variables to the `mice::ampute` function.
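One possible shape for such a check is sketched below. The helper name `assert_numeric_columns` is hypothetical and not part of mice; it only illustrates the idea of failing early with a readable message instead of producing NA scores later on.

```r
# Hypothetical helper, not part of mice: stop early if any column is non-numeric.
assert_numeric_columns <- function(data) {
  data <- as.data.frame(data)
  # is.numeric() is FALSE for both character and factor columns.
  bad <- names(data)[!vapply(data, is.numeric, logical(1))]
  if (length(bad) > 0) {
    stop(
      "ampute() requires numeric variables; non-numeric column(s): ",
      paste(bad, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(data)
}

# Usage with made-up data: the character column `arm` triggers the error.
toy <- data.frame(age = c(34, 51, 47), arm = c("placebo", "active", "placebo"))
msg <- tryCatch(assert_numeric_columns(toy), error = function(e) conditionMessage(e))
print(msg)
```

Whether this should be an error or a warning, and whether it belongs in `ampute()` itself or only in the discrete path, is of course up to the maintainers.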

@stefvanbuuren

Thank you for working on this amazing package!

imazubi, Dec 28 '23 10:12