
Request `step_impute_random()`

Open EmilHvitfeldt opened this issue 4 years ago • 1 comments

A recipe step that imputes missing values by drawing at random from the non-missing values of the same variable. The way I see it, this sits on the opposite side of the bias/variance tradeoff from `step_impute_mean()`.

Bonus: this would work on all types of variables, not just numeric.
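
To make the idea concrete, here is a minimal base R sketch of the imputation itself. `impute_random()` is a hypothetical helper written only for illustration; it is not an existing `recipes` function and says nothing about how the step would actually be implemented.

```r
# Hypothetical helper, not part of recipes: fill NAs with random draws
# from the observed (non-missing) values of the same vector.
impute_random <- function(x) {
  missing <- is.na(x)
  observed <- x[!missing]
  # Nothing to do if there are no NAs or no observed values to draw from.
  if (!any(missing) || length(observed) == 0) {
    return(x)
  }
  # Sample by index so a length-one `observed` is not treated as 1:n.
  idx <- sample.int(length(observed), size = sum(missing), replace = TRUE)
  x[missing] <- observed[idx]
  x
}

set.seed(123)
num <- c(1.5, NA, 3.2, NA, 7.0)
fct <- factor(c("a", NA, "b", "b", NA))

impute_random(num)  # numeric: NAs replaced by observed values
impute_random(fct)  # factor: same code path, levels are kept
```

Because the draws come from the observed values rather than from a summary statistic, the same function handles numeric and factor columns without any type-specific logic.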

EmilHvitfeldt avatar Jul 08 '21 18:07 EmilHvitfeldt

I realize this step needs some care when it is applied to the testing data set: it has to draw from the distribution of values seen in the training data set, so those values must be retained, and for continuous data they can have high cardinality.
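
A rough sketch of that train/apply split, in base R only. The function names `fit_random_impute()` and `apply_random_impute()` and the `max_pool` argument are assumptions made up for this example. Capping the stored pool with a random subsample is just one possible way to limit what is kept for high-cardinality columns, not a decided design.

```r
# Hypothetical "prep" stage: keep the non-missing training values per column.
# `max_pool` is an assumed cap on how many values are stored per column.
fit_random_impute <- function(training, max_pool = 1000) {
  lapply(training, function(col) {
    pool <- col[!is.na(col)]
    if (length(pool) > max_pool) {
      # Keep a random subsample so the stored pool stays small.
      pool <- pool[sample.int(length(pool), max_pool)]
    }
    pool
  })
}

# Hypothetical "bake" stage: fill NAs in new data from the training pools,
# never from the new data itself.
apply_random_impute <- function(pools, new_data) {
  for (nm in intersect(names(pools), names(new_data))) {
    missing <- is.na(new_data[[nm]])
    pool <- pools[[nm]]
    if (any(missing) && length(pool) > 0) {
      idx <- sample.int(length(pool), sum(missing), replace = TRUE)
      new_data[[nm]][missing] <- pool[idx]
    }
  }
  new_data
}

set.seed(42)
train <- data.frame(x = c(1, 2, NA, 4), g = factor(c("a", "b", NA, "a")))
test  <- data.frame(x = c(NA, 10), g = factor(c(NA, "b"), levels = c("a", "b")))

pools <- fit_random_impute(train)
apply_random_impute(pools, test)
```

Here the imputed `x` in `test` can only be 1, 2, or 4, because the pool comes from `train`; the observed test value 10 is left untouched and is never used as a donor.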

EmilHvitfeldt avatar Jul 09 '21 07:07 EmilHvitfeldt