
Conditional Inference for New Data

Open melondonkey opened this issue 3 years ago • 1 comment

Is there a way to use the trained models to do conditional inference on new observations, and to get the underlying probabilities rather than sampled data sets? For example, if I train on a binary matrix of diagnoses, could I then input a new patient's known conditions and get the probability that they have each of the other conditions?

The ability to do that in combination with the TF API would make this a very powerful "auto-complete" model.

melondonkey avatar May 06 '21 16:05 melondonkey

Thanks for raising this issue!

It is possible to recover the predicted probabilities (rather than labels) by setting cat_coalesce = FALSE and bin_label = FALSE in the complete() function. Since prediction uncertainty is handled by multiply imputing the data, the best strategy is then to average the predicted probabilities across the M completed datasets.
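A minimal sketch of that workflow, assuming a data frame `diagnoses` whose columns are all binary indicators (the data, the number of epochs, and M = 10 are placeholders chosen for illustration):

```r
library(rMIDAS)

# Declare every column as binary before training (placeholder settings).
bin_vars  <- colnames(diagnoses)
diag_conv <- convert(diagnoses, bin_cols = bin_vars)
diag_mod  <- train(diag_conv, training_epochs = 20)

# Request M completed datasets, keeping predicted probabilities for the
# imputed binary cells rather than sampled 0/1 labels.
m    <- 10
imps <- complete(diag_mod, m = m, bin_label = FALSE, cat_coalesce = FALSE)

# Average element-wise across the M completed datasets: imputed cells
# become average predicted probabilities, observed cells are unchanged.
prob_avg <- Reduce(`+`, lapply(imps, as.matrix)) / m
```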

We are actively looking into adding a new function to predict missing values for data not used in training, which would allow you to achieve the proposed workflow above.

tsrobinson avatar May 11 '21 12:05 tsrobinson