MIDASpy
Impute new data using trained model.
Looking at the codebase, I could not find a function that uses the trained model to impute new data after training. There seem to be a couple of functions that could be used to do this indirectly, but I am surprised it is not included as a separate function.
Hi @mabdelhack, you can impute data after training the model using the .generate_samples() function, which saves m imputed datasets to the .output_list attribute.
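As a rough sketch of that call pattern: the helper below is hypothetical (the name `draw_imputations` is not part of MIDASpy) and assumes `imputer` is a `Midas` object that has already been built and trained. It only relies on `.generate_samples(m=...)` and the `.output_list` attribute mentioned above.

```python
def draw_imputations(imputer, m=10):
    """Hypothetical helper: draw m completed datasets from a trained imputer.

    Assumption: `imputer` is a MIDASpy `Midas` object on which the model
    has already been built and trained.
    """
    # generate_samples() stores the m completed datasets on the object itself.
    imputer.generate_samples(m=m)
    # Copy into a new list so a later sampling call does not change our result.
    return list(imputer.output_list)
```

Each element of the returned list is one completed copy of the data the model was trained on, so downstream analyses can be run once per element and then combined.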
If you are referring to entirely new data (i.e. a completely separate test dataset), we do not currently have this functionality. For the purposes of imputation, we prefer to use denoising and dropout as a means of regularization over conventional test-train splits.
We can consider this as an extension, and I'd be interested to know in what imputation circumstances this might be useful?
I am wondering about that too. It would be useful when applying cross-validation, as with IterativeImputer. I hope you will consider adding that function.
Thank you
Hi @muhammad92syahrul and @mabdelhack!
We still don't have a specific function for your purposes, but I wanted to flag the .change_imputation_target() method (found here in the source code). This method allows you to fit a model on $X$ as standard, change the imputation target to some new data $X'$, then sample completed datasets from $X'$ by calling .generate_samples() afterwards.
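A minimal sketch of that workflow, under a few assumptions: `imputer` is a `Midas` object already trained on $X$, `new_data` holds $X'$ in the same layout as $X$ (presumably the same columns), and `.change_imputation_target()` accepts the new data as its first argument. The helper name `impute_new_data` is hypothetical and not part of MIDASpy.

```python
def impute_new_data(imputer, new_data, m=10):
    """Hypothetical helper: sample m completed versions of new_data.

    Assumptions: `imputer` was already trained on the original data X,
    and `new_data` (X') has the same layout as X.
    """
    # Point the trained model at X' instead of the data it was fitted on.
    imputer.change_imputation_target(new_data)
    # Sample m completed datasets; they are stored on imputer.output_list.
    imputer.generate_samples(m=m)
    # Copy into a new list so a later sampling call does not change our result.
    return list(imputer.output_list)
```

For cross-validation, one would train on the training fold and pass each held-out fold as `new_data`. Note that the imputer keeps pointing at `new_data` afterwards, so switch the target back before sampling from the original data again.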
For cross-validation purposes, this does seem like a reasonable use case and I'll think more about supporting it more widely. But hopefully in the meantime this function may give you the functionality you require (it is, however, only very lightly tested).