miceforest
impute_new_data doesn't work
I tried calling impute_new_data and complete_data() directly after running self.kds.mice on the old data:
k = self.kds.impute_new_data(
    new_data=newdata,
    iterations=self.num_iterations,
    random_state=self.random_state,
    copy_data=True,
)
temp = k.complete_data()
But temp still had null values.
I also tried a pipeline:
X_train = olddata[xcol]
y_train = olddata.drop(xcol, axis=1)
pipe_kernel = mf.ImputationKernel(X_train, datasets=1)
pipe = Pipeline([
    ('impute', pipe_kernel),
    ('scaler', StandardScaler()),
])
X_train_t = pipe.fit_transform(
    X_train,
    y_train,
    impute__iterations=self.num_iterations,
    impute__train_nonmissing=True
)
# Transform the test data as well
X_test = pipe.transform(newdata)
It reported IndexError: list index out of range
  File "C:\Users\l\anaconda3\envs\ML\lib\site-packages\sklearn\base.py", line 1151, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "C:\Users\l\anaconda3\envs\ML\lib\site-packages\sklearn\pipeline.py", line 464, in fit_transform
    Xt = self._fit(X, y, **fit_params_steps)
  File "C:\Users\l\anaconda3\envs\ML\lib\site-packages\sklearn\pipeline.py", line 370, in _fit
    X, fitted_transformer = fit_transform_one_cached(
  File "C:\Users\l\anaconda3\envs\ML\lib\site-packages\joblib\memory.py", line 353, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\l\anaconda3\envs\ML\lib\site-packages\sklearn\pipeline.py", line 952, in _fit_transform_one
    res = transformer.fit(X, y, **fit_params).transform(X)
  File "C:\Users\l\anaconda3\envs\ML\lib\site-packages\miceforest\ImputationKernel.py", line 1219, in transform
    new_dat = self.impute_new_data(X, datasets=[0])
  File "C:\Users\l\anaconda3\envs\ML\lib\site-packages\miceforest\ImputationKernel.py", line 1589, in impute_new_data
    name=f"ind {str(iter_pairs[0][1])}-{str(iter_pairs[-1][1])}",
IndexError: list index out of range
I'm sure the number of columns is the same in both newdata and olddata. I can't figure out what the problem is here. I would be grateful if anyone could answer my question.
In the end, I gave up on using self.kds and created a new instance of mf.ImputationKernel instead, which worked. But I still don't know why the second error appeared.
Got the same issue...
This might have been caused by not resetting the index on the data that was being imputed. There are assertions in major version 6 to keep these bugs from happening. If it keeps happening, please reopen this issue.
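For anyone landing here, a minimal sketch of what resetting the index looks like in pandas. The helper name with_default_index and the sample frame are made up for illustration, and whether this resolves the errors above is only the guess stated in the previous comment, not something confirmed in this thread.

```python
import numpy as np
import pandas as pd


def with_default_index(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a fresh 0..n-1 index (hypothetical helper)."""
    # drop=True discards the old index instead of adding it as a column.
    return df.reset_index(drop=True)


# Illustrative frame with a non-contiguous index, such as one left behind
# by row filtering or a train/test split.
newdata = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]},
    index=[10, 42, 7],
)

clean = with_default_index(newdata)
print(list(clean.index))  # [0, 1, 2]
```

The frame returned here is what would be passed as new_data to impute_new_data in the snippet from the original post. The original newdata is left unchanged, so the old index can still be restored afterwards if it carries meaning.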