miceforest
[Maybe Bug] IndexError: positional indexers are out-of-bounds
Hi, first of all, thanks for your contribution.
I hit the error below when running the following code; the imputation could not finish.
I'm curious whether this is caused by the dataset being too small.
import pandas as pd
import miceforest as mf

if __name__ == "__main__":
    # Tiny frame: col1 has only 4 observed values, col2 has 6.
    data = pd.DataFrame({"col1": [100, None, 200, None, 250, None, None, 200], "col2": [1, None, 3, 4, None, 6, 7, 8]})
    kernel = mf.ImputationKernel(data=data)
    kernel.mice(iterations=3)
    data_imputed = kernel.complete_data()
    print(data_imputed)
Here is my error:
Traceback (most recent call last):
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\pandas\core\indexing.py", line 1714, in _get_list_axis
return self.obj._take_with_is_copy(key, axis=axis)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\pandas\core\generic.py", line 4153, in _take_with_is_copy
result = self.take(indices=indices, axis=axis)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\pandas\core\generic.py", line 4133, in take
new_data = self._mgr.take(
^^^^^^^^^^^^^^^
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\pandas\core\internals\managers.py", line 891, in take
indexer = maybe_convert_indices(indexer, n, verify=verify)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\pandas\core\indexers\utils.py", line 282, in maybe_convert_indices
raise IndexError("indices are out-of-bounds")
IndexError: indices are out-of-bounds
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "c:\DATA\CodeField\DataCleaning\tests\misc\test_mice.py", line 17, in <module>
kernel.mice(iterations=3)
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\miceforest\imputation_kernel.py", line 1186, in mice
imputation_values = self._mean_match_mice(
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\miceforest\imputation_kernel.py", line 971, in _mean_match_mice
imputation_values = self._mean_match_nearest_neighbors(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\miceforest\imputation_kernel.py", line 621, in _mean_match_nearest_neighbors
imp_values = candidate_values.iloc[index_choice]
~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\pandas\core\indexing.py", line 1191, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\pandas\core\indexing.py", line 1743, in _getitem_axis
return self._get_list_axis(key, axis=axis)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\janze\scoop\apps\miniconda3\24.9.2-0\envs\data_cleaning\Lib\site-packages\pandas\core\indexing.py", line 1717, in _get_list_axis
raise IndexError("positional indexers are out-of-bounds") from err
IndexError: positional indexers are out-of-bounds
Here are my package versions:
numpy 2.2.1
miceforest 6.0.3
pandas 2.2.3
Thank you!
Sorry to chime in, but I would assume this comes from scipy's KDTree query method. Namely, its documentation says:
- i : integer or array of integers
  The index of each neighbor in self.data. i is the same shape as d. Missing neighbors are indicated with self.n.
I'm just adding this to the code:
index_choice = np.clip(index_choice, 0, candidate_values.shape[0] - 1)
Not a good solution, but a quick one haha.
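A minimal sketch of that explanation, not miceforest's actual code: assuming the nearest-neighbor lookup asks for more neighbors than there are candidates (col1 in the report has only four observed values), scipy's KDTree.query pads the result with the index self.n, which is one past the last valid position, and .iloc then fails. The array names and values here are hypothetical and only illustrate the mechanism.

```python
import numpy as np
import pandas as pd
from scipy.spatial import KDTree

# Hypothetical donor predictions and their observed values (3 candidates).
candidate_preds = np.array([[1.0], [2.0], [3.0]])
candidate_values = pd.Series([100.0, 200.0, 250.0])

tree = KDTree(candidate_preds)

# Request more neighbors (k=5) than the tree holds (n=3).
distances, indices = tree.query(np.array([[1.5]]), k=5)

# Missing neighbors are reported with index tree.n and infinite distance.
missing = indices == tree.n
print("indices:", indices)
print("missing slots:", int(missing.sum()))

# Positional indexing with the padded index is out of bounds.
try:
    candidate_values.iloc[indices[0]]
    raised = False
except IndexError as err:
    raised = True
    print("IndexError:", err)

# The np.clip workaround maps every padded slot onto the last candidate.
clipped = np.clip(indices, 0, candidate_values.shape[0] - 1)
print("clipped:", clipped)
```

Clipping silences the error, but every padded slot then points at the last candidate, which skews the random draw toward it. Dropping entries equal to tree.n, or capping k at the number of candidates, would probably be a cleaner fix.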
Can you guys post a reproducible example - not sure why this would happen.