Ch2 page 58: drop() raises KeyError "['income_cat'] not found in axis"
pd.options.mode.chained_assignment = None
for set_ in (strat_train_set, strat_test_set):
    set_.drop("income_cat", axis=1, inplace=True)
I got the following error:
KeyError Traceback (most recent call last)
<ipython-input-87-1e1581743ea6> in <module>
1 pd.options.mode.chained_assignment = None
2 for set_ in (strat_train_set, strat_test_set):
----> 3 set_.drop("income_cat", axis=1, inplace=True)
~\Anaconda3\lib\site-packages\pandas\core\frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
3695 index=index, columns=columns,
3696 level=level, inplace=inplace,
-> 3697 errors=errors)
3698
3699 @rewrite_axis_style_signature('mapper', [('copy', True),
~\Anaconda3\lib\site-packages\pandas\core\generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
3109 for axis, labels in axes.items():
3110 if labels is not None:
-> 3111 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
3112
3113 if inplace:
~\Anaconda3\lib\site-packages\pandas\core\generic.py in _drop_axis(self, labels, axis, level, errors)
3141 new_axis = axis.drop(labels, level=level, errors=errors)
3142 else:
-> 3143 new_axis = axis.drop(labels, errors=errors)
3144 result = self.reindex(**{axis_name: new_axis})
3145
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in drop(self, labels, errors)
4402 if errors != 'ignore':
4403 raise KeyError(
-> 4404 '{} not found in axis'.format(labels[mask]))
4405 indexer = indexer[~mask]
4406 return self.delete(indexer)
KeyError: "['income_cat'] not found in axis"
I checked both sets with the head() method, and the income_cat column is present in each of them, but I still get the error. Please help!
Hi, thanks for your question.
Did you run the cell that creates the income_cat column?
housing["income_cat"] = pd.cut(housing["median_income"],
                               bins=[0., 1.5, 3.0, 4.5, 6., np.inf],
                               labels=[1, 2, 3, 4, 5])
Hi @biswatig ,
Any feedback regarding this issue? Perhaps there's a space in the column name? Do you have the latest pandas version? Could you also check that strat_train_set and strat_test_set are not actually the same DataFrame? That would explain why you see the column in "both" DataFrames: the loop would remove the income_cat column on its first iteration, then fail on the second iteration because the column has already been deleted.
If nothing works, I recommend ensuring that you have the latest versions of pandas, scikit-learn, numpy and other libraries, reverting to the exact version of the notebook in the project, without any modifications, then restarting the Jupyter kernel and running the notebook cell by cell, sequentially. Hopefully this will work.
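The checks suggested above (stray whitespace in the column name, the pandas version, and whether both variables point to the same DataFrame) could be sketched roughly as follows. This is only an illustration: the two small DataFrames are made-up stand-ins for strat_train_set and strat_test_set, and diagnose is a hypothetical helper, not something from the notebook.

```python
import pandas as pd

# Made-up stand-ins for the notebook's strat_train_set / strat_test_set.
strat_train_set = pd.DataFrame({"median_income": [1.2, 3.4], "income_cat": [1, 3]})
strat_test_set = pd.DataFrame({"median_income": [5.6], "income_cat": [4]})


def diagnose(train, test, column="income_cat"):
    """Hypothetical helper: gather facts relevant to the KeyError."""
    return {
        "pandas_version": pd.__version__,
        # True would mean the loop drops the column from one object twice.
        "same_object": train is test,
        # repr() makes stray spaces visible, e.g. "'income_cat '".
        "train_columns": [repr(c) for c in train.columns],
        "test_columns": [repr(c) for c in test.columns],
        "in_train": column in train.columns,
        "in_test": column in test.columns,
    }


report = diagnose(strat_train_set, strat_test_set)
for key, value in report.items():
    print(key, "->", value)
```

If same_object is True, or if either in_train or in_test is False, that would point to where the drop fails.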
I got the same error because I ran the cell twice. It seems the column was deleted on the first run, so running the cell again raises the KeyError.
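One way to make the cell safe to re-run is to pass errors="ignore" to drop, so a missing column is skipped instead of raising. This is a minimal sketch on made-up data, not the notebook's own code; the inner loop mirrors the original cell, and the outer loop only simulates running it twice.

```python
import pandas as pd

# Made-up stand-ins for the notebook's strat_train_set / strat_test_set.
strat_train_set = pd.DataFrame({"median_income": [1.2, 3.4], "income_cat": [1, 3]})
strat_test_set = pd.DataFrame({"median_income": [5.6], "income_cat": [4]})

# Simulate executing the notebook cell twice in a row.
for _ in range(2):
    for set_ in (strat_train_set, strat_test_set):
        # errors="ignore" skips labels that are no longer present.
        set_.drop("income_cat", axis=1, inplace=True, errors="ignore")

print(list(strat_train_set.columns))
print(list(strat_test_set.columns))
```

The trade-off is that errors="ignore" would also hide a genuinely misspelled column name, so an explicit check such as `if "income_cat" in set_.columns:` may be preferable when that matters.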
Thanks @prabinlamsal19 , that's probably what happened, indeed. @biswatig , could you please confirm that the issue is resolved?