
Ch2 page 58: dropping error "['income_cat'] not found in axis"

Open biswatig opened this issue 6 years ago • 4 comments

pd.options.mode.chained_assignment = None
for set_ in (strat_train_set, strat_test_set):
    set_.drop("income_cat", axis=1, inplace=True)

I got the following error:

KeyError                                  Traceback (most recent call last)
<ipython-input-87-1e1581743ea6> in <module>
      1 pd.options.mode.chained_assignment = None
      2 for set_ in (strat_train_set, strat_test_set):
----> 3     set_.drop("income_cat", axis=1, inplace=True)

~\Anaconda3\lib\site-packages\pandas\core\frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3695                                            index=index, columns=columns,
   3696                                            level=level, inplace=inplace,
-> 3697                                            errors=errors)
   3698 
   3699     @rewrite_axis_style_signature('mapper', [('copy', True),

~\Anaconda3\lib\site-packages\pandas\core\generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3109         for axis, labels in axes.items():
   3110             if labels is not None:
-> 3111                 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
   3112 
   3113         if inplace:

~\Anaconda3\lib\site-packages\pandas\core\generic.py in _drop_axis(self, labels, axis, level, errors)
   3141                 new_axis = axis.drop(labels, level=level, errors=errors)
   3142             else:
-> 3143                 new_axis = axis.drop(labels, errors=errors)
   3144             result = self.reindex(**{axis_name: new_axis})
   3145 

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in drop(self, labels, errors)
   4402             if errors != 'ignore':
   4403                 raise KeyError(
-> 4404                     '{} not found in axis'.format(labels[mask]))
   4405             indexer = indexer[~mask]
   4406         return self.delete(indexer)

KeyError: "['income_cat'] not found in axis"

I checked both sets with the head() method, and the income_cat column is present in each of them, but I still get the error. Please help!

biswatig, Jul 05 '19 11:07

Hi, thanks for your question. Did you run the cell that creates the income_cat column?

housing["income_cat"] = pd.cut(housing["median_income"],
                               bins=[0., 1.5, 3.0, 4.5, 6., np.inf],
                               labels=[1, 2, 3, 4, 5])

ageron, Jul 08 '19 17:07

Hi @biswatig, any feedback regarding this issue? A few things to check:

Perhaps there's a space in the name of the column? Do you have the latest pandas version?

Could you please ensure that strat_train_set and strat_test_set are not actually the same DataFrame? That would explain why you see the column in "both" DataFrames: the loop would first remove the income_cat column from that DataFrame, then try to remove it a second time and fail, since it has already been deleted.

If nothing works, I recommend ensuring that you have the latest versions of pandas, scikit-learn, numpy and other libraries, and reverting to the exact same version of the notebook as in the project, without any modifications. Then restart the Jupyter kernel and run the notebook cell by cell, sequentially. Hopefully this will work.
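The checks suggested above could be sketched roughly as follows. This is only an illustration: `diagnose` is a hypothetical helper, and `train` and `test` are small made-up stand-ins for strat_train_set and strat_test_set, not the real housing data.

```python
import pandas as pd

# Made-up stand-ins for strat_train_set and strat_test_set (assumption).
train = pd.DataFrame({"median_income": [1.0, 5.0], "income_cat": [1, 4]})
test = pd.DataFrame({"median_income": [2.0, 7.0], "income_cat": [2, 5]})


def diagnose(train_set, test_set, column="income_cat"):
    """Hypothetical helper: gather the facts the checks above ask about."""
    return {
        # True if both names point at the very same DataFrame object.
        "same_object": train_set is test_set,
        # Exact-name membership, which is what drop() relies on.
        "in_train": column in train_set.columns,
        "in_test": column in test_set.columns,
        # Column names that match only after stripping surrounding whitespace.
        "near_matches": [
            c for c in train_set.columns
            if isinstance(c, str) and c != column and c.strip() == column
        ],
        "pandas_version": pd.__version__,
    }


report = diagnose(train, test)
print(report)
```

If `same_object` is true, or `in_train` is false while `near_matches` is non-empty, that would point to one of the causes listed above rather than to a pandas bug.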

ageron, Oct 14 '19 02:10

I got the same error because I ran the cell twice. The column was deleted on the first run, so running the cell again raises the KeyError.
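One way to make the cell safe to re-run, as a sketch rather than the book's code, is to pass `errors="ignore"` to drop(), the same `errors` parameter that appears in the traceback's drop() signature. The two DataFrames below are made-up stand-ins for the real sets, and `drop_income_cat` is a hypothetical wrapper used only to call the loop twice.

```python
import pandas as pd

# Made-up stand-ins for the real stratified sets (assumption).
strat_train_set = pd.DataFrame({"median_income": [1.0, 5.0], "income_cat": [1, 4]})
strat_test_set = pd.DataFrame({"median_income": [2.0, 7.0], "income_cat": [2, 5]})


def drop_income_cat():
    """Hypothetical wrapper around the loop from the question."""
    for set_ in (strat_train_set, strat_test_set):
        # errors="ignore" skips labels that are missing instead of raising KeyError.
        set_.drop("income_cat", axis=1, inplace=True, errors="ignore")


drop_income_cat()  # first run: removes the column from both sets
drop_income_cat()  # second run: nothing left to drop, and no KeyError

print(list(strat_train_set.columns))
print(list(strat_test_set.columns))
```

The trade-off is that `errors="ignore"` would also hide a genuinely misspelled column name, so restarting the kernel and running the cells in order remains the more transparent fix.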

prabinlamsal19, Jan 22 '21 18:01

Thanks @prabinlamsal19, that's probably what happened, indeed. @biswatig, could you please confirm that the issue is resolved?

ageron, Mar 21 '21 21:03