category_encoders

Error handling in inverse_transform is broken

Open janmotl opened this issue 6 years ago • 0 comments

Ideally, inverse_transform() should handle the absence of columns that were dropped because of drop_invariant=True. If that is not possible, it should at least raise an informative error instead of just crashing.

Example in the form of a parameterized unit test:

    def test_inverse_wrong_feature_count_with_drop_invariant(self):
        # assumes `import category_encoders as encoders` at module level
        x = [['A', 'B', 'C'], ['D', 'E', 'C'], ['F', 'G', 'C']]  # the last column is constant
        for encoder_name in ['BaseNEncoder', 'BinaryEncoder', 'OrdinalEncoder', 'OneHotEncoder']:
            with self.subTest(encoder_name=encoder_name):
                enc = getattr(encoders, encoder_name)(drop_invariant=True)
                transformed = enc.fit_transform(x)

                # run inverse_transform() and check the raised exception text
                with self.assertRaises(ValueError) as cm:
                    enc.inverse_transform(transformed)
                self.assertTrue(str(cm.exception).startswith('Unexpected input dimension'))
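
For illustration only, here is a rough sketch of the kind of guard that would make the test above pass. The helper name `check_inverse_input_dim` and its parameters are hypothetical and not part of category_encoders; the only thing taken from the test is the message prefix 'Unexpected input dimension'.

```python
def check_inverse_input_dim(n_received, n_expected, dropped_cols=()):
    """Hypothetical guard, not library code.

    Raises a ValueError with a readable message when the number of columns
    passed to inverse_transform() differs from the number the encoder
    produced before any invariant columns were dropped.
    """
    if n_received != n_expected:
        msg = "Unexpected input dimension %d, expected %d" % (n_received, n_expected)
        if dropped_cols:
            # Point the user at the likely cause instead of failing obscurely.
            msg += "; columns %s were dropped because drop_invariant=True" % list(dropped_cols)
        raise ValueError(msg)
```

Under this assumption, an encoder would call the guard at the start of inverse_transform(), passing the column count of its input and the column count it produced during fit, so the user sees the cause instead of an unrelated crash.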

janmotl, May 07 '19 11:05