feature_engine
Ordinal encoder outputs -1 for unknown Categories
This is my shot for #428.
I noticed that OrdinalEncoder inherits the transform method from CategoricalMethodsMixin, so I added a condition to output -1 only for OrdinalEncoder.
Additionally, I had to fix test_error_if_input_df_contains_categories_not_present_in_training_df so that it expects an error when errors='raise' and checks the encoded output when errors='ignore'.
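To make the intended behavior concrete, here is a minimal sketch in plain pandas. It is not the feature_engine code: the helper name encode_ordinal and its signature are hypothetical, and it assumes the input has no missing values, so anything absent from the mapping counts as an unseen category.

```python
import pandas as pd


def encode_ordinal(values: pd.Series, mapping: dict, errors: str = "ignore") -> pd.Series:
    """Hypothetical helper illustrating the idea; not feature_engine's implementation."""
    if errors not in ("ignore", "raise"):
        raise ValueError("errors must be 'ignore' or 'raise'")

    # Series.map leaves NaN wherever a category is missing from the mapping.
    encoded = values.map(mapping)
    unseen = encoded.isna()

    if unseen.any() and errors == "raise":
        raise ValueError(f"Unseen categories: {sorted(values[unseen].unique())}")

    # With errors='ignore', unseen categories are encoded as -1.
    return encoded.fillna(-1).astype("int64")


# Learn the mapping from the training data, then encode data with a new category.
train = pd.Series(["red", "green", "blue"])
mapping = {category: code for code, category in enumerate(train.unique())}
test = pd.Series(["green", "purple"])

print(encode_ordinal(test, mapping).tolist())  # [1, -1]
```

Calling encode_ordinal(test, mapping, errors="raise") on the same data raises a ValueError instead, which is the pair of cases the fixed test is meant to cover.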
I'm not sure what happened with this branch. My fork is up to date, but for some reason the PR includes changes from the previously merged commit. I will check what happened here...
FYI #502
Hey @solegalli, I'm not sure whether this is already implemented or whether there is still work to do here. Either way, I would like to keep contributing to Feature Engine, so I would be happy to help with this PR or any other.
Hi @datacubeR
In this PR, we want the OrdinalEncoder to output -1 for unseen categories. It is not implemented yet.
We did something similar for the CountFrequencyEncoder, where we made it output 0 for unseen categories.
My suggestion was that you use the implementation in CountFrequencyEncoder as a template for this PR.
Also, since this PR was first made, we made a few structural changes to the main source code, so the imports changed slightly. It therefore needs rebasing.
It would be super useful if you could finish this PR first, which should include very few code changes. And then I would be more than happy to tag you in a new PR :)
Thanks a lot!
Functionality added in #539.