replace concat in encoders by y_train.groupby(X_train["variable"]).mean().sort_values()

Open solegalli opened this issue 3 years ago • 1 comments

Applies to the following encoders:

OrdinalEncoder
MeanEncoder
PRatioEncoder
WoEEncoder

Currently, we concatenate the target to the predictors (X_train) to determine the mappings. This is not necessary. We can determine them directly with syntax like this:

y_train.groupby(X_train["A7"]).mean().sort_values()

This removes one unnecessary step from the calculation: the pd.concat step.
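To illustrate the equivalence, here is a minimal sketch on toy data. The data, the target name "target", and the variable names (temp, means_concat, means_direct, ordinal_mapping) are made up for the example and are not taken from the encoders' code. It assumes X_train and y_train share the same index, since groupby aligns the grouping Series with y_train on index labels.

```python
import pandas as pd

# Toy data (hypothetical): one categorical predictor and a binary target.
X_train = pd.DataFrame({"A7": ["a", "b", "a", "c", "b", "c"]})
y_train = pd.Series([1, 0, 1, 0, 1, 0], name="target")

# Concat-based approach: join the target to the predictors, then group.
temp = pd.concat([X_train, y_train], axis=1)
means_concat = temp.groupby("A7")["target"].mean().sort_values()

# Proposed approach: group the target directly by the predictor column.
means_direct = y_train.groupby(X_train["A7"]).mean().sort_values()

# Example of an ordered ordinal mapping built from the sorted means.
ordinal_mapping = {cat: rank for rank, cat in enumerate(means_direct.index)}

print(means_direct)
print(ordinal_mapping)
```

Both approaches give the same per-category target means here, so the intermediate DataFrame from pd.concat is not needed. If the indices of X_train and y_train could differ, they would need to be aligned first.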

Apr 25 '22 15:04 solegalli

@noahjgreen295 @bmreiniger would this be an interesting issue to work on?

Apr 25 '22 17:04 solegalli