category_encoders
Fix bad WOE scores.
I found that the WOEEncoder was not giving the right scores because the aggregate stats computed by pandas were incorrect. The stats were correct when we grouped by a numpy array instead of a pandas column, so that is the fix offered here.
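A minimal sketch of how this can happen, assuming the cause is index misalignment between the feature column and the target (the toy data and variable names below are hypothetical, not taken from the encoder). When the group key is a pandas Series, `groupby` aligns it to the target's index labels; when the key is a numpy array, grouping is purely positional.

```python
import pandas as pd

# Hypothetical toy data: the feature column and the target hold the same rows
# in the same order, but carry different index labels (e.g. after filtering
# or shuffling one of them without resetting its index).
x = pd.Series(["a", "a", "b", "b"], index=[0, 1, 2, 3])
y = pd.Series([1, 1, 0, 0], index=[3, 2, 1, 0])

# Grouping by a Series: pandas reindexes the key to y's index labels first,
# so each target value is paired with the category that shares its label,
# not the one that shares its position.
by_series = y.groupby(x).agg(["sum", "count"])

# Grouping by a numpy array: no labels are involved, so values are paired
# strictly by position.
by_array = y.groupby(x.to_numpy()).agg(["sum", "count"])

print(by_series)  # category "a" gets sum 0, "b" gets sum 2
print(by_array)   # category "a" gets sum 2, "b" gets sum 0
```

If the two objects share the same index, both calls return identical stats, which may be why the doctest example does not show the problem.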
Fixes #
Proposed Changes
Hi @slundberg
could you please provide an example of the error? The example given in the doctest seems to work.
I'm a little concerned about changing this here, since, for example, the M-Estimate encoder (https://github.com/scikit-learn-contrib/category_encoders/blob/master/category_encoders/m_estimate.py#L259) uses a very similar line. Is it broken or incorrect in your case as well?
Is this the same problem as in #280? If so, there are other estimators that need a similar fix.
This issue should be fixed by the index alignment merged in #320. @slundberg, could you please confirm?