category_encoders

Fix bad WOE scores.

Open slundberg opened this issue 3 years ago • 3 comments

I found that the WOEEncoder was not giving the right scores because the aggregate statistics returned by pandas were incorrect. The statistics were correct when we grouped by a NumPy array instead of a pandas column, so that is the fix offered here.
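For readers following along, here is a minimal sketch of how this kind of discrepancy can arise. It is not taken from the encoder itself and assumes the cause is an index mismatch between the target and the feature column. The names `col` and `y` are made up for illustration. When the grouping key is a pandas Series, `groupby` aligns it to the grouped object by index label, whereas a NumPy array is matched by position.

```python
import pandas as pd

# Illustrative data (assumption): a feature column with the default index 0..3
col = pd.Series(["a", "a", "b", "b"])
# A target of the same length whose index labels are in reverse order
y = pd.Series([1, 1, 0, 0], index=[3, 2, 1, 0])

# Grouping by a Series: the key is aligned to y by index label, so the
# positives end up attributed to category "b" instead of "a".
by_series = y.groupby(col).sum()

# Grouping by a NumPy array: the key is matched to y by position.
by_array = y.groupby(col.to_numpy()).sum()

print(by_series.to_dict())  # {'a': 0, 'b': 2}
print(by_array.to_dict())   # {'a': 2, 'b': 0}
```

If the two indices already match, both forms give the same result, which may be why a simple doctest does not show the problem.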

Fixes #

Proposed Changes

slundberg, Jun 01 '21 17:06

Hi @slundberg, could you please provide an example of the error? The example given in the doctest seems to work.
I'm a little concerned about changing this here because, for example, the M-Estimate encoder (https://github.com/scikit-learn-contrib/category_encoders/blob/master/category_encoders/m_estimate.py#L259) uses a very similar line. Is that one broken or incorrect in your case as well?

PaulWestenthanner, Oct 08 '21 16:10

Is this the same problem as in #280? If so, there are other estimators that need a similar fix.

bmreiniger, Oct 18 '21 02:10

This issue should be fixed by the index alignment merged in #320. @slundberg, could you please confirm?
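As a rough illustration of what aligning the indices can look like (a sketch only, not the code merged in #320): `align_target_index` below is a hypothetical helper, and the data is made up. It re-labels the target with the feature frame's index, matching rows by position, so that grouping by a pandas column afterwards gives the positional result.

```python
import pandas as pd


def align_target_index(X: pd.DataFrame, y: pd.Series) -> pd.Series:
    """Hypothetical helper: return y re-labelled with X's index, matched by position."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    return pd.Series(y.to_numpy(), index=X.index, name=y.name)


# Illustrative data (assumption): same rows, but y carries different index labels
X = pd.DataFrame({"col": ["a", "a", "b", "b"]})
y = pd.Series([1, 1, 0, 0], index=[3, 2, 1, 0])

y_aligned = align_target_index(X, y)

# With matching indices, grouping by the pandas column is safe again.
sums = y_aligned.groupby(X["col"]).sum()
print(sums.to_dict())  # {'a': 2, 'b': 0}
```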

PaulWestenthanner, Nov 03 '21 17:11