
Inverted uplift scores of revert label (ClassTransformation)?

Open CoteDave opened this issue 3 years ago • 4 comments

🐛 Bug

scikit-uplift==0.5.1

Hi, I just compared the uplift scores of ClassTransformation with those of other uplift strategies (SoloModel (S-learner), X-learner), and the ClassTransformation scores seem very off!

Since ClassTransformation transforms the target before training, should it transform the predicted uplift scores back in the output? How should the scores be interpreted compared to the other strategies? Is this a bug?

image

And if we draw the Qini curves, we clearly see that the uplift scores of the revert label (ClassTransformation) seem inverted:

image

Nothing special in the .fit() call: X, Y, T and the estimator are the same for SoloModel() and ClassTransformation().

CoteDave avatar Oct 13 '22 15:10 CoteDave

@CoteDave Hello!

Thanks a lot for providing information. A very strange problem.

Could you provide the code and data to reproduce this bug?

maks-sh avatar Oct 19 '22 09:10 maks-sh

Hi Maksim,

Here is the code. As you can see, it is pretty simple and out of the box, and exactly the same data is used for SoloModel and ClassTransformation, but the predicted uplift scores for ClassTransformation seem weirdly distributed. The other two models are SLearner and XLearner from the EconML library.

image

Unfortunately, I can't share the data (enterprise). Some facts about the setup:

  • X shape: (557622, 136). Categorical features are target-encoded with the category_encoders library. No scaling, as the model is gradient boosting (CatBoost)
  • Y (557622, 1): binary target, with only 2.3763% of 1s
  • T (557622, 1): binary treatment, with only 4.6662% of 1s
  • propensity_model1 = CatBoostClassifier(n_estimators=1000, max_depth=6, learning_rate = 0.08, silent = True, early_stopping_rounds = 6)

And that's it!

CoteDave avatar Oct 19 '22 13:10 CoteDave

Thanks a lot!

We will try to reproduce the bug, and then we will return with the results 🙌

maks-sh avatar Oct 19 '22 19:10 maks-sh

Hi, I found the problem.

Class transformation is only designed for datasets where T0 and T1 are balanced. My dataset is highly skewed (fewer T1), so I can't use the class transformation approach, since its first assumption is that T1 and T0 are balanced.
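To make the balance assumption concrete, here is a minimal sketch. It assumes the textbook class-transformation formulation, where the new target is Z = 1 when Y equals T and the score is 2 * P(Z = 1) - 1; it is not taken from the library's source. The function name and the 2% / 3% response rates are hypothetical, and only the 4.6662% treatment share comes from the setup described in this thread.

```python
def class_transformation_score(treatment_share, p_y_control, p_y_treated):
    """Return 2 * P(Z = 1) - 1, where Z = 1 when outcome Y equals treatment flag T."""
    # Z = 1 happens for treated responders (T=1, Y=1)
    # and for control non-responders (T=0, Y=0).
    p_z = treatment_share * p_y_treated + (1 - treatment_share) * (1 - p_y_control)
    return 2 * p_z - 1


# Hypothetical response rates: 2% in control, 3% in treatment, so true uplift = 0.01.
P_CONTROL, P_TREATED = 0.02, 0.03

balanced = class_transformation_score(0.5, P_CONTROL, P_TREATED)
skewed = class_transformation_score(0.046662, P_CONTROL, P_TREATED)

print(f"true uplift:            {P_TREATED - P_CONTROL:.4f}")
print(f"score, 50% treated:     {balanced:.4f}")
print(f"score, 4.6662% treated: {skewed:.4f}")
```

With a 50/50 split the score reduces to P(Y=1 | T=1) - P(Y=1 | T=0), which is the uplift. With only 4.6662% treated, Z = 1 is dominated by control non-responders, so under these assumed rates the score lands at about 0.87 instead of 0.01 and is no longer comparable to the other strategies.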

DaveCoteDS avatar Dec 09 '22 21:12 DaveCoteDS