scikit-uplift
Inverted uplift scores with the revert-label approach (ClassTransformation)?
🐛 Bug
scikit-uplift==0.5.1
Hi, I just compared the uplift scores of ClassTransformation with other uplift strategies (SoloModel (S-learner), X-learner), and the uplift scores from the class transformation seem very off!
Since ClassTransformation transforms the target before training, should it transform the predicted uplift scores back in the output? How should the scores be interpreted compared to the other strategies? Is this a bug?

And if we draw the qini curves, we clearly see that the uplift scores of the revert label (ClassTransformation) seem inverted:

There is nothing special in the .fit() call: X, Y, T and the estimator are the same for SoloModel() and ClassTransformation().
@CoteDave Hello!
Thanks a lot for providing information. A very strange problem.
Could you provide the code and data to reproduce this bug?
Hi Maksim,
Here is the code. As you can see, it is pretty simple and out of the box, and exactly the same data is used for SoloModel and ClassTransformation, but the predicted uplift scores for ClassTransformation are distributed weirdly. The other two models are SLearner and XLearner from the EconML library.

Unfortunately, I can't share the data (it is enterprise data). Some facts about the setup:
- X shape: (557622, 136). Categorical features are target encoded with the category_encoders library. No scaling, as the model is gradient boosting (CatBoost)
- Y (557622, 1): Binary target, with only 2.3763% of 1
- T (557622, 1): Binary treatment with only 4.6662% of 1
- propensity_model1 = CatBoostClassifier(n_estimators=1000, max_depth=6, learning_rate = 0.08, silent = True, early_stopping_rounds = 6)
And that's it!
Thanks a lot!
We will try to reproduce the bug, and then we will return with the results 🙌
Hi, I found the problem.
The class transformation is only valid for datasets where T0 and T1 are balanced. My dataset is highly skewed (far fewer T1), so I can't use the class transformation algorithms, since their first assumption is that T1 and T0 are balanced.
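
For anyone hitting the same thing, here is a minimal sketch of why the balance assumption matters. It uses synthetic data only (the rates below are made up to roughly resemble my setup, not taken from the real dataset) and plain NumPy rather than scikit-uplift. With Z = Y*T + (1-Y)*(1-T), the formula uplift = 2*P(Z=1) - 1 only holds when P(T=1) = 0.5. For comparison, the sketch also computes a propensity-weighted transformed outcome, Y* = Y*(T - p) / (p*(1 - p)), which is a different technique and is shown here only as an illustration under the assumption that the treatment propensity p is known.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, hypothetical setup: rare treatment and rare positive target.
n = 1_000_000
p_treat = 0.05       # assumed known P(T=1), far from the 0.5 that class transformation needs
base_rate = 0.02     # P(Y=1 | T=0)
true_uplift = 0.01   # P(Y=1 | T=1) - P(Y=1 | T=0)

t = rng.binomial(1, p_treat, size=n)
y = rng.binomial(1, base_rate + true_uplift * t)

# Class transformation (revert label): Z = 1 for treated responders
# and for untreated non-responders.
z = y * t + (1 - y) * (1 - t)
ct_uplift = 2 * z.mean() - 1   # only equals the uplift when P(T=1) = 0.5

# Propensity-weighted transformed outcome: its mean estimates the uplift
# for any known propensity strictly between 0 and 1.
y_star = y * (t - p_treat) / (p_treat * (1 - p_treat))
to_uplift = y_star.mean()

print(f"true uplift:                  {true_uplift:.4f}")
print(f"class transformation formula: {ct_uplift:.4f}")
print(f"transformed outcome estimate: {to_uplift:.4f}")
```

On this skewed synthetic data, the class transformation value is dominated by the large share of untreated non-responders rather than by the treatment effect, which is consistent with the odd score distribution I saw.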