causalml
Using Gain and Qini for cross validation
I have a question regarding the [meta_learners_with_synthetic_data](https://github.com/uber/causalml/blob/master/examples/meta_learners_with_synthetic_data.ipynb) example.
When tuning the meta-learners via cross-validation:
- Should the cumulative gain or Qini AUUC be used as the metric to select hyperparameters, or should other regression metrics be used?
- In the case of gain/Qini, should the area under the random curve be subtracted?
- Are these metrics robust enough to handle the fact that the ATE in each training or validation fold may be slightly different (so we are not comparing apples to apples)?
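On the second bullet: subtracting the random (diagonal) baseline is what makes scores comparable across models. Here is a minimal hand-rolled sketch of a Qini score with the random curve subtracted; the function name and the mean-based normalization are my own choices, not causalml's API:

```python
import numpy as np

def qini_score(y, w, tau):
    """Qini score: mean height of the Qini curve above the random diagonal.

    y: outcome, w: binary treatment indicator, tau: predicted uplift.
    (Hypothetical helper for illustration, not causalml's implementation.)
    """
    order = np.argsort(-tau)              # rank by predicted uplift, best first
    y, w = np.asarray(y)[order], np.asarray(w)[order]
    cum_y_t = np.cumsum(y * w)            # cumulative treated responders
    cum_y_c = np.cumsum(y * (1 - w))      # cumulative control responders
    n_t = np.cumsum(w)
    n_c = np.maximum(np.cumsum(1 - w), 1)
    qini = cum_y_t - cum_y_c * n_t / n_c  # incremental gain curve
    diag = qini[-1] * np.arange(1, len(y) + 1) / len(y)  # random baseline
    return float(np.mean(qini - diag))    # average height above the random curve
```

A model that ranks the persuadable units first should score well above zero under this definition, while random predictions should land near zero.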
I am also facing similar problems, especially because these metrics can be difficult to use as objectives in hyperparameter optimization.
I am running hyperparameter tuning for an uplift random forest classifier, and I get AUUC scores anywhere between 0.2 and 20-something. It may well be due to overfitting, but I don't understand how such wildly divergent values can be produced by fitting the same model to the same dataset with the same scoring functions.
Does anyone know best practices for hyperparameter tuning in such cases? This is computationally costly, and every failed trial takes hours to complete.
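In case it helps, one pattern that has worked for me is to score each hyperparameter candidate by its average out-of-fold Qini rather than its in-sample score, so overfit candidates get penalized. This is only a sketch under my own assumptions: the T-learner built from two `GradientBoostingRegressor`s and the `qini_score` helper are stand-ins, not causalml's tuner or metric functions:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

def qini_score(y, w, tau):
    """Mean height of the Qini curve above its random diagonal (hypothetical helper)."""
    order = np.argsort(-tau)
    y, w = np.asarray(y)[order], np.asarray(w)[order]
    qini = (np.cumsum(y * w)
            - np.cumsum(y * (1 - w)) * np.cumsum(w) / np.maximum(np.cumsum(1 - w), 1))
    diag = qini[-1] * np.arange(1, len(y) + 1) / len(y)
    return float(np.mean(qini - diag))

def cv_qini(X, y, w, params, n_splits=5, seed=42):
    """Average out-of-fold Qini for a simple T-learner with given hyperparameters."""
    scores = []
    for tr, va in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        t, c = tr[w[tr] == 1], tr[w[tr] == 0]
        m1 = GradientBoostingRegressor(**params).fit(X[t], y[t])  # treated model
        m0 = GradientBoostingRegressor(**params).fit(X[c], y[c])  # control model
        tau = m1.predict(X[va]) - m0.predict(X[va])               # predicted uplift
        scores.append(qini_score(y[va], w[va], tau))
    return float(np.mean(scores))
```

Pick the candidate with the highest `cv_qini`. Since each fold's ATE differs slightly, compare candidates on identical folds (a fixed seed), so the fold-level noise cancels across candidates.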
I am also having these same issues. With randomized experimental data with four treatment arms and one control group, comparing each treatment to the control gives wildly divergent Qini scores, sometimes for just one treatment-control comparison. Also, shouldn't the random Qini curve at least roughly resemble a diagonal, reflecting the 50/50 randomness?
I'm confused as to why this seems so erratic at times. Any help in understanding this is definitely appreciated. Thanks.
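One thing worth checking before blaming overfitting: the raw Qini area is linear in the outcome scale and grows with sample size, so arms with different response rates (or folds of different sizes) are not comparable on the raw number. A quick demonstration, using a hand-rolled area function (my own assumption, not causalml's) and a scale-free normalization by the total outcome:

```python
import numpy as np

def raw_qini_area(y, w, tau):
    """Unnormalized area between the Qini curve and its diagonal (hypothetical helper)."""
    order = np.argsort(-tau)
    y, w = np.asarray(y)[order], np.asarray(w)[order]
    qini = (np.cumsum(y * w)
            - np.cumsum(y * (1 - w)) * np.cumsum(w) / np.maximum(np.cumsum(1 - w), 1))
    diag = qini[-1] * np.arange(1, len(y) + 1) / len(y)
    return float(np.sum(qini - diag))

rng = np.random.default_rng(2)
n = 1000
w = rng.integers(0, 2, n)
tau = rng.random(n)
y = 1.0 + tau * w + rng.normal(0, 0.1, n)

raw = raw_qini_area(y, w, tau)
# Rescaling the outcome rescales the raw area by exactly the same factor,
# which is one way arms with different response rates diverge wildly...
scaled = raw_qini_area(10 * y, w, tau)
# ...so a scale-free comparison divides by the total outcome, for example:
norm = raw / y.sum()
norm_scaled = scaled / (10 * y).sum()
```

On the diagonal question: the random curve is only a diagonal in expectation; early in the ranking, the cumulative control count in the denominator is small, so the empirical curve can swing far from the diagonal before settling down.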