
Benchmark with independent classification model

jalvear2dxc opened this issue 3 years ago · 3 comments

Hello @arogozhnikov ,

To check the quality of the reweighting process, I used an independent classifier based on the UGradientBoostingClassifier class on the same dataset, following these steps:

  1. Before Reweighting: Training (using prior weights as sample weights) and scoring
  2. Reweighting
  3. After Reweighting: Training using new weights as sample weights and scoring
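A rough sketch of these steps, with a simple classifier-based density-ratio estimate standing in for the actual reweighter and sklearn's GradientBoostingClassifier as the independent classifier (all names, parameters, and the toy data are illustrative, not the code from this issue):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy "original" and "target" samples with a deliberate shift.
original = rng.normal(0.0, 1.0, size=(3000, 2))
target = rng.normal(0.5, 1.0, size=(3000, 2))
X = np.vstack([original, target])
y = np.concatenate([np.zeros(len(original)), np.ones(len(target))])

def discrepancy_auc(w_original):
    """Train an independent classifier to separate original vs. target
    and return its held-out weighted AUC (0.5 = no discrepancy left)."""
    w = np.concatenate([w_original, np.ones(len(target))])
    X_tr, X_te, y_tr, y_te, w_tr, w_te = train_test_split(
        X, y, w, test_size=0.5, random_state=0)
    clf = GradientBoostingClassifier(n_estimators=50, max_depth=3,
                                     random_state=0)
    clf.fit(X_tr, y_tr, sample_weight=w_tr)
    return roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1],
                         sample_weight=w_te)

# 1. Before reweighting: prior (here unit) weights.
auc_before = discrepancy_auc(np.ones(len(original)))

# 2. Reweighting: a classifier-based density-ratio estimate stands in
#    for the dedicated reweighter.
rw = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=1)
rw.fit(X, y)
p = rw.predict_proba(original)[:, 1]
new_w = p / np.clip(1.0 - p, 1e-6, None)    # odds ratio ~ density ratio
new_w *= len(original) / new_w.sum()        # keep the total weight fixed

# 3. After reweighting: retrain and rescore with the new weights.
auc_after = discrepancy_auc(new_w)
print(f"AUC before: {auc_before:.3f}, AUC after: {auc_after:.3f}")
```

If the reweighting works, auc_after should sit close to 0.5 while auc_before reflects the original discrepancy.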

When comparing the results with those of the reweighter's classifier (rw.gb), I find that the decrease in the weighted AUC is much greater than that obtained with the independent classifier.

Results before reweighting: classifier AUC = 0.99, rw.gb AUC = 0.99
Results after reweighting: classifier AUC = 0.95, rw.gb AUC = 0.55

Could you help me identify a possible cause of this difference in behavior?

jalvear2dxc commented May 21 '21 09:05

Hi @jalvear2dxc, I'm not completely following which classifiers you compare, but the large difference you report is possible.

Naturally, reweighting removes the discrepancies that are picked up by models whose tree configuration (e.g. depth) is similar to that of the reweighter's trees. If you use a uniforming loss, this may become an additional hint to the classifier (though that is hard to predict without understanding/pondering the data).

Also, check that you use the correct weights in every training and in every AUC scoring. Just in case.
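Concretely, "correct weights in scoring" means the same per-event weights must reach the AUC computation, not only fit; forgetting the sample_weight argument there can shift the number substantially. A minimal sketch with toy scores (sklearn's roc_auc_score assumed; all numbers illustrative):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy scores for six events, three per class.
y = np.array([0, 0, 0, 1, 1, 1])
scores = np.array([0.1, 0.2, 0.9, 0.4, 0.8, 0.95])

# Down-weight the one class-0 event the classifier mis-ranks.
weights = np.array([1.0, 1.0, 0.1, 1.0, 1.0, 1.0])

auc_plain = roc_auc_score(y, scores)
auc_weighted = roc_auc_score(y, scores, sample_weight=weights)
print(auc_plain, auc_weighted)
```

Here the unweighted AUC is 7/9 while the weighted AUC is 6.1/6.3, so scoring with the wrong (or missing) weights gives a visibly different picture of the same predictions.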

arogozhnikov commented May 27 '21 05:05

Thanks, @arogozhnikov.

I've dramatically improved the results by not training a new classifier after the reweighting, but instead correcting the predictions of the first model with the predicted weights. Does that make sense? I think this is in line with what you said in your answer.
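If I read that correction right, it keeps the first model fixed and only re-scores its predictions under the new weights. A toy sketch of why that alone can pull the weighted AUC toward 0.5 (the score distributions and the weight formula are purely illustrative, not the actual reweighter output):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 2000

# Fixed predictions of the already-trained first model (illustrative):
# original-sample scores ~ N(0, 1), target-sample scores ~ N(1, 1).
y = np.concatenate([np.zeros(n), np.ones(n)])
scores = np.concatenate([rng.normal(0.0, 1.0, n), rng.normal(1.0, 1.0, n)])

# Hypothetical reweighter output: exponentially tilt the original
# sample so its weighted score distribution matches the target's.
w_orig = np.exp(scores[:n])
w_orig *= n / w_orig.sum()
weights = np.concatenate([w_orig, np.ones(n)])

auc_before = roc_auc_score(y, scores)
auc_after = roc_auc_score(y, scores, sample_weight=weights)
print(f"unweighted AUC: {auc_before:.3f}, reweighted AUC: {auc_after:.3f}")
```

The model's predictions never change; only the weights used in scoring do, and that is enough to move the AUC from a clear separation to near 0.5.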

jalvear2dxc commented May 27 '21 06:05

@jalvear2dxc yes, that seems to match what I suggested.

arogozhnikov commented Jun 01 '21 05:06