Scenario-Wise-Rec

There is a problem in AliCCP dataset

Open eldercola opened this issue 1 year ago • 1 comments

The second domain of AliCCP's '301' feature field appears to contain only negative samples. I noticed this while calculating AUC and log loss for that domain: both raise a ValueError because the domain's samples carry only one label, and when I checked, they were all negative. (-1 is the special value I use to handle this ValueError.)

[Screenshot 2024-05-08 10:42:37]
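
For readers who hit the same error, here is a minimal sketch of the guard described above, written in plain Python so it does not depend on any particular metrics library. The function names `safe_auc` and `binary_log_loss` and the `fallback` parameter are hypothetical, not part of Scenario-Wise-Rec. The AUC here is the simple pairwise (Mann-Whitney) form, which is quadratic in the number of samples and only meant for illustration.

```python
import math


def safe_auc(labels, scores, fallback=-1.0):
    """Pairwise AUC; returns `fallback` when only one class is present.

    AUC is undefined without at least one positive and one negative sample,
    which is the situation reported for the second domain.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return fallback  # sentinel value, as in the issue (-1)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def binary_log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy with probabilities clipped to [eps, 1 - eps]."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# A domain with only negatives: AUC falls back to -1, log loss is still computed.
print(safe_auc([0, 0, 0], [0.1, 0.2, 0.3]))
print(binary_log_loss([0, 0], [0.5, 0.5]))
```

Unlike AUC, binary log loss is mathematically defined for a single-class batch, so a sentinel is only strictly needed for AUC. A library may still refuse to infer the label set from single-class input.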

eldercola, May 08 '24 02:05

Thanks for your attention to our benchmark. It seems you are not using the proper training dataset. Could you please check again whether you used the proper training dataset file for AliCCP? Each domain in AliCCP does contain both types of samples, positive and negative:

| domains | 301-1 | 301-2 | 301-3 |
|---|---|---|---|
| positive | 634783 | 13923 | 995550 |
| negative | 15250588 | 304950 | 25100111 |
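
One way to verify which file was loaded is to count labels per domain and compare them with the numbers above. The following is a rough sketch only: it assumes the data can be iterated as `(domain, label)` pairs with binary labels, and the helper names are hypothetical rather than part of the benchmark code.

```python
from collections import Counter


def label_counts_by_domain(rows):
    """Count labels per domain from an iterable of (domain, label) pairs."""
    counts = {}
    for domain, label in rows:
        counts.setdefault(domain, Counter())[label] += 1
    return counts


def single_class_domains(counts):
    """Return the domains that are missing either positives (1) or negatives (0)."""
    return sorted(d for d, c in counts.items() if c[0] == 0 or c[1] == 0)


# Toy data: domain 2 has only negative samples, as described in the report.
rows = [(1, 0), (1, 1), (2, 0), (2, 0), (3, 1), (3, 0)]
counts = label_counts_by_domain(rows)
for domain in sorted(counts):
    print(domain, "positive:", counts[domain][1], "negative:", counts[domain][0])
print("single-class domains:", single_class_domains(counts))
```

If a domain shows up as single-class on the real file, the file probably differs from the one used to produce the counts above.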

Xiaopengli1, May 08 '24 03:05