
Reproducing OOD scores for CIFAR-10 vs SVHN

Open christophbrgr opened this issue 3 years ago • 5 comments

Hi there,

I'm currently trying to reproduce your results from the SNGP paper (https://arxiv.org/abs/2006.10108), and I get a much higher AUPR for the baseline deterministic WideResNet when separating CIFAR-10 from SVHN as the OOD set. I also don't see how the result for CIFAR-10 can be worse than for CIFAR-100, since this relationship is reversed for all other tested methods.

It would be great if you could take a look and advise whether this is a typo in the paper, or share how you arrive at the OOD scores for this setup. My results for the OOD task with a vanilla WRN are AUPR = 0.899 and AUROC = 0.931.

[Screenshot 2021-01-26 at 08 28 51]
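For reference, here is a minimal sketch of how an OOD score and AUROC can be computed for a deterministic classifier. It assumes the score is 1 minus the maximum softmax probability, which is a common choice but not necessarily what the paper or this repository uses. The logits are toy values invented for illustration, and `ood_score` and `auroc` are hypothetical helper names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ood_score(logits):
    """Assumed score: 1 - max softmax probability (higher = more OOD-like)."""
    return 1.0 - max(softmax(logits))


def auroc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. Label 1 marks the positive class.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy logits, invented for illustration only (not real model outputs).
in_dist_logits = [[8.0, 0.5, 0.1], [6.0, 1.0, 0.2], [1.2, 1.0, 0.9]]
ood_logits = [[1.0, 0.9, 1.1], [2.0, 1.8, 0.1], [5.0, 0.3, 0.2]]

scores = [ood_score(z) for z in in_dist_logits + ood_logits]
labels = [0] * len(in_dist_logits) + [1] * len(ood_logits)  # OOD = positive

print(f"AUROC (OOD as positive): {auroc(scores, labels):.3f}")
```

AUROC is unchanged if the labels are flipped and the scores negated, so the choice of positive class only matters for AUPR.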

Thanks a lot!

christophbrgr, Jan 29 '21 07:01

Hi, can you please point me to the script that reproduces the OOD results? Thanks

mdabbah, Mar 03 '21 08:03

@mdabbah do you mean the script in this repository here or the code I used to reproduce the results?

christophbrgr, Mar 05 '21 10:03

the script in this repository

mdabbah, Mar 05 '21 11:03

Hi Christoph!

So sorry for the late reply. Just to make sure: did you treat OOD as the positive label or the negative label? (I'm asking because AUPR is sensitive to label imbalance.) Would you mind sharing the AUPR value in both cases, i.e., with OOD treated as the positive label and with OOD treated as the negative label?
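To illustrate why the choice matters, here is a rough sketch that computes AUPR under both conventions. It assumes the full test splits are used (10,000 CIFAR-10 images and 26,032 SVHN images), so OOD examples make up about 72% of the evaluation set. The scores are synthetic Gaussian draws, not real model outputs, and `average_precision` is a hypothetical helper written here rather than a function from this repository.

```python
import random


def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve.

    Label 1 marks the positive class; higher scores should indicate positives.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


random.seed(0)
n_in, n_ood = 10000, 26032  # CIFAR-10 and SVHN test split sizes

# Synthetic uncertainty scores (assumption): OOD inputs score higher on average.
in_scores = [random.gauss(0.0, 1.0) for _ in range(n_in)]
ood_scores = [random.gauss(1.5, 1.0) for _ in range(n_ood)]
scores = in_scores + ood_scores
is_ood = [0] * n_in + [1] * n_ood

# Convention 1: OOD is the positive class.
aupr_ood_pos = average_precision(scores, is_ood)

# Convention 2: in-distribution is the positive class,
# so flip the labels and negate the scores.
aupr_in_pos = average_precision([-s for s in scores], [1 - y for y in is_ood])

print(f"AUPR, OOD as positive:             {aupr_ood_pos:.3f}")
print(f"AUPR, in-distribution as positive: {aupr_in_pos:.3f}")
```

The same set of scores gives two different AUPR values because the positive-class base rate changes from roughly 0.72 to roughly 0.28, so a reported AUPR is only comparable when the convention is known.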

Thanks! Jeremiah

jereliu, May 11 '21 16:05

Hello,

I was trying to reproduce the results from the paper for the deterministic WideResNet and SNGP WideResNet models trained on CIFAR-10, with SVHN and CIFAR-100 as OOD sets. I have attached my results below.

[Image: wideresnet_cifar10_results]

Commands used:

python3 baselines/cifar/deterministic.py --dataset=cifar10 --num_cores=2 --data_dir=./tensorflow_datasets --output_dir=./model_detr --use_gpu=true --download_data=True

python3 baselines/cifar/sngp.py --dataset=cifar10 --num_cores=2 --data_dir=./tensorflow_datasets --output_dir=./model --use_gpu=true --download_data=True

The deterministic OOD scores seem very different from those in the paper. Can you help us see why that could be?

Thanks

Meghana4, Mar 14 '24 14:03