
Minor discrepancy between training-log reported accuracy and evaluation accuracy on ImageNet

Open · FreddieRao opened this issue · 1 comment

Thanks for releasing the code!

We used your codebase to train and evaluate several models on ImageNet-1K (without ImageNet-22K pretraining). According to the training log, the maximum accuracy is 78.58. But when we evaluate the best-performing checkpoint by resuming it in evaluation mode, the accuracy becomes 78.7. This discrepancy between the training-log accuracy and the evaluation accuracy occurs for many models.

Here is our evaluation script. Could you let us know why this happens?

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch \
                                    --nproc_per_node=1 \
                                    --master_port 12345 \
                                    main.py \
                                    --eval \
                                    --cfg configs/our_model.yaml \
                                    --batch-size 2 \
                                    --resume 'output/our_model/best_ckpt.pth' \
                                    --data-path "path/to/dataset" \
                                    --zip 

Many thanks!

FreddieRao · Apr 28 '21

Hi @FreddieRao, thanks for pointing this out. You can fix it by using SequentialSampler for evaluation: https://github.com/microsoft/Swin-Transformer/blob/b05e6214a37d33846903585c9e83b694ef411587/data/build.py#L56-L61
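A likely cause of the small mismatch (an assumption on my part, not confirmed in this thread): PyTorch's DistributedSampler pads the dataset with repeated samples so that every rank receives the same number of indices, and those duplicated samples are then double-counted when accuracy is averaged, whereas SequentialSampler visits each sample exactly once. The sketch below mimics DistributedSampler's default padding arithmetic in pure Python (the function name distributed_indices is hypothetical, not the library's API):

```python
import math

def distributed_indices(dataset_len, num_replicas):
    """Mimic DistributedSampler's default padding: the index list is
    extended with repeated entries until it divides evenly across ranks,
    then sharded round-robin."""
    num_samples = math.ceil(dataset_len / num_replicas)
    total_size = num_samples * num_replicas
    indices = list(range(dataset_len))
    indices += indices[: total_size - dataset_len]  # pad by repetition
    # rank r takes every num_replicas-th index starting at offset r
    return [indices[r:total_size:num_replicas] for r in range(num_replicas)]

shards = distributed_indices(10, 3)  # e.g. 10 images across 3 GPUs
print(shards)
print(sum(len(s) for s in shards))  # 12 indices evaluated, not 10
```

With 10 samples and 3 replicas, 12 indices are evaluated in total and samples 0 and 1 are counted twice, which slightly shifts the reported accuracy. SequentialSampler (or DistributedSampler with the later `drop_last`/uneven-input handling) avoids the duplication.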

zeliu98 · Dec 20 '21