
Cannot Reproduce Swin Small Results ( Only achieves 82.9 top 1)

achen46 opened this issue 3 years ago · 4 comments

Hi @zeliu98

Thanks for this great work. I am trying to reproduce the results reported in the paper for the Swin Small architecture using exactly the hyper-parameters published in the config files. Specifically, I am using 8 V100 GPUs (which I believe matches the setup used in the paper) and running the following command:

python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345  main.py \
--cfg configs/swin_small_patch4_window7_224.yaml --data-path <imagenet-path> --batch-size 128 
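A quick sanity check of the effective (global) batch size implied by this command, since --batch-size here is per GPU and torch.distributed.launch spawns one process per GPU (values taken from the command above; the config file itself may override them):

```python
# Effective batch size = per-GPU batch size x number of GPU processes.
gpus = 8            # --nproc_per_node 8
per_gpu_batch = 128 # --batch-size 128
total_batch = gpus * per_gpu_batch
print(total_batch)  # 1024
```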

But the best top-1 accuracy I can get is 82.976, which falls short of the reported 83.2. I have also attached the log of the best training run in case it is useful: log_rank0.txt

How can we achieve the reported accuracy of 83.2?

I would really appreciate a response, as I have spent a lot of time trying to reproduce your results and cannot manage it by any means.

achen46 avatar Mar 10 '22 00:03 achen46

For the record, I am not one of the official authors of Swin-Transformer. In my personal experience, most deep learning models are hard to reproduce exactly as their papers report. One reason is the random seed, which influences many aspects of training, for example parameter initialization and data ordering. And it can be very difficult to reproduce the same results even with a fixed seed, because the framework (e.g., PyTorch) can introduce nondeterminism wherever parallelism is employed. For more background on reproducibility, you may want to search around. For your problem, my advice is to fix the random seed to the same value the authors used in their experiments.
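As a minimal sketch of what "fix the random seed" usually means in a PyTorch training script (illustrative only; the repo's actual seeding logic may differ, and the PyTorch-specific calls are shown as comments so the snippet runs without torch installed):

```python
import random
import numpy as np

def seed_everything(seed: int = 0) -> None:
    """Seed the Python and NumPy RNGs (and, in a real script, PyTorch's)."""
    random.seed(seed)
    np.random.seed(seed)
    # In a PyTorch script one would also call:
    #   torch.manual_seed(seed)
    #   torch.cuda.manual_seed_all(seed)
    # and, for stricter determinism at a speed cost:
    #   torch.backends.cudnn.deterministic = True
    #   torch.backends.cudnn.benchmark = False

# Same seed -> same random draws.
seed_everything(42)
a = np.random.rand(3)
seed_everything(42)
b = np.random.rand(3)
print(np.allclose(a, b))  # True
```

Even with all of this, some CUDA kernels remain nondeterministic, which is why bit-exact reproduction across runs or hardware is not guaranteed.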

pengzhangzhi avatar Mar 10 '22 08:03 pengzhangzhi

In this setup, we are using the exact same seeds as set by the authors here. So why can't we get anywhere close to the 83.20 claimed in the paper? We are also using the same hardware setup.

On ImageNet, even a small percentage matters; that is how a new SOTA gets claimed. I hope @zeliu98 and the other authors can look into this. I am strictly following their config and recommended seed, so reproducible results should be possible.

achen46 avatar Mar 10 '22 18:03 achen46

Adding others for visibility @ancientmooner @caoyue10

achen46 avatar Mar 10 '22 19:03 achen46

Which subset did you evaluate on, the test set or the val set? I cannot find any ground truth for the test set.
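Worth noting: the ImageNet test-set labels are not publicly released, so published numbers (including the 83.2 discussed above) are conventionally top-1 accuracy on the 50k-image validation set. A minimal top-1 computation over toy logits (the values below are purely hypothetical):

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose argmax class matches the label."""
    preds = logits.argmax(axis=1)
    return float((preds == labels).mean())

# Toy example: 3 samples, 2 classes; 2 of 3 predictions are correct.
logits = np.array([[0.1, 0.9],
                   [0.8, 0.2],
                   [0.3, 0.7]])
labels = np.array([1, 0, 0])
print(top1_accuracy(logits, labels))  # ~0.667
```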

inkzk avatar Mar 24 '22 09:03 inkzk