pytorch-cifar
Can't achieve the reported accuracy on MobileNetV2
I tried to replicate the experiments on ResNet, VGG, MobileNet, and MobileNetV2.
For ResNet and VGG I can actually get better results than reported (around 2% higher).
However, for MobileNetV2 I can only get to about 90.1%, which is much lower than the reported 94.5%.
I wonder if there is something I missed during training? Or should I use different learning rates for MobileNet and MobileNetV2?
I am following everything from the README, including the learning rate decay at epochs 150 and 250, the optimizer settings, etc.
Same for me; I'm getting an accuracy of 91%.
It's the same for me. I could only get ~91% accuracy for MobileNetV2 with this code.
I think this is because there are too many downsampling layers (three of them), leaving only 4x4 feature maps before the last average pooling.
Here is a repository, https://github.com/tinyalpha/mobileNet-v2_cifar10, which uses only two downsampling layers and gets ~94.5% accuracy for MobileNetV2.
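For reference, here is a minimal sketch of how the resolution collapses, assuming this repo's MobileNetV2 cfg layout of (expansion, out_planes, num_blocks, stride) per stage; the exact values below are copied from models/mobilenetv2.py as I remember them and may differ slightly:

```python
# Sketch: how the 32x32 input shrinks through the stride-2 stages.
cfg = [(1,  16, 1, 1),
       (6,  24, 2, 1),   # stride already reduced from 2 to 1 for CIFAR-10
       (6,  32, 3, 2),
       (6,  64, 4, 2),
       (6,  96, 3, 1),
       (6, 160, 3, 2),
       (6, 320, 1, 1)]

size = 32                    # CIFAR-10 images are 32x32
for _, _, _, stride in cfg:
    size //= stride          # each stride-2 stage halves the feature map
print(size)                  # -> 4: only a 4x4 map reaches the avg pool
```

Changing one of the remaining stride-2 entries to stride 1 would leave an 8x8 map before the pooling, which is roughly what the linked repo does.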
You are right; the performance of ShuffleNet, ShuffleNetV2, and MobileNet is also not very high.
That is cool. Thanks for the information and the repo. By the way, do you know of a ShuffleNet implementation with better accuracy on CIFAR-10? If I'm not mistaken, I reached 90% on CIFAR-10 and 70% on CIFAR-100.
I think running the code here you can get around 91% with ShuffleNetG2 and 90.8% with ShuffleNetG3. Otherwise, I am not sure.
Same here; I only get MobileNet: 89.83%, MobileNetV2: 92.17%, ShuffleNetG2: 90.32%. I used learning rates of 0.1, 0.01, and 0.0001 for epochs 0-25, 25-50, and 50-100 (100 epochs total).
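In case it's useful, the schedule I describe would look roughly like this (a sketch; `optimizer` is assumed to be the repo's SGD optimizer):

```python
from torch.optim.lr_scheduler import LambdaLR

# Piecewise-constant schedule: lr 0.1 for epochs 0-24, 0.01 for 25-49,
# and 0.0001 for 50-99. LambdaLR multiplies the base lr (0.1) by the
# returned factor, so the factors are 1.0, 0.1, and 0.001.
def lr_lambda(epoch):
    if epoch < 25:
        return 1.0
    if epoch < 50:
        return 0.1
    return 0.001

scheduler = LambdaLR(optimizer, lr_lambda=lr_lambda)  # `optimizer` assumed
```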
For those who missed it, there is a closed issue on MobileNetV2 hyperparameters at #29, in which @zhaohui-yang pointed out that the weight decay should be set to 4e-5. My results confirm this. With everything else the same (SGD, 350 epochs, batch size 128, lr 0.1, steps [150, 250]; a sketch of this setup follows the results below), I got my best accuracy at:
- Epoch 282 | Loss: 0.263 | Acc: 91.740% (9174/10000) for weight decay 5e-4
- Epoch 348 | Loss: 0.316 | Acc: 94.200% (9420/10000) for weight decay 4e-5
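For anyone trying to reproduce this, here is a minimal sketch of the setup above, assuming the repo's usual `net`, `train`, and `test` from main.py:

```python
import torch.optim as optim

# SGD with momentum 0.9 (as in the repo) and weight decay 4e-5
# instead of the default 5e-4.
optimizer = optim.SGD(net.parameters(), lr=0.1,
                      momentum=0.9, weight_decay=4e-5)
# Decay the lr by 10x at epochs 150 and 250, as in the README.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer,
                                           milestones=[150, 250], gamma=0.1)

for epoch in range(350):
    train(epoch)   # repo's train/test functions, assumed
    test(epoch)
    scheduler.step()
```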
I've not been able to reproduce the 94.43% accuracy reported in the repo. I changed the weight decay to 4e-5 as suggested and ran the code several times with different random seeds, but the best I got was 93.7%. Has anyone had the same issue? @thanhmvu did you only update the weight decay, changing nothing else?
@pdejorge In case it helps, I ran these on a single GPU. I believe all other hyperparameters were the same; otherwise I would have mentioned those details, given that I listed the epochs of my best results for specificity. I don't remember if or where I kept the code/logs for these runs, so I'm sorry I can't be more helpful than that.