
MobileNetV2 training does not converge

Open · cherry-licongyi opened this issue 3 years ago · 1 comment

I trained the MobileNetV2 code from this repository on the CIFAR-10 dataset, but it does not converge. I don't know the reason; any answer would be appreciated.

Here is the training log:

==> Preparing data..
==> Building model..

Epoch: 0
 [=========================== 391/391 ============================>]  Step: 2s840ms | Tot: 29s605ms | Loss: 2.304 | Acc: 9.868% (4934/50000)                                                  
 [=========================== 100/100 ============================>]  Step: 39ms | Tot: 3s794ms | Loss: 2.304 | Acc: 10.000% (1000/10000)                                                     
Saving..

Epoch: 1
 [=========================== 391/391 ============================>]  Step: 52ms | Tot: 26s718ms | Loss: 2.304 | Acc: 9.674% (4837/50000)                                                     
 [=========================== 100/100 ============================>]  Step: 42ms | Tot: 3s526ms | Loss: 2.304 | Acc: 10.000% (1000/10000)                                                     

Epoch: 2
 [=========================== 391/391 ============================>]  Step: 63ms | Tot: 27s702ms | Loss: 2.304 | Acc: 9.836% (4918/50000)                                                     
 [=========================== 100/100 ============================>]  Step: 36ms | Tot: 3s642ms | Loss: 2.303 | Acc: 10.000% (1000/10000)                                                     

Epoch: 3
 [=========================== 391/391 ============================>]  Step: 53ms | Tot: 26s19ms | Loss: 2.304 | Acc: 10.154% (5077/50000)                                                     
 [=========================== 100/100 ============================>]  Step: 36ms | Tot: 3s463ms | Loss: 2.304 | Acc: 10.000% (1000/10000)                                                     

Epoch: 4
 [=========================== 391/391 ============================>]  Step: 61ms | Tot: 26s405ms | Loss: 2.304 | Acc: 10.148% (5074/50000)                                                    
 [=========================== 100/100 ============================>]  Step: 36ms | Tot: 3s632ms | Loss: 2.304 | Acc: 10.000% (1000/10000)                                                     

Epoch: 5
 [=========================== 391/391 ============================>]  Step: 52ms | Tot: 26s824ms | Loss: 2.304 | Acc: 9.966% (4983/50000)                                                     
 [=========================== 100/100 ============================>]  Step: 34ms | Tot: 3s266ms | Loss: 2.305 | Acc: 10.000% (1000/10000)                                                     

Epoch: 6
 [=========================== 391/391 ============================>]  Step: 64ms | Tot: 25s992ms | Loss: 2.304 | Acc: 10.260% (5130/50000)                                                    
 [=========================== 100/100 ============================>]  Step: 33ms | Tot: 3s639ms | Loss: 2.304 | Acc: 10.000% (1000/10000)                                                     

Epoch: 7
 [=========================== 391/391 ============================>]  Step: 58ms | Tot: 26s53ms | Loss: 2.304 | Acc: 9.936% (4968/50000)                                                      
 [=========================== 100/100 ============================>]  Step: 34ms | Tot: 3s475ms | Loss: 2.304 | Acc: 10.000% (1000/10000)                                                     

Epoch: 8
^CTraceback (most recent call last): ..............................]  Step: 71ms | Tot: 6s349ms | Loss: 2.305 | Acc: 10.205% (1267/12416)  

cherry-licongyi avatar Dec 17 '21 07:12 cherry-licongyi

Try a smaller learning rate?
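
A loss pinned at ~2.304 is exactly the cross-entropy of a uniform random guess over CIFAR-10's 10 classes (ln 10 ≈ 2.303), so the model never leaves its initial plateau; that symptom is consistent with a learning rate that is too high. A minimal sketch of the check and of dropping the rate by an order of magnitude (the SGD hyperparameters below mirror the repo's usual defaults, but treat the exact values as an assumption, and the `nn.Linear` is only a stand-in for the real MobileNetV2):

```python
import math

import torch
from torch import nn

# A loss stuck near 2.304 on CIFAR-10 equals -log(1/10):
# the network is effectively guessing uniformly over the 10 classes.
print(round(math.log(10), 3))  # 2.303 — matches the plateaued loss in the log

model = nn.Linear(32 * 32 * 3, 10)  # stand-in for the MobileNetV2 model
# Try a learning rate an order of magnitude below the usual 0.1 default:
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)
```

If the loss then starts moving below 2.3 within the first epoch, the learning rate was the culprit; otherwise it is worth checking the data normalization and the final classifier layer as well.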

logan-mo avatar Aug 06 '22 17:08 logan-mo