pytorch-vgg-cifar10
I ran your code and always get a nan loss. Can you help me?
I just ran ./run.sh and the loss became nan shortly after the first epoch. Here is the printed log:
(base) root@For-Judy-And-Ian:~/pytorchProjects/pytorch-vgg-cifar10-master# ./run.sh
python main.py --arch=vgg11 --save-dir=save_vgg11 |& tee -a log_vgg11
Files already downloaded and verified
Epoch: [0 ][ 0 /391] Time 0.831 (0.831) Data 0.190 (0.190) Loss 2.3037 (2.3037) Prec@1 10.938 (10.938)
Epoch: [0 ][20 /391] Time 0.018 (0.052) Data 0.000 (0.009) Loss 2.2982 (2.3029) Prec@1 9.375 (9.487)
Epoch: [0 ][40 /391] Time 0.012 (0.035) Data 0.000 (0.005) Loss 2.2928 (2.3018) Prec@1 12.500 (9.546)
Epoch: [0 ][60 /391] Time 0.012 (0.030) Data 0.000 (0.003) Loss 2.2685 (2.2970) Prec@1 15.625 (10.720)
Epoch: [0 ][80 /391] Time 0.012 (0.027) Data 0.000 (0.002) Loss 2.1417 (2.2787) Prec@1 21.875 (11.960)
Epoch: [0 ][100/391] Time 0.016 (0.026) Data 0.000 (0.002) Loss 2.1417 (2.2518) Prec@1 22.656 (13.134)
Epoch: [0 ][120/391] Time 0.029 (0.024) Data 0.000 (0.002) Loss 1.9975 (2.2189) Prec@1 21.094 (14.463)
Epoch: [0 ][140/391] Time 0.028 (0.024) Data 0.000 (0.002) Loss 2.0889 (2.1959) Prec@1 26.562 (15.459)
Epoch: [0 ][160/391] Time 0.018 (0.023) Data 0.000 (0.001) Loss 2.0179 (2.1856) Prec@1 21.875 (16.193)
Epoch: [0 ][180/391] Time 0.012 (0.023) Data 0.000 (0.001) Loss 1.9825 (2.1645) Prec@1 25.000 (16.894)
Epoch: [0 ][200/391] Time 0.012 (0.022) Data 0.000 (0.001) Loss 1.8724 (2.1434) Prec@1 27.344 (17.623)
Epoch: [0 ][220/391] Time 0.012 (0.022) Data 0.000 (0.001) Loss 2.0147 (2.1258) Prec@1 25.000 (18.121)
Epoch: [0 ][240/391] Time 0.012 (0.021) Data 0.000 (0.001) Loss 1.8679 (2.1128) Prec@1 22.656 (18.458)
Epoch: [0 ][260/391] Time 0.016 (0.021) Data 0.000 (0.001) Loss 1.8262 (2.0923) Prec@1 28.125 (19.202)
Epoch: [0 ][280/391] Time 0.012 (0.021) Data 0.000 (0.001) Loss 1.7779 (2.0737) Prec@1 31.250 (19.834)
Epoch: [0 ][300/391] Time 0.011 (0.020) Data 0.000 (0.001) Loss 1.7415 (2.0569) Prec@1 38.281 (20.359)
Epoch: [0 ][320/391] Time 0.012 (0.020) Data 0.000 (0.001) Loss 1.7895 (2.0431) Prec@1 26.562 (20.863)
Epoch: [0 ][340/391] Time 0.012 (0.020) Data 0.000 (0.001) Loss 1.7198 (2.0292) Prec@1 31.250 (21.355)
Epoch: [0 ][360/391] Time 0.012 (0.019) Data 0.000 (0.001) Loss 1.9042 (2.0171) Prec@1 27.344 (21.827)
Epoch: [0 ][380/391] Time 0.012 (0.019) Data 0.000 (0.001) Loss 2.6430 (2.0338) Prec@1 12.500 (21.900)
Test[0/79] Time 0.136 (0.136) Loss 2.3228 (2.3228) Prec@1 10.938 (10.938)
Test[20/79] Time 0.004 (0.013) Loss 2.3267 (2.3337) Prec@1 7.812 (8.891)
Test[40/79] Time 0.013 (0.009) Loss 2.3235 (2.3322) Prec@1 10.156 (8.670)
Test[60/79] Time 0.011 (0.009) Loss 2.3311 (2.3303) Prec@1 10.156 (8.799)
* Prec@1 8.810
Epoch: [1 ][ 0 /391] Time 0.099 (0.099) Data 0.085 (0.085) Loss 2.3538 (2.3538) Prec@1 8.594 (8.594)
Epoch: [1 ][20 /391] Time 0.028 (0.021) Data 0.000 (0.005) Loss nan (nan) Prec@1 1.562 (8.036)
Epoch: [1 ][40 /391] Time 0.018 (0.019) Data 0.000 (0.003) Loss nan (nan) Prec@1 1.562 (5.011)
Epoch: [1 ][60 /391] Time 0.012 (0.017) Data 0.000 (0.002) Loss nan (nan) Prec@1 1.562 (3.893)
Epoch: [1 ][80 /391] Time 0.012 (0.016) Data 0.000 (0.002) Loss nan (nan) Prec@1 2.344 (3.279)
Epoch: [1 ][100/391] Time 0.013 (0.016) Data 0.002 (0.001) Loss nan (nan) Prec@1 2.344 (2.908)
Epoch: [1 ][120/391] Time 0.017 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 3.906 (2.686)
Epoch: [1 ][140/391] Time 0.012 (0.016) Data 0.000 (0.001) Loss nan (nan) Prec@1 2.344 (2.549)
Epoch: [1 ][160/391] Time 0.012 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 2.344 (2.451)
Epoch: [1 ][180/391] Time 0.017 (0.016) Data 0.000 (0.001) Loss nan (nan) Prec@1 3.906 (2.348)
Epoch: [1 ][200/391] Time 0.012 (0.016) Data 0.000 (0.001) Loss nan (nan) Prec@1 2.344 (2.320)
Epoch: [1 ][220/391] Time 0.011 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 0.781 (2.238)
Epoch: [1 ][240/391] Time 0.012 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 0.781 (2.217)
Epoch: [1 ][260/391] Time 0.013 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 1.562 (2.176)
Epoch: [1 ][280/391] Time 0.012 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 2.344 (2.149)
Epoch: [1 ][300/391] Time 0.016 (0.015) Data 0.000 (0.001) Loss nan (nan) Prec@1 1.562 (2.108)
Epoch: [1 ][320/391] Time 0.018 (0.015) Data 0.007 (0.001) Loss nan (nan) Prec@1 0.781 (2.078)
Epoch: [1 ][340/391] Time 0.017 (0.015) Data 0.006 (0.001) Loss nan (nan) Prec@1 3.125 (2.067)
Epoch: [1 ][360/391] Time 0.018 (0.015) Data 0.006 (0.001) Loss nan (nan) Prec@1 3.906 (2.052)
Epoch: [1 ][380/391] Time 0.012 (0.016) Data 0.000 (0.001) Loss nan (nan) Prec@1 0.781 (2.010)
Test[0/79] Time 0.094 (0.094) Loss nan (nan) Prec@1 0.000 (0.000)
Test[20/79] Time 0.009 (0.014) Loss nan (nan) Prec@1 0.000 (0.335)
Test[40/79] Time 0.015 (0.015) Loss nan (nan) Prec@1 0.000 (0.419)
Test[60/79] Time 0.015 (0.015) Loss nan (nan) Prec@1 0.000 (0.538)
* Prec@1 0.540
Change your learning rate to 0.001.
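To see why a smaller learning rate can help here, the following is a toy sketch only, not the repository's training loop. It runs plain gradient descent on a made-up loss f(w) = w^4 and reports the first step at which the loss stops being finite. The function name `first_non_finite_step`, the toy loss, the starting point `w0` and the comparison rate 0.5 are all assumptions chosen for illustration. Only 0.001 comes from the suggestion above.

```python
import math


def first_non_finite_step(lr, w0=2.0, steps=50):
    """Run plain gradient descent on the toy loss f(w) = w**4.

    Returns the 1-based step at which the loss stops being finite
    (inf or nan), or None if it stays finite for all `steps`.
    This is a hypothetical helper, not part of the repository.
    """
    w = w0
    for step in range(1, steps + 1):
        # Repeated multiplication overflows to inf instead of raising
        # OverflowError, which mimics how float tensors behave.
        grad = 4.0 * w * w * w      # derivative of w**4
        w = w - lr * grad           # gradient descent update
        loss = w * w * w * w
        if not math.isfinite(loss):
            return step
    return None


if __name__ == "__main__":
    # 0.5 is an arbitrary "too large" rate; 0.001 is the suggested value.
    for lr in (0.5, 0.001):
        step = first_non_finite_step(lr)
        print(f"lr={lr}: first non-finite step = {step}")
```

With the large step size each update overshoots further than the last until the value overflows. With 0.001 the iterate shrinks steadily and the loss stays finite for all steps. A real network may need other remedies as well, so treat this as intuition, not a diagnosis of the log above.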