Stand-Alone-Self-Attention

Loss is NaN

Open phongnhhn92 opened this issue 4 years ago • 2 comments

Hello, I am testing your ResNet50 model with stem set to True. At the first training step my loss is NaN, and the accuracy is decreasing. Is that a bug? [image]

Also, I didn't see this problem when training the ResNet26 model.

phongnhhn92, Sep 09 '19 14:09

Thanks for your comments. I don't have enough GPUs, so I couldn't run experiments on all of the ResNet models. Maybe you can reduce learning_rate, for example to 0.01.

Thank you!

leaderj1001, Oct 01 '19 07:10

Hi, I am facing issues training the ResNet50 model on CIFAR-10. Even with an lr of 0.01, the loss suddenly becomes NaN after around 10 epochs, so I am not sure how to train the ResNet50 model. Hoping for a quick reply! Thanks.

Just as a note, ResNet38 and ResNet26 did run successfully without NaN.

ksouvik52, Jun 10 '20 16:06
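
For anyone hitting the same NaN, here is a minimal, framework-agnostic sketch of three common guards: a learning-rate warmup, gradient-norm clipping, and a check that stops at the first non-finite loss. It is not taken from this repository and has not been verified to fix the ResNet50 case. The function names, `warmup_steps=500`, and `max_norm=1.0` are assumptions chosen for illustration; only the 0.01 learning rate comes from the suggestion above.

```python
import math


def warmup_lr(step, base_lr=0.01, warmup_steps=500):
    """Linearly ramp the learning rate up to base_lr over warmup_steps.

    base_lr=0.01 follows the value suggested in the thread;
    warmup_steps=500 is an arbitrary assumption.
    """
    if warmup_steps <= 0:
        return base_lr
    return base_lr * min(1.0, (step + 1) / warmup_steps)


def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale a flat list of gradient values so their L2 norm is <= max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if not math.isfinite(total):
        # Fail loudly instead of letting NaN/inf reach the weights.
        raise FloatingPointError("non-finite gradient norm")
    scale = min(1.0, max_norm / (total + 1e-12))
    return [g * scale for g in grads]


def check_loss(loss, step):
    """Raise as soon as the loss is NaN or inf, reporting the step."""
    if not math.isfinite(loss):
        raise FloatingPointError(f"non-finite loss {loss!r} at step {step}")
    return loss
```

In a PyTorch training loop the same ideas are usually expressed with `torch.isfinite(loss)` and `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)`, called between `loss.backward()` and `optimizer.step()`. Stopping at the first bad step makes it easier to see whether the NaN starts in the loss or in the gradients.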