Can anyone train ResNet50 successfully without NaN?

Open ksouvik52 opened this issue 4 years ago • 3 comments

Hi, I am having trouble training the ResNet50 model on CIFAR-10. Even with a learning rate of 0.01, it suddenly starts producing NaN after around 10 epochs, so I am not sure how to train the ResNet50 model. Hoping for a quick reply! Thanks.

ksouvik52 avatar Jun 10 '20 16:06 ksouvik52

I am also having the same issue. Did you solve it yet?

sammens avatar Jul 08 '20 08:07 sammens

Add BN (batch normalization) to the generated Q, K, and V.

theFoxofSky avatar Feb 18 '21 16:02 theFoxofSky
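
One possible reading of the suggestion above, as a minimal PyTorch sketch: each 1x1 convolution that produces Q, K, or V is followed by a `BatchNorm2d`, presumably to keep the scale of the projections (and hence the attention logits) under control. The class names `QKVWithBN` and `SimpleSelfAttention2d` are hypothetical, the attention here is global rather than the local-window attention used in this repository, and the `1/sqrt(C)` scaling is an extra assumption, so treat it as an illustration and not as the repository's actual layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QKVWithBN(nn.Module):
    """Hypothetical helper: 1x1 conv projections for Q, K, V, each followed by BN."""

    def __init__(self, in_channels, out_channels):
        super().__init__()

        def proj():
            # BN normalizes each projection per channel over the batch and spatial dims.
            return nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
                nn.BatchNorm2d(out_channels),
            )

        self.query, self.key, self.value = proj(), proj(), proj()

    def forward(self, x):
        return self.query(x), self.key(x), self.value(x)


class SimpleSelfAttention2d(nn.Module):
    """Illustrative global self-attention over spatial positions (not local-window)."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.qkv = QKVWithBN(in_channels, out_channels)
        # Assumption: standard 1/sqrt(C) scaling of the dot products.
        self.scale = out_channels ** -0.5

    def forward(self, x):
        b, _, h, w = x.shape
        q, k, v = self.qkv(x)
        # Flatten spatial dims: (B, C, H, W) -> (B, H*W, C)
        q = q.flatten(2).transpose(1, 2)
        k = k.flatten(2).transpose(1, 2)
        v = v.flatten(2).transpose(1, 2)
        # Attention weights over all positions: (B, H*W, H*W)
        attn = F.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)
        out = attn @ v  # (B, H*W, C)
        return out.transpose(1, 2).reshape(b, -1, h, w)
```

A quick check such as `SimpleSelfAttention2d(8, 16)(torch.randn(2, 8, 4, 4))` should return a tensor of shape `(2, 16, 4, 4)`. Whether this alone removes the NaNs reported in this thread is not confirmed here.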

Can you elaborate, @theFoxofSky?

danielmimimi avatar Mar 22 '24 15:03 danielmimimi