Validation loss becomes extremely large or NaN
Hi, while training a ResNet with DeltaSTN, I found that the response validation loss became extremely large or NaN after warmup. Do you have any suggestions for this issue? Thank you very much!
Hello, thank you for your interest in the code. Could you share which PyTorch version you are using?
Hello, thank you for your reply! I am currently using PyTorch 1.12.1.
To reproduce the results, I believe it is better to use PyTorch 1.5.1. Small differences between versions (e.g., initialization) can lead to different results. If you want to stay on version 1.12.1, you can try decreasing the learning rate.
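In case it helps, here is a rough sketch of one way to lower the learning rate on an existing optimizer and to catch a non-finite loss early. It is not taken from the DeltaSTN code: the helper names `scale_learning_rate` and `loss_is_finite` are made up for illustration, and the factor 0.1 in the usage note is an arbitrary starting point. The only thing it relies on is that PyTorch optimizers expose `param_groups`, a list of dicts that each hold an `"lr"` entry.

```python
import math


def scale_learning_rate(optimizer, factor):
    """Multiply the learning rate of every parameter group by `factor`.

    Hypothetical helper: works with any object that has a `param_groups`
    list of dicts containing "lr", as torch.optim optimizers do.
    Returns the new learning rates, one per parameter group.
    """
    for group in optimizer.param_groups:
        group["lr"] *= factor
    return [group["lr"] for group in optimizer.param_groups]


def loss_is_finite(loss_value):
    """Return False if the loss is NaN or +/-inf.

    Hypothetical helper: accepts a Python number or anything that
    float() can convert, such as a single-element tensor.
    """
    return math.isfinite(float(loss_value))
```

For example, you could call `scale_learning_rate(optimizer, 0.1)` once before training, and check `loss_is_finite(loss)` after each validation step so the run stops as soon as the loss diverges instead of continuing with NaN values.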
Thank you for your reply! I fixed it.