Self-Tuning-Networks Validation loss becomes extremely large or NaN

Hi, while training ResNet with DeltaSTN, I found that the response validation loss became extremely large or NaN after warmup. Do you have any suggestions for this issue? Thank you very much!

Nov 24 '22 11:11 kevinchan04

Hello, thank you for your interest in the code. Could you share which PyTorch version you are using?

Nov 24 '22 15:11 pomonam

Hello, thank you for your reply! I am currently using PyTorch 1.12.1.

Nov 25 '22 00:11 kevinchan04

To reproduce the results, I believe it is better to use PyTorch 1.5.1. Small differences between versions (e.g., initialization) might give you different results. If you want to stay on version 1.12.1, you can try decreasing the learning rate.
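
As a rough illustration of the learning-rate suggestion, here is a minimal, framework-agnostic sketch. It is not code from the repository: the helper names `scale_learning_rates` and `loss_is_finite`, the factor 0.5, and the example learning rates are all assumptions made up for this example. The list of dicts mimics the shape of `optimizer.param_groups` in PyTorch, so the same helper could be applied to a real optimizer.

```python
import math


def scale_learning_rates(param_groups, factor):
    """Multiply the "lr" entry of every parameter group by `factor`, in place.

    `param_groups` is a list of dicts with an "lr" key, the same shape as
    `optimizer.param_groups` in PyTorch. Returns the new learning rates.
    """
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor must be in (0, 1] to decrease the learning rate")
    for group in param_groups:
        group["lr"] *= factor
    return [group["lr"] for group in param_groups]


def loss_is_finite(loss_value):
    """Return True when the loss is neither NaN nor +/- infinity."""
    return math.isfinite(float(loss_value))


# Hypothetical stand-in for optimizer.param_groups; the values are illustrative.
param_groups = [{"lr": 0.1}, {"lr": 0.003}]

# Halve every learning rate (0.5 is an arbitrary example factor).
new_lrs = scale_learning_rates(param_groups, 0.5)

# Guard that could be called on the validation loss after each evaluation.
for value in (1.25, float("nan"), float("inf")):
    if not loss_is_finite(value):
        print(f"non-finite validation loss detected: {value}")
```

With a real PyTorch optimizer, one would pass `optimizer.param_groups` to the first helper and `loss.item()` to the second. Checking the loss right after warmup makes it easier to tell whether a smaller learning rate actually removed the divergence.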

Nov 25 '22 00:11 pomonam

Thank you for your reply! I fixed it.

Jan 11 '23 03:01 kevinchan04