EmoNeXt
Train loss and val loss become NaN, and val acc and train acc become 13% and do not change in subsequent epochs
Hello,
I am trying to replicate the experiment using the "tiny" model. I downloaded the dataset and trained the model following the steps provided. However, at the 178th epoch the train loss and val loss become NaN, the val acc and train acc become 13%, and nothing changes in subsequent epochs. Any ideas or guidance?
Best!
I had similar things happen sometimes (vanishing/exploding gradients). IIRC I just restarted training with the same parameters and it usually worked fine on my next attempt (which didn't make much sense to me, because I should have been using the same random seed, but ¯_(ツ)_/¯). You could also try lowering your learning rate if the problem persists.
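If restarting doesn't help, here is a rough sketch of two common guards against exploding gradients: clip the gradient norm, and skip the update whenever the loss or a gradient is non-finite. Once a single NaN reaches the weights, every later forward pass also produces NaN, which would explain why the metrics stay stuck. This is plain Python on lists, not the repo's actual training loop. The names `clip_by_global_norm` and `safe_sgd_step` and the default values for `lr` and `max_norm` are made up for illustration. In PyTorch the corresponding calls would be `torch.nn.utils.clip_grad_norm_` and `torch.isfinite`.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their global L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm


def safe_sgd_step(params, grads, loss, lr=1e-4, max_norm=1.0):
    """Apply one plain SGD step, or skip it if anything is non-finite.

    lr and max_norm are illustrative defaults (assumptions), not values
    taken from the repo. Returns (params, step_was_applied).
    """
    if not math.isfinite(loss) or not all(math.isfinite(g) for g in grads):
        # Skipping keeps NaN/inf out of the weights.
        return params, False
    clipped, _ = clip_by_global_norm(grads, max_norm)
    new_params = [p - lr * g for p, g in zip(params, clipped)]
    return new_params, True


# Usage: a NaN gradient leaves the weights untouched.
weights, applied = safe_sgd_step([1.0, 2.0], [float("nan"), 0.0], loss=0.5)
print(applied)
```

Skipping bad batches only hides the symptom, so if it triggers often, a lower learning rate is probably still the real fix.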