nice_pytorch
Validation Loss Statistics: min=nan, med=nan, mean=nan, max=nan
Thank you for the code!
When I ran the code on MNIST with `python train.py --dataset mnist`, I got the output `Validation Loss Statistics: min=nan, med=nan, mean=nan, max=nan` after training for several epochs.
Please help me; I have no idea what the problem is.
I am having the same issue. Any fixes for this?
The model collapses very quickly with an exploding loss. I want to figure out why.
It seems to have nothing to do with the initialization method.
I found it! The Adam parameter beta_2 should never be 0.01 :)
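For reference, a minimal sketch of what that fix looks like; the `model` here is a stand-in, not the repo's actual NICE module, and the exact learning rate is an assumption:

```python
import torch

model = torch.nn.Linear(4, 4)  # stand-in for the NICE model

# beta_2 controls the decay of Adam's second-moment estimate; the PyTorch
# default is 0.999. Setting it to 0.01 makes the estimate track only the
# last few gradients, which destabilizes the step sizes.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
```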
Still collapses after a few epochs :(
It seems to collapse to one mode after 2,000–3,000 iterations. Can anybody give a reason?
The norm of h, i.e., f(x), gets bigger and bigger as training evolves, so the torch.exp(h) term approaches infinity. I looked into the model parameters and found that their norm also increased a lot during training. One way to mitigate the issue is L1 regularization. It would also help to use F.softplus to make the logistic prior calculation numerically stable.
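A minimal sketch of both mitigations, assuming the model's forward pass returns the latent h and the log-determinant of the Jacobian; `logistic_log_prob`, `nice_loss`, and `l1_weight` are hypothetical names, not identifiers from this repo:

```python
import torch
import torch.nn.functional as F

def logistic_log_prob(h):
    # Log-density of the standard logistic prior used in NICE:
    #   log p(h) = -sum_d [log(1 + exp(h_d)) + log(1 + exp(-h_d))]
    # F.softplus(x) = log(1 + exp(x)) computed stably, unlike a naive
    # torch.log(1 + torch.exp(h)), which overflows once ||h|| is large.
    return -(F.softplus(h) + F.softplus(-h)).sum(dim=1)

def nice_loss(model, x, l1_weight=1e-5):
    h, log_det_jacobian = model(x)  # assumed forward signature
    log_likelihood = logistic_log_prob(h) + log_det_jacobian
    # L1 penalty on the parameters keeps their norm (and hence ||h||)
    # from growing without bound during training.
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    return -log_likelihood.mean() + l1_weight * l1_penalty
```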