mx-lsoftmax
nan value
I reimplemented it in TensorFlow, but I find it hard to train. It produces NaN values very easily. How can I avoid this?
The output loss value:
step 3600, training loss 1.00463
step 3700, training loss 0.970242
step 3800, training loss 0.906492
step 3900, training loss 0.0988686
step 4000, training loss 0.00080747
step 4100, training loss 0.000444604
step 4200, training loss 0.000204534
step 4300, training loss 0.000227041
step 4400, training loss 0.000149651
step 4500, training loss 0.000162705
step 4600, training loss 9.66944e-05
step 4700, training loss 8.69876e-05
step 4800, training loss 6.04607e-05
step 4900, training loss 8.24705e-05
step 5000, training loss 6.02255e-05
step 5100, training loss 4.36621e-05
step 5200, training loss 3.89259e-05
.....
step 18500, training loss 3.72529e-09
step 18600, training loss 7.45058e-09
step 18700, training loss 2.79397e-09
step 18800, training loss 5.58794e-09
step 18900, training loss 3.72529e-09
step 19000, training loss 9.31323e-10
step 19100, training loss 1.86265e-09
step 19200, training loss 2.79397e-09
step 19300, training loss 1.86265e-09
step 19400, training loss 4.65661e-09
step 19500, training loss 2.79397e-09
step 19600, training loss 5.58794e-09
step 19700, training loss 9.31323e-10
step 19800, training loss 1.86265e-09
step 19900, training loss 9.31323e-10
step 20000, training loss -0
What does -0 mean as a loss value?
I have no experience with TensorFlow. Did you train the model on MNIST or on another dataset? It's weird to see the loss value that low. How is the test loss? As for -0, sorry, I have no idea. Maybe it's related to the implementation of SoftmaxLoss in TensorFlow.
For training with LSoftmax, you may refer to the advice given by the author here.
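One plausible reading of the -0, offered as a sketch rather than a confirmed diagnosis: if the softmax probability of the true class saturates to exactly 1 in floating point, the cross-entropy -log(p) evaluates to negative zero, which is numerically equal to zero. The snippet assumes the log line is printed with a `%g`-style format, which the thread does not confirm.

```python
import numpy as np

# Cross-entropy for one sample is -log(p), where p is the softmax
# probability assigned to the true class.
p = np.float32(1.0)      # assumption: p has saturated to exactly 1.0
loss = -np.log(p)        # log(1) is 0.0, and negating 0.0 gives -0.0

# Negative zero compares equal to zero but carries a set sign bit.
is_negative_zero = bool(loss == 0.0) and bool(np.signbit(loss))

# "%g" drops the trailing ".0", so -0.0 is shown as "-0".
formatted = "%g" % float(loss)
print(formatted)
```

If this is what happened, the -0 itself is harmless. The more important signal is that the training loss has collapsed to zero, so checking the test loss is still the right next step.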
Thanks! I know nothing about MXNet, but I want to test the MXNet code. When I run the program, I get the following error: AttributeError: module 'mxnet.symbol' has no attribute 'LSoftmax'. How can I solve it? I only installed the MXNet Python package via pip. Should I install the C++ version?
I found it really hard to get the model to converge. I compared my cos_m_t output and the other parameters with yours, and we get the same values, but my model still does not converge. When I set m to 1, which is the same as the original softmax, it converges. When I set m to 2, it is really hard to get it to converge.
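For reference, here is a minimal NumPy sketch of the target-class logit with the blending weight described in the L-Softmax paper, where a large weight makes the logit behave like plain softmax and the weight is reduced during training. It is not the code from this repository or from the TensorFlow port. The function name `lsoftmax_target_logit`, the parameter name `lam`, and the `eps` guard are assumptions made for illustration. The `eps` term and the clipping before `arccos` are there because a zero norm or a cosine slightly outside [-1, 1] is a common source of NaN.

```python
import numpy as np

def lsoftmax_target_logit(x, w, m, lam, eps=1e-12):
    """Blended target-class logit for a single sample (illustrative only).

    x   : feature vector
    w   : weight vector of the true class
    m   : integer margin (m = 1 reduces to the plain inner product)
    lam : blending weight; large values approach plain softmax
    """
    x_norm = np.linalg.norm(x)
    w_norm = np.linalg.norm(w)
    # eps avoids division by zero; clip keeps arccos inside its domain.
    cos_t = np.clip(np.dot(w, x) / (w_norm * x_norm + eps), -1.0, 1.0)
    theta = np.arccos(cos_t)
    # k indexes the interval [k*pi/m, (k+1)*pi/m] that contains theta.
    k = min(int(np.floor(theta * m / np.pi)), m - 1)
    # psi(theta) = (-1)^k * cos(m*theta) - 2k, decreasing on [0, pi].
    psi = (-1.0) ** k * np.cos(m * theta) - 2.0 * k
    plain = w_norm * x_norm * cos_t
    margin = w_norm * x_norm * psi
    return (lam * plain + margin) / (1.0 + lam)

w = np.array([1.0, 0.0])
same = lsoftmax_target_logit(np.array([1.0, 0.0]), w, m=2, lam=0.0)
orthogonal = lsoftmax_target_logit(np.array([0.0, 1.0]), w, m=2, lam=0.0)
opposite = lsoftmax_target_logit(np.array([-1.0, 0.0]), w, m=2, lam=0.0)
print(same, orthogonal, opposite)
```

With `m = 2` and `lam = 0` the full margin applies from the first step, which may be one reason training stalls. Starting with a large `lam` and lowering it gradually lets the model begin close to plain softmax. Whether that matches the advice linked earlier in the thread is not something this sketch can confirm.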
@auroua Did you find a solution? I also have this problem.
@luoyetx @auroua @DL-Chang I also have this problem. How can it be solved?