
Potential Issue with EncoderDecoder Model with CrossEntropy Loss

Open · gkeskin07 opened this issue 2 years ago · 2 comments

Hi,

I see that the LSTM Attention Decoder applies log_softmax to the step outputs inside the model. However, the cross-entropy criterion uses nn.CrossEntropyLoss, which applies another log_softmax internally. Shouldn't nn.NLLLoss be used instead of nn.CrossEntropyLoss? This would cause issues in models such as ConformerLSTMModel.
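
For concreteness, here is a minimal sketch of the mismatch I mean, with toy shapes and names that are illustrative only (none of them come from openspeech):

```python
import torch
import torch.nn.functional as F

# Toy shapes for illustration only: batch of 4 decoder steps, vocab of 10.
logits = torch.randn(4, 10)            # raw decoder step outputs
targets = torch.randint(0, 10, (4,))   # target token ids

# What the decoder does: return log-probabilities, not raw logits.
log_probs = F.log_softmax(logits, dim=-1)

# nn.CrossEntropyLoss applies log_softmax internally, so passing
# log_probs means log_softmax is applied twice.
ce = torch.nn.CrossEntropyLoss()(log_probs, targets)

# nn.NLLLoss expects log-probabilities directly, which seems like
# the matching criterion here.
nll = torch.nn.NLLLoss()(log_probs, targets)
print(ce.item(), nll.item())
```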

gkeskin07 avatar Jun 23 '22 23:06 gkeskin07

As far as I know, nn.CrossEntropyLoss checks whether log_softmax has already been applied and decides whether to apply it or not.

sooftware avatar Aug 17 '22 16:08 sooftware

Also, log_softmax is idempotent: its outputs are already normalized log-probabilities (their exponentials sum to 1), so log_softmax(log_softmax(x)) = log_softmax(x) and the result is the same either way.
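
A quick check of both claims, as a minimal sketch with toy tensors (not openspeech code):

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, 10)
once = F.log_softmax(x, dim=-1)
twice = F.log_softmax(once, dim=-1)

# log_softmax outputs are already normalized (their exponentials sum
# to 1, so the logsumexp term is 0), making a second application a no-op.
print(torch.allclose(once, twice))  # True

# Consequently CrossEntropyLoss (= log_softmax + NLLLoss) on log-probs
# gives the same loss as NLLLoss on the same log-probs.
targets = torch.randint(0, 10, (4,))
ce = torch.nn.CrossEntropyLoss()(once, targets)
nll = torch.nn.NLLLoss()(once, targets)
print(torch.allclose(ce, nll))  # True
```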

upskyy avatar Aug 17 '22 17:08 upskyy