ade-czsl
loss is negative
Hello, thanks for your good work. I have a question and look forward to your reply. I ran your code on the C-GQA dataset. Why did the loss become negative as training progressed?