pytorch-sgns
Confused by the loss function.
In your code, you minimize -(oloss + nloss).mean(),
which means (oloss + nloss) should become large. So "oloss becomes large and nloss becomes small" is expected.
Although -(oloss + nloss) decreases, I see oloss become small and nloss become large. Why is that?
Thank you for the feedback. Can you provide a reduced, reproducible example, such as a small dataset and a configuration for it?
When computing nloss, the author uses the function .neg to make the nloss smaller when training.
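
To make the sign conventions in this thread concrete, here is a minimal sketch of a skip-gram negative-sampling objective. It assumes the standard formulation, where the negative-sample score is negated before the log-sigmoid; it is plain Python rather than the repository's code, and the names log_sigmoid, sgns_terms, and sgns_loss as well as the example scores are hypothetical.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)); the result is always <= 0.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sgns_terms(pos_score, neg_scores):
    # pos_score: dot product of the input vector with a true context vector.
    # neg_scores: dot products of the input vector with sampled negative vectors.
    oloss = log_sigmoid(pos_score)
    # Negating the score plays the role of .neg on the negative vectors (assumption).
    nloss = sum(log_sigmoid(-s) for s in neg_scores)
    return oloss, nloss


def sgns_loss(pos_score, neg_scores):
    oloss, nloss = sgns_terms(pos_score, neg_scores)
    # Minimizing this value maximizes oloss + nloss.
    return -(oloss + nloss)


# Untrained vectors (all scores 0) versus well-separated vectors.
early = sgns_loss(0.0, [0.0, 0.0])
late = sgns_loss(3.0, [-3.0, -3.0])
```

Under this assumed formulation both oloss and nloss are log-probabilities, so each is at most 0 and the loss falls as their sum rises toward 0. If one term drops while the total loss still decreases, the other term must have risen by more than the first one fell.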