loss_function_search
Numerical instability and weirdness of the softmax function.
Thanks for your outstanding work.
After reading your paper, I carefully analyzed your code and found that you use the PyTorch API function prob = F.softmax(pred, dim=1).
In my experience, softmax can sometimes be numerically unstable (overflow or underflow in the exponentials) or degenerate (all outputs identical, or otherwise implausible values). So my question is: how did you handle this instability?
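To make the concern concrete, here is a minimal pure-Python sketch of the failure mode I mean and the usual max-subtraction workaround. The function names (naive_softmax, stable_softmax, stable_log_softmax) are hypothetical and written only for illustration; this is not code from this repository, and I am not claiming it reflects how F.softmax is implemented internally.

```python
import math


def naive_softmax(logits):
    # Direct definition: exp(x_i) / sum_j exp(x_j).
    # math.exp raises OverflowError once x is large (around 710 for doubles).
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stable_softmax(logits):
    # Subtracting the maximum leaves the result mathematically unchanged,
    # but keeps every exponent <= 0, so exp() cannot overflow.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stable_log_softmax(logits):
    # Log-sum-exp form: log softmax(x)_i = x_i - (m + log sum_j exp(x_j - m)).
    # This avoids taking log(0) when a probability underflows to zero.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


if __name__ == "__main__":
    big = [1000.0, 1001.0, 1002.0]
    try:
        naive_softmax(big)
    except OverflowError as err:
        print("naive softmax failed:", err)
    print("stable softmax:", stable_softmax(big))

    # Underflow case: the second probability becomes exactly 0.0,
    # so log(prob) would fail, while the log-softmax form stays finite.
    spread = [0.0, -2000.0]
    print("stable softmax:", stable_softmax(spread))
    print("stable log-softmax:", stable_log_softmax(spread))
```

Even with the max-subtraction trick, a probability can still underflow to exactly zero, so a later log(prob) can produce an infinite or undefined value. That is why I am asking whether the loss is computed from prob directly or from a log-softmax-style formulation.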