Robust-Vision-Transformer

A question about "Loss is nan"

Open yafangna opened this issue 1 year ago • 1 comment

Is there anything in the source code that could cause the loss to become NaN in some runs, while training works correctly in others?

yafangna avatar Oct 11 '22 07:10 yafangna

"Loss is nan" is a known problem for the original ViT models when AMP (automatic mixed precision) is turned on. Check here for more details.
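One commonly cited reason for NaN under mixed precision is the narrow range of half precision: values above 65504 overflow to infinity, and `inf - inf` is NaN. The sketch below only illustrates that mechanism with the Python standard library; the helper `to_fp16` and the logit value are hypothetical, and nothing in this thread confirms that this is the exact failure in this repository.

```python
import math
import struct

# Largest finite value representable in IEEE 754 half precision (fp16).
FP16_MAX = 65504.0


def to_fp16(x: float) -> float:
    """Round a Python float to half precision, saturating to +/-inf on overflow.

    Hypothetical helper for illustration; it mimics what a cast to fp16 does.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the fp16 range; hardware casts
        # would produce infinity instead, so we do the same here.
        return math.copysign(math.inf, x)


# An assumed activation that is harmless in fp32 but too large for fp16.
logit = to_fp16(70000.0)

# A numerically stable softmax subtracts the row maximum from each entry.
# If that maximum is inf, the subtraction yields NaN, which then propagates
# through the rest of the forward pass and into the loss.
shifted = logit - logit

print(logit, shifted)
```

Because such an overflow depends on how large the activations happen to get, it may appear in some runs and not in others.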

Since then, many techniques have been proposed to solve this problem, e.g., LayerScale. You can refer to these techniques to keep the training loss from becoming NaN.
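As a rough illustration of LayerScale, here is a minimal PyTorch sketch: each residual branch output is multiplied by a learnable per-channel vector initialized to a small value, so every block starts close to the identity. The class names `LayerScale` and `Block`, the MLP-only branch, and the `init_value` of `1e-5` are assumptions made for this example, not code from this repository.

```python
import torch
from torch import nn


class LayerScale(nn.Module):
    """Learnable per-channel scale applied to a residual branch's output."""

    def __init__(self, dim: int, init_value: float = 1e-5):
        super().__init__()
        # One scale per channel, initialized small (assumed value).
        self.gamma = nn.Parameter(init_value * torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gamma


class Block(nn.Module):
    """Simplified pre-norm residual block with only an MLP branch."""

    def __init__(self, dim: int, init_value: float = 1e-5):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )
        self.ls = LayerScale(dim, init_value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The branch output is scaled down before it is added to the residual.
        return x + self.ls(self.mlp(self.norm(x)))


x = torch.randn(2, 5, 16)  # (batch, tokens, channels)
block = Block(dim=16)
out = block(x)
```

In a full transformer block the same scaling would typically be applied to the attention branch as well.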

vtddggg avatar Oct 12 '22 01:10 vtddggg