
FusedDenseGeluDense output NAN

Open gongjingcs opened this issue 3 years ago • 1 comments

apex FusedDenseGeluDense output NaN

After some training iterations, the output is all NaN. We checked that the input does not contain NaN.

gongjingcs avatar May 31 '22 08:05 gongjingcs

CC @seryilmaz

ptrblck avatar Aug 03 '22 07:08 ptrblck

Any fixes on this? I am seeing the same behavior. When using fp16, I noticed that the biases of the FusedDenseGeluDense layers were initialized to inf, which I think is what causes the NaNs. Reinitializing the weights and biases seems to have fixed it so far.

apoorv2904 avatar Mar 15 '23 00:03 apoorv2904
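
The check-and-reinitialize workaround from the last comment could be sketched roughly as follows. This is only an illustration in plain PyTorch, not code from apex or from the commenters: the helper names `find_nonfinite_params` and `reinit_nonfinite_params` are hypothetical, the reinitialization scheme (normal weights with `std=0.02`, zero biases) is an assumption, and the helpers are meant to be called on whatever module wraps the FusedDenseGeluDense layers.

```python
import torch
from torch import nn


def find_nonfinite_params(module: nn.Module) -> list:
    """Return the names of parameters that contain inf or NaN values."""
    return [
        name
        for name, param in module.named_parameters()
        if not torch.isfinite(param).all()
    ]


@torch.no_grad()
def reinit_nonfinite_params(module: nn.Module, std: float = 0.02) -> list:
    """Overwrite non-finite parameters in place and return their names.

    Assumed scheme (not taken from apex): tensors with more than one
    dimension are treated as weights and drawn from N(0, std^2);
    one-dimensional tensors are treated as biases and set to zero.
    """
    reset = []
    for name, param in module.named_parameters():
        if torch.isfinite(param).all():
            continue  # parameter is healthy, leave it untouched
        if param.dim() > 1:
            nn.init.normal_(param, mean=0.0, std=std)
        else:
            nn.init.zeros_(param)
        reset.append(name)
    return reset
```

Calling `find_nonfinite_params(model)` right after constructing the model, before any training step, would show whether the inf values come from initialization rather than from training. If it returns a non-empty list, `reinit_nonfinite_params(model)` resets only the affected tensors.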