I-BERT
Fix NaN values in IntLayerNorm
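Add an epsilon to std_int in the forward pass of IntLayerNorm. When std_int is 0, dividing by it produces infinity, and multiplying that infinity by 0 yields NaN; the epsilon keeps the divisor nonzero and so prevents the NaN values.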
Alternatively, the same result can be achieved by calling torch.nan_to_num(y_int) on the output.
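For reviewers, here is a minimal sketch of the failure mode and both remedies. It is not the actual IntLayerNorm code: the helper `normalize_int`, the `2**31` scaling constant, and the epsilon value `1e-5` are assumptions chosen only to illustrate how a zero `std_int` turns into NaN.

```python
import torch


def normalize_int(y_int, std_int, eps=0.0):
    # Hypothetical, simplified stand-in for the scaling step in IntLayerNorm.
    # With std_int == 0 and eps == 0 the division yields inf.
    factor = torch.floor(2**31 / (std_int + eps))
    # 0 * inf evaluates to NaN under IEEE 754 arithmetic.
    return torch.floor(y_int * factor / 2)


# A constant input row: the centered values and the standard deviation are all 0.
y_int = torch.tensor([0.0, 0.0])
std_int = torch.tensor([0.0])

unsafe = normalize_int(y_int, std_int)              # NaN in every position
with_eps = normalize_int(y_int, std_int, eps=1e-5)  # finite, all zeros
cleaned = torch.nan_to_num(unsafe)                  # NaN replaced by 0.0
```

Both remedies give zeros for this input. The epsilon avoids creating the NaN in the first place, while torch.nan_to_num replaces it after the fact (and by default also replaces infinities with the largest finite values of the dtype).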
Fixes #31.