BERT
loss.backward() error
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [128, 4, 2304]], which is output 0 of AddBackward0, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
Does anyone know how to solve this problem? Thanks a lot if you can help!
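For reference, this class of error can be reproduced with a small sketch like the one below. It is not the BERT code from this post: the function name `backward_after_update`, the tensor shape, and the operations are assumptions chosen only to illustrate the mechanism. A tensor produced by an addition is saved for the backward pass and then modified in place before `loss.backward()` runs, so autograd sees a version mismatch.

```python
import torch


def backward_after_update(in_place: bool) -> str:
    """Return "ok" if backward succeeds, else the RuntimeError message.

    Hypothetical helper for illustration only.
    """
    x = torch.randn(4, 3, requires_grad=True)
    h = x + 1.0            # h is "output 0 of AddBackward0"
    loss = (h * h).sum()   # the multiplication saves h for its backward pass

    if in_place:
        h.add_(1.0)        # in-place update: bumps h's version counter
    else:
        h = h + 1.0        # out-of-place: new tensor, the saved h is untouched

    try:
        loss.backward()
    except RuntimeError as err:
        return str(err)
    return "ok"


if __name__ == "__main__":
    print(backward_after_update(in_place=True))   # reports the in-place error
    print(backward_after_update(in_place=False))  # backward succeeds
```

If the real model follows the same pattern, the usual candidates are operators such as `+=`, methods ending in an underscore (for example `add_`), or modules constructed with `inplace=True`, applied to a tensor that an earlier operation still needs. Replacing the flagged update with its out-of-place form is one thing to try.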