BERT-pytorch Why not use torch.no_grad when evaluating test data?

Open • EvanZ opened this issue 3 years ago • 1 comment

The way the trainer is set up, the iteration used for train and test is the same, except that backpropagation only runs during the train step. One other difference I typically see between test and train is that the test batch is wrapped in with torch.no_grad() so that, for example, dropout is not applied. Is there any reason this isn't used here?

Jul 03 '21 18:07 EvanZ

I think it should use torch.no_grad(). Otherwise the autograd graph is kept for every test batch, and evaluation can run out of GPU memory.
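For reference, here is a minimal sketch of what an evaluation step could look like. It is not the repository's trainer code: the small model, the batch, and the evaluate helper are hypothetical stand-ins. One point worth separating out is that torch.no_grad() only turns off gradient tracking (which is what saves memory), while dropout is turned off by model.eval(), so evaluation usually wants both.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model; in practice this would be the BERT model.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.Dropout(p=0.5),
    nn.Linear(16, 2),
)
batch = torch.randn(4, 8)  # hypothetical test batch


def evaluate(model: nn.Module, batch: torch.Tensor) -> torch.Tensor:
    """Run one forward pass in evaluation mode without tracking gradients."""
    was_training = model.training
    model.eval()               # disables dropout; batch norm uses running stats
    with torch.no_grad():      # no autograd graph is built, which saves memory
        output = model(batch)
    model.train(was_training)  # restore whatever mode the model was in before
    return output


output = evaluate(model, batch)
print(output.requires_grad)  # False: no graph is attached to the output
```

In a shared train/test loop like the one described above, the same effect could probably be had by wrapping only the test branch in torch.no_grad() and calling model.eval() before it.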

Aug 03 '22 11:08 Guo-Stone
