fairseq gradient overflow detected, ignoring gradient, setting loss scale to: X

Open shuvohishab opened this issue 2 years ago • 0 comments

❓ Questions and Help

Before asking:

search the issues.
search the docs.

What is your question?

I have a more theoretical question. When I train with more data, I get a notice that the loss scale has been set to a new value (e.g. `NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 4.0`). Does this affect model performance and, if so, to what extent?
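For context, this notice is printed during mixed-precision (fp16) training with dynamic loss scaling: when a gradient overflows, that update is skipped and the loss scale is lowered. Below is a minimal sketch of the general technique, not fairseq's actual implementation. The class `ToyLossScaler`, its parameter names, and its default values are hypothetical and chosen only for illustration.

```python
import math


class ToyLossScaler:
    """Simplified dynamic loss scaler (illustrative only; not fairseq's code)."""

    def __init__(self, init_scale=128.0, scale_factor=2.0,
                 scale_window=4, min_scale=1e-4):
        self.scale = init_scale            # current multiplier applied to the loss
        self.scale_factor = scale_factor   # factor used to shrink or grow the scale
        self.scale_window = scale_window   # clean steps required before growing
        self.min_scale = min_scale         # lower bound for the scale
        self._clean_steps = 0              # steps since the last overflow

    def step(self, grads):
        """Return True if the update should be applied, False if it is skipped.

        `grads` stands in for the gradients of the scaled loss.
        """
        overflow = any(math.isinf(g) or math.isnan(g) for g in grads)
        if overflow:
            # Drop this update and shrink the scale so later gradients fit in fp16.
            self.scale = max(self.scale / self.scale_factor, self.min_scale)
            self._clean_steps = 0
            return False
        # No overflow: count the step and grow the scale after a full clean window.
        self._clean_steps += 1
        if self._clean_steps % self.scale_window == 0:
            self.scale *= self.scale_factor
        return True


scaler = ToyLossScaler(init_scale=8.0, scale_window=2)
applied = scaler.step([0.1, float("inf")])  # overflow: update skipped, scale shrinks
print(applied, scaler.scale)
```

In this scheme each notice corresponds to one dropped update, and the scale recovers after a run of overflow-free steps. A scale that keeps shrinking toward its minimum would suggest that overflows are happening on most steps rather than occasionally.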

Code

I'm fine-tuning my wav2vec2 model by following the README.md.

What have you tried?

What's your environment?

fairseq Version (e.g., 1.0 or main): 0.12.2
PyTorch Version (e.g., 1.0): 1.12.1
OS (e.g., Linux): Linux
How you installed fairseq (pip, source): pip install --editable ./
Build command you used (if compiling from source):
Python version: 3.8
CUDA/cuDNN version: 11.3
GPU models and configuration: 1xTesla T4
Any other relevant information:

Oct 04 '22 06:10 shuvohishab
