fairseq
gradient overflow detected, ignoring gradient, setting loss scale to: X
❓ Questions and Help
Before asking:
- search the issues.
- search the docs.
What is your question?
I have a more theoretical question. When I train with more data, I get a notice that a gradient overflow was detected and the loss scale is being lowered (e.g.: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 4.0).
My question is whether this affects model performance and, if so, to what extent.
Code
I'm fine-tuning my wav2vec2 model following README.md.
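For context, the message appears to come from dynamic loss scaling in mixed-precision (fp16) training. Below is a minimal sketch of the general idea as I understand it, not fairseq's actual implementation: the class name ToyLossScaler, its parameters, and the example gradient norms are all made up for illustration.

```python
import math


class ToyLossScaler:
    """Hypothetical, simplified dynamic loss scaler (NOT fairseq's code)."""

    def __init__(self, init_scale=128.0, scale_factor=2.0,
                 scale_window=4, min_scale=1e-4):
        self.scale = init_scale            # current loss scale
        self.scale_factor = scale_factor   # shrink/grow multiplier
        self.scale_window = scale_window   # clean steps needed before growing
        self.min_scale = min_scale         # lower bound on the scale
        self._good_steps = 0               # clean steps since last overflow

    def step(self, grad_norm):
        """Return True if the update should be applied, False if it is skipped."""
        if math.isinf(grad_norm) or math.isnan(grad_norm):
            # Overflow: ignore this batch's gradient and lower the scale.
            self.scale = max(self.scale / self.scale_factor, self.min_scale)
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps % self.scale_window == 0:
            # Enough clean steps in a row: try a larger scale again.
            self.scale *= self.scale_factor
        return True


# Made-up gradient norms; inf/nan stand in for fp16 overflows.
scaler = ToyLossScaler(init_scale=8.0, scale_window=2)
norms = [float("inf"), 1.0, 2.0, float("nan"), 0.5]
applied = [scaler.step(n) for n in norms]
print(applied, scaler.scale)
```

In this sketch, an overflow only skips that one batch's update and halves the scale; the scale grows back after enough clean steps.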
What have you tried?
What's your environment?
- fairseq Version (e.g., 1.0 or main): 0.12.2
- PyTorch Version (e.g., 1.0): 1.12.1
- OS (e.g., Linux): Linux
- How you installed fairseq (pip, source): pip install --editable ./
- Build command you used (if compiling from source):
- Python version: 3.8
- CUDA/cuDNN version: 11.3
- GPU models and configuration: 1xTesla T4
- Any other relevant information: