fairseq
Serious bug in the speech module
🐛 Bug
To Reproduce
I followed the simul_mustc documentation step by step, but the ASR model's training accuracy stayed low and the model kept outputting the same sentence. I am using convtransformer. Using a debugger, I traced the problem to the conv layer: x becomes an all-zero matrix, and the conv gradient also becomes 0. Interestingly, the gradient does not vanish gradually; it disappears suddenly in the last batch of epoch 1. To make matters worse, this happens no matter which model or dataset I switch to, but I did not see this bug when I tried MT tasks. So I think there is something wrong with the speech module.
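One way to pinpoint where activations and gradients collapse is to attach hooks to the conv layers. The following is only a minimal sketch under assumptions: it uses plain PyTorch, the helper name `attach_zero_checks` and the toy `nn.Sequential` model are hypothetical and are not part of fairseq, and the first conv layer is zeroed by hand to imitate the symptom described above.

```python
import torch
import torch.nn as nn


def attach_zero_checks(model):
    """Hypothetical helper: record conv layers whose output or weight
    gradient is entirely zero."""
    report = {"zero_output": [], "zero_grad": []}
    handles = []

    for name, module in model.named_modules():
        if not isinstance(module, (nn.Conv1d, nn.Conv2d)):
            continue

        # Bind the layer name as a default argument so each hook keeps its own.
        def fwd_hook(mod, inputs, output, name=name):
            if output.numel() > 0 and torch.count_nonzero(output) == 0:
                report["zero_output"].append(name)

        handles.append(module.register_forward_hook(fwd_hook))

    def check_grads():
        # Call after backward(): flag conv weights whose gradient is all zero.
        for name, module in model.named_modules():
            if isinstance(module, (nn.Conv1d, nn.Conv2d)):
                grad = module.weight.grad
                if grad is not None and torch.count_nonzero(grad) == 0:
                    report["zero_grad"].append(name)

    return report, check_grads, handles


# Toy stand-in for a conv front end (not the fairseq convtransformer).
torch.manual_seed(0)
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(4, 4, kernel_size=3, padding=1),
)

# Imitate the failure: the first conv layer emits only zeros.
with torch.no_grad():
    model[0].weight.zero_()
    model[0].bias.zero_()

report, check_grads, handles = attach_zero_checks(model)

x = torch.randn(2, 1, 8, 8)
model(x).sum().backward()
check_grads()

for handle in handles:
    handle.remove()

print(report)
```

In a real run the same hooks could be attached to the model before training and the report inspected after each update, to see the exact step at which a layer starts producing zeros.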
Code sample
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python /mnt/fairseq-0.12.2-release/fairseq_cli/train.py \
  /mnt/datasets/origin/en-de \
  --config-yaml config_asr.yaml \
  --train-subset train_asr --valid-subset dev_asr \
  --save-dir /mnt/duyangfan/save \
  --max-tokens 9000 --max-update 100000 \
  --task speech_to_text \
  --criterion label_smoothed_cross_entropy --report-accuracy \
  --arch convtransformer_espnet \
  --optimizer adam --lr 0.0005 --lr-scheduler inverse_sqrt --warmup-updates 10000 \
  --clip-norm 10.0 --seed 1 \
  --keep-last-epochs 5 --max-epoch 100 \
  --ddp-backend=legacy_ddp
Expected behavior
Environment
- fairseq Version: 0.12.2
- PyTorch Version: 2.01
- OS: Linux
- How you installed fairseq: source
- Build command you used: make
- Python version: 3.10
- CUDA/cuDNN version: 12.3