Cannot reproduce the ASR results on the CoVoST 2 dataset (Indonesian)
❓ Questions and Help
What is your question?
Hello! I have been trying to train a fairseq ASR model for Indonesian, without success so far. Training consistently yields a poor WER of more than 100, whereas the expected WER is around 20-30. The inference output is also perplexing: certain terms or phrases are repeated over and over, even though they do not remotely resemble the original text. I would really appreciate any guidance on this. Thank you!
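To make the ">100" figure concrete: WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words, so a hypothesis that is much longer than the reference, for example because of repeated words, can push it past 100. Below is a minimal sketch of that calculation. It is not the scorer that `--scoring wer` uses, and the Indonesian sentences are made-up examples rather than real model output.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using a plain Levenshtein distance over words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: 4 reference words, 8 repeated wrong words.
# That is 4 substitutions + 4 insertions = 8 errors over 4 reference words.
print(word_error_rate("saya pergi ke pasar", "dia dia dia dia dia dia dia dia"))  # 200.0
```

A WER above 100 therefore suggests that the hypotheses are longer than the references, which would be consistent with the repetition seen in the output.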
Code
fairseq-train data/indo/cv-corpus-9.0-2022-04-27/id ^
  --config-yaml config_asr_id.yaml --train-subset train_asr_id --valid-subset dev_asr_id ^
  --save-dir data/indo/asr --num-workers 4 --max-tokens 50000 --max-update 60000 ^
  --task speech_to_text --criterion label_smoothed_cross_entropy --label-smoothing 0.1 ^
  --report-accuracy --arch s2t_transformer_s --dropout 0.15 --optimizer adam --lr 2e-3 ^
  --lr-scheduler inverse_sqrt --warmup-updates 10000 --clip-norm 10.0 --seed 1 --update-freq 8
fairseq-generate data/indo/cv-corpus-9.0-2022-04-27/id ^
  --config-yaml config_asr_id.yaml --gen-subset test_asr_id --task speech_to_text ^
  --path data/indo/asr/avg_last_10_checkpoint.pt --max-tokens 50000 --beam 5 ^
  --scoring wer --wer-tokenizer 13a --wer-lowercase --wer-remove-punct ^
  --results-path data/indo/cv-corpus-9.0-2022-04-27/id/results
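One way to quantify the repetition in the generated output is to scan the hypotheses for repeated n-grams. The sketch below assumes the results file contains tab-separated lines of the form `D-<id>`, score, detokenized hypothesis, which is my understanding of the `fairseq-generate` output format; the helper names and the 0.3 threshold are hypothetical choices, not part of fairseq.

```python
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 2) -> float:
    """Fraction of n-grams in `text` that repeat an n-gram seen earlier."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)


def flag_repetitive_hypotheses(lines, threshold: float = 0.3, n: int = 2):
    """Return (sample_id, ratio, hypothesis) for hypotheses with heavy repetition.

    Assumes each hypothesis line looks like "D-<id>\t<score>\t<text>";
    all other lines are ignored.
    """
    flagged = []
    for line in lines:
        if not line.startswith("D-"):
            continue
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue
        sample_id, hypothesis = parts[0][2:], parts[2]
        ratio = repeated_ngram_ratio(hypothesis, n)
        if ratio >= threshold:
            flagged.append((sample_id, ratio, hypothesis))
    return flagged


# Made-up lines in the assumed format, for illustration only.
example_lines = [
    "T-0\tsaya pergi ke pasar",
    "D-0\t-1.2\tdia dia dia dia dia",
    "D-1\t-0.5\tsaya pergi ke pasar",
]
print(flag_repetitive_hypotheses(example_lines))
```

In practice the lines would be read from the file written under `--results-path`; the exact file name is not shown here because it depends on the run.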
What have you tried?
I have tried tuning the dropout values and learning rates, but the improvement in terms of WER has been marginal.
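For context on the learning-rate tuning, here is a minimal sketch of what the `inverse_sqrt` schedule does with `--lr 2e-3 --warmup-updates 10000`: a linear warmup to the peak value, followed by decay proportional to the inverse square root of the update number. It assumes the warmup starts from 0, which I understand to be the default when `--warmup-updates` is set; the function itself is illustrative and not fairseq code.

```python
import math


def inverse_sqrt_lr(step: int,
                    peak_lr: float = 2e-3,
                    warmup_updates: int = 10000,
                    warmup_init_lr: float = 0.0) -> float:
    """Learning rate at a given update under linear warmup + inverse-sqrt decay."""
    if step < warmup_updates:
        # Linear ramp from warmup_init_lr (assumed 0) up to peak_lr.
        return warmup_init_lr + (peak_lr - warmup_init_lr) * step / warmup_updates
    # After warmup: peak_lr * sqrt(warmup_updates / step).
    return peak_lr * math.sqrt(warmup_updates / step)


for step in (1000, 5000, 10000, 40000, 60000):
    print(step, f"{inverse_sqrt_lr(step):.2e}")
```

With these settings the rate peaks at 2e-3 at update 10000 and has halved to 1e-3 by update 40000, so changing `--lr` alone rescales the whole curve without changing its shape.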
What's your environment?
- fairseq Version: 1.0
- PyTorch Version (e.g., 1.0):
- OS: Windows
- How you installed fairseq (pip, source): git pull fairseq repository
- Build command you used (if compiling from source): pip
- Python version: 3.8.13
- CUDA/cuDNN version:
- GPU models and configuration: Nvidia GeForce RTX 3080
- Any other relevant information: