fairseq
How to reproduce the WER score of Wav2vec-U
❓ Questions and Help
What is your question?
I have run the wav2vec-U pipeline and trained a GAN on LibriSpeech 960h. With Viterbi decoding I get a PER of about 20%, but with the Kaldi decoder I get a WER of about 40%. Will you release the inference script with an LM so that we can reproduce the WER reported in the paper?
Code
The arguments I used for decoding are as follows:

```yaml
w2l_decoder: KALDI
post_process: silence
blank_weight: 0
sil_is_blank: true
blank_mode: add
unsupervised_tuning: 0
targets: wrd
lexicon: /path/to/text/lexicon_filtered.lst
kaldi_decoder_config:
  acoustic_scale: 5
  hlg_graph_path: /path/to/text/fst/phn_to_words_sil/HLG.phn.kenlm.wrd.o40003.fst
  output_dict: /path/to/text/fst/phn_to_words_sil/kaldi_dict.kenlm.wrd.o40003.txt
```
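Since the gap between the Viterbi PER and the Kaldi WER is the core of the question, it can help to sanity-check the scoring itself. Below is a minimal, self-contained sketch of how both metrics are computed with the same edit-distance routine; the sample transcripts are made up for illustration and are not from LibriSpeech.

```python
# Sketch: PER and WER are the same edit-distance metric applied to
# different token sequences (phones vs. words), which is one reason
# the two numbers can diverge. Pure stdlib, no fairseq dependencies.

def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(
                dp[j] + 1,        # deletion
                dp[j - 1] + 1,    # insertion
                prev + (r != h),  # substitution (or match)
            )
    return dp[-1]

def error_rate(refs, hyps):
    """Corpus-level token error rate: total edits / total reference tokens."""
    errs = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errs / total

# Word level (WER): one wrong word out of four reference words -> 25%.
refs_wrd = [["the", "cat", "sat", "down"]]
hyps_wrd = [["the", "cat", "sat", "town"]]

# Phone level (PER) for the same utterance: only one phone differs,
# so the rate is much lower (1 edit over 11 reference phones).
refs_phn = [["DH", "AH", "K", "AE", "T", "S", "AE", "T", "D", "AW", "N"]]
hyps_phn = [["DH", "AH", "K", "AE", "T", "S", "AE", "T", "T", "AW", "N"]]

print(round(error_rate(refs_wrd, hyps_wrd), 3))  # → 0.25
print(round(error_rate(refs_phn, hyps_phn), 3))  # → 0.091
```

This is only a scoring sketch, not fairseq's evaluation code; fairseq applies its own post-processing (e.g. the `post_process: silence` option above) before scoring, so raw hypotheses should be normalized the same way before comparing numbers.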
What have you tried?
What's your environment?
- fairseq Version (e.g., 1.0 or main):
- PyTorch Version (e.g., 1.0)
- OS (e.g., Linux):
- How you installed fairseq (pip, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information:
My UER has stayed above 90% and won't come down.