How to reproduce the WER score of Wav2vec-U

Open cywang97 opened this issue 3 years ago • 1 comments

❓ Questions and Help

What is your question?

I ran the wav2vec-U pipeline and trained a GAN on LibriSpeech 960h. With Viterbi decoding I get a PER of about 20%, but with the Kaldi decoder I get a WER of 40%. Will you release the inference script with an LM so that we can reproduce the WER reported in the paper?
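For anyone comparing numbers: PER and WER are the same edit-distance metric computed over different token units (phones vs. words), so the two figures are not directly comparable. Below is a minimal sketch of that computation, assuming whitespace-separated tokens in both references and hypotheses. The function names are my own for illustration and this is not fairseq's scoring code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps):
    """Corpus-level error rate in percent: total edits / total reference tokens."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    total = sum(len(r.split()) for r in refs)
    return 100.0 * errors / total


# Word-level (WER) usage; pass phone sequences instead to get PER.
wer = error_rate(["the cat sat down"], ["the bat sat"])
```

Feeding word transcripts gives WER and feeding phone transcripts gives PER. A single wrong phone can turn a whole word into an error, so a WER well above the PER is not surprising on its own.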

Code

The arguments I used for decoding are as follows:

    w2l_decoder: KALDI
    post_process: silence
    blank_weight: 0
    sil_is_blank: true
    blank_mode: add
    unsupervised_tuning: 0
    targets: wrd
    lexicon: /path/to/text/lexicon_filtered.lst
    kaldi_decoder_config:
      acoustic_scale: 5
      hlg_graph_path: /path/to/text/fst/phn_to_words_sil/HLG.phn.kenlm.wrd.o40003.fst
      output_dict: /path/to/text/fst/phn_to_words_sil/kaldi_dict.kenlm.wrd.o40003.txt

What have you tried?

What's your environment?

  • fairseq Version (e.g., 1.0 or main):
  • PyTorch Version (e.g., 1.0)
  • OS (e.g., Linux):
  • How you installed fairseq (pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • Any other relevant information:

cywang97, Apr 28 '22 12:04

My UER has stayed in the nineties and won't come down.

XR1988, Dec 09 '24 07:12