Binbin Zhang
A mismatch between training and inference might be the problem. How is the rescoring accuracy?
@Slyne, please review.
@yuekaizhang, is it possible?
The binding is CPU-only and too heavy for this task. I prefer to use torchaudio's built-in decoder.
What is your PyTorch/torchaudio version?
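One quick way to report the versions is sketched below. It assumes the packages are installed under the distribution names `torch` and `torchaudio`; `installed_versions` is a hypothetical helper name, not part of either library.

```python
from importlib import metadata


def installed_versions(packages=("torch", "torchaudio")):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Reading the package metadata avoids importing the libraries, so this still works when the import itself is what fails.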
@xingchensong, please follow the issue.
Are you using the right Python virtual environment? Your command line shows that `espnet` is being used.
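A minimal sketch for checking which interpreter and environment are actually active. `describe_environment` is a hypothetical helper name. The `sys.prefix` comparison detects environments created with `venv` or recent `virtualenv`; a conda environment may not be flagged by it, so look at the printed paths as well.

```python
import shutil
import sys


def describe_environment():
    """Collect facts about the interpreter that is running this script."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # First `python` found on PATH, or None if there is none.
        "python_on_path": shutil.which("python"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the printed paths point into another project's environment, activate the intended one and rerun the command.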
It's weird. Please check: 1. words.txt: for decoding with an LM, you should use the words.txt generated by the LM tools. Do not use the words.txt which is used for...
We have only tested it on a 200+ hour dataset. However, I think something must be wrong in your pipeline, since the WER is greater than 100%.
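For context, a WER greater than 100% is arithmetically possible: WER is (substitutions + deletions + insertions) divided by the number of reference words, so a hypothesis with many inserted words can push it past 1.0. The sketch below is a plain word-level edit distance written for illustration, not the scoring tool used in the recipe; `edit_distance` and `wer` are hypothetical names.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a fraction of the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # 1 substitution + 3 insertions against a 2-word reference.
    print(wer("a b", "a x y z w"))
```

Here the two-word reference is scored against a five-word hypothesis, giving 4 errors over 2 reference words, i.e. a WER of 200%. Output that is much longer than the reference, or scored against the wrong token units, typically produces numbers like this.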
Is there any solution to fix it?