Pytorch-Speech-Recognition
Has anyone gotten good performance?
Hi. I ran this project for Korean speech recognition, but the loss is not decreasing and I don't get good predictions. I've already used a preprocessing method that works well with DeepSpeech and LAS.
It looks similar to the DeepSpeech architecture, but it isn't: in the paper they use an HMM pre-built with Kaldi processing, and LF-MMI instead of CTC.
https://www.danielpovey.com/files/2020_interspeech_multistream.pdf
The paper above, which this project references, uses a single stream before moving to multi-stream. Does anyone know what the problem is, or has anyone gotten good performance with this project?
I tried with the LibriSpeech train-clean-100 dataset, but the WER didn't improve at all. My WER was around 0.95.
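For reference, a WER around 0.95 means almost every word is wrong, i.e. the model has not learned anything useful yet. WER is just word-level edit distance (substitutions + insertions + deletions) divided by the reference length; a minimal sketch (the function name is mine, not from this project):

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = ref.split(), hyp.split()
    # dp[i][j] = edits needed to turn r[:i] into h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i  # delete all remaining reference words
    for j in range(len(h) + 1):
        dp[0][j] = j  # insert all hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1  # substitution cost
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # match / substitution
    return dp[len(r)][len(h)] / len(r)

print(wer("the cat sat on the mat", "the cat on mat"))  # 2 deletions / 6 words
```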
I changed two things:
- changed the strides from strides=[5,2,1] to strides=[2,2,1] to avoid an assertion error.
- changed the sample rate from 48000 to 16000.
I don't know what else I need to change...