How to tune Kaldi GStreamer to use the NNET3 decoder

Open purijs opened this issue 4 years ago • 3 comments

The different GStreamer versions available only use the nnet2 decoder. Is there a way I can use the nnet3 decoder with an nnet3 model? I know about the nnet3 mode, but it's not accurate.

purijs avatar Apr 29 '20 06:04 purijs

Our nnet3-based decoder is based on the nnet3 decoder in Kaldi. I have seen claims that our decoder is not as accurate as Kaldi's, so I believe there could be a bug somewhere in it. If you want to help, you should either try to find the bug yourself or prepare a small test set (acoustic model + graph + an utterance) on which the results from Kaldi and from our decoder consistently differ.
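To narrow such a test set down, it may help to compare the two decoders' outputs utterance by utterance. The sketch below is only an illustration: it assumes each decoder's hypotheses have already been written out as lines of the form `<utt_id> <word> <word> ...`, and the function names (`parse_transcripts`, `word_edit_distance`, `diff_decoders`) are hypothetical helpers, not part of Kaldi or kaldi-gstreamer-server.

```python
from typing import Dict, List, Tuple


def parse_transcripts(lines: List[str]) -> Dict[str, List[str]]:
    """Parse lines of the form '<utt_id> <word> ...' into {utt_id: words}."""
    result: Dict[str, List[str]] = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        result[parts[0]] = parts[1:]
    return result


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def diff_decoders(kaldi_lines: List[str],
                  server_lines: List[str]) -> List[Tuple[str, int]]:
    """Return (utt_id, distance) for utterances where the two outputs differ.

    Only utterances present in both inputs are compared.
    """
    kaldi = parse_transcripts(kaldi_lines)
    server = parse_transcripts(server_lines)
    diffs = []
    for utt in sorted(kaldi.keys() & server.keys()):
        distance = word_edit_distance(kaldi[utt], server[utt])
        if distance > 0:
            diffs.append((utt, distance))
    return diffs
```

Any utterance returned by `diff_decoders` is a candidate for the small reproducible test case, provided the same acoustic model, graph, and decoding settings were used for both runs.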

alumae avatar Apr 29 '20 06:04 alumae

Are you talking about using "nnet3 mode"? Unfortunately, I won't be able to share the data/transcripts, but I will try to work on some open-source audio.

Just wanted to confirm whether "nnet3 mode" is the only option for using the nnet3 decoder, since nnet2 is also set to True.

purijs avatar Apr 30 '20 03:04 purijs

Yes, "nnet3 mode" is the only way to use nnet3 models. This is for historical reasons. Originally kaldi-gstreamer-server supported only GMM models (because DNN models were not in use at the time; it was long ago). Then I implemented the nnet2 GStreamer plugin. Then came nnet3 in Kaldi. Instead of writing an nnet3 GStreamer plugin from scratch, we implemented nnet3 support in the nnet2 plugin. It does pretty much what Kaldi's nnet3 decoder does, AFAIK, but apparently there might be a small bug somewhere.

alumae avatar May 01 '20 15:05 alumae