vosk-api
Support for decoding of TDNN-LSTM nnet3?
By default, vosk-api supports decoding of TDNN nnet3 models.
In my experiments, TDNN-LSTM is clearly better than TDNN, with a 3% gain over TDNN.
The TDNN-LSTM needs the parameter as follows:
extra_left_context=0 extra_right_context=0 extra_left_context_initial=-1 extra_right_context_final=-1
but I cannot find these parameters in the source code.
I believe you can set these parameters in model/conf/model.conf. Please try.
@nshmyrev I tried it and got this error:
Command line was:
ERROR (VoskAPI:ReadConfigFile():parse-options.cc:493) Invalid option --extra-left-context 50 in config file model_tdnn_lstm//conf/model.conf
terminate called after throwing an instance of 'kaldi::KaldiFatalError'
what(): kaldi::KaldiFatalError
Aborted (core dumped)
--extra-left-context 50
Looks like you forgot =, it should be --extra-left-context=50
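As a rough illustration of the `--name=value` form mentioned above, here is a minimal Python sketch that scans config text for lines not written that way. The function name `find_malformed_options` and the regular expression are hypothetical and not part of vosk-api or Kaldi. The check is deliberately stricter than the real parser, which may accept other forms. It only looks at syntax, so it cannot tell whether an option is one the recognizer actually registers.

```python
import re

# Assumption: one "--name=value" option per line; this pattern is illustrative,
# not the rule used by the real config parser.
OPTION_RE = re.compile(r"^--[A-Za-z0-9_.\-]+=\S.*$")

def find_malformed_options(conf_text):
    """Return (line_number, line) pairs that are not of the form --name=value."""
    bad = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue  # skip blank and comment-only lines
        if not OPTION_RE.match(line):
            bad.append((number, line))
    return bad

sample = "--beam=12.0\n--extra-left-context 50\n"
print(find_malformed_options(sample))
```

A line such as `--extra-left-context 50` is reported because the name is followed by a space instead of `=`.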
I corrected it but still get the same error. Maybe this network is not supported:
Command line was:
ERROR (VoskAPI:ReadConfigFile():parse-options.cc:493) Invalid option --extra-left-context=50 in config file model_tdnn_lstm//conf/model.conf
terminate called after throwing an instance of 'kaldi::KaldiFatalError'
what(): kaldi::KaldiFatalError
Aborted (core dumped)
@nshmyrev How should I modify the source code? Only "--extra-left-context-initial" is supported.
It seems that online LSTM decoding has not been implemented: https://github.com/kaldi-asr/kaldi/issues/1091
Hi @nshmyrev, I have trained my own model, and its performance is great in terms of both accuracy and speed. I found that increasing the beam gives better accuracy, but changing the lattice beam in model.conf had no effect.
--min-active=200
--max-active=3000
--beam=12.0
--lattice-beam=5.0
--acoustic-scale=1.0
--frame-subsampling-factor=3
--endpoint.silence-phones=1:2:3:4:5:6:7:8:9:10
--endpoint.rule2.min-trailing-silence=0.5
--endpoint.rule3.min-trailing-silence=0.75
--endpoint.rule4.min-trailing-silence=1.0
This is my model.conf file. Which values for beam, lattice_beam, and the other options do you think would give better accuracy without compromising performance?
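One way to explore this question empirically is to generate several variants of the config and measure each one on your own test set. The Python sketch below only produces the config text. The helper `set_option` is hypothetical, and the beam and lattice-beam values are arbitrary placeholders, not recommendations. Writing each variant to model.conf and measuring accuracy and speed is left out.

```python
def set_option(conf_text, name, value):
    """Return conf_text with --name set to value, appending the option if absent."""
    prefix = f"--{name}="
    lines = conf_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.strip().startswith(prefix):
            lines[i] = f"{prefix}{value}"
            replaced = True
    if not replaced:
        lines.append(f"{prefix}{value}")
    return "\n".join(lines) + "\n"

# Shortened base config using options from the model.conf shown above.
base = "--max-active=3000\n--beam=12.0\n--lattice-beam=5.0\n"

# Build one config per (beam, lattice-beam) pair; the values are illustrative only.
variants = {}
for beam in (10.0, 13.0):
    for lattice_beam in (4.0, 6.0):
        conf = set_option(base, "beam", beam)
        conf = set_option(conf, "lattice-beam", lattice_beam)
        variants[(beam, lattice_beam)] = conf

for key, conf in variants.items():
    print(key)
    print(conf)
```

Each entry in `variants` can then be saved as a separate model.conf and evaluated on the same audio, so that accuracy and decoding time are compared under identical conditions.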
You should open a new issue; yours does not seem to be related to mine.