vosk-api Realtime STT with large model

Open Camel-RD opened this issue 2 years ago • 9 comments

I wanted to test realtime speech recognition with the large model (vosk-model-en-us-0.22) for extra accuracy. When I use the result from a call to PartialResult after AcceptWaveform, the delay is mostly acceptable, but a call to Result or FinalResult can take quite a long time. I understand that the large model is perhaps not designed for realtime processing, but the result from PartialResult looks usable. Does a call to Result or FinalResult provide any improvement over PartialResult? Or would it be possible to use only the partial result and skip that final processing?

May 03 '22 16:05 Camel-RD

Using the final result is recommended. The final result should be reasonably fast. What hardware is it slow on?

May 03 '22 17:05 nshmyrev

I'm testing on a Ryzen 5 3600, and calls to Result can take more than 4 seconds, which happens quite often. That is for the large model (vosk-model-en-us-0.22); for the smaller models (vosk-model-en-us-0.22-lgraph and vosk-model-small-en-us-0.15) the delay is small.

May 03 '22 21:05 Camel-RD
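
To put numbers on the delays described above, each recognizer call can be timed separately. The following is a minimal sketch, not code from this thread: the helper names `timed`, `profile_stream` and `profile_wav`, and the chunk size of 4000 frames, are assumptions made for illustration. It only relies on the `AcceptWaveform`, `PartialResult`, `Result` and `FinalResult` methods of vosk's `KaldiRecognizer`.

```python
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - start


def profile_stream(rec, chunks):
    """Feed audio chunks to a recognizer and record how long each call takes.

    `rec` is any object exposing AcceptWaveform, PartialResult, Result and
    FinalResult, as vosk's KaldiRecognizer does.
    """
    timings = {"AcceptWaveform": [], "PartialResult": [],
               "Result": [], "FinalResult": []}
    for chunk in chunks:
        done, dt = timed(rec.AcceptWaveform, chunk)
        timings["AcceptWaveform"].append(dt)
        if done:
            # End of utterance detected: this is the call reported as slow.
            _, dt = timed(rec.Result)
            timings["Result"].append(dt)
        else:
            _, dt = timed(rec.PartialResult)
            timings["PartialResult"].append(dt)
    _, dt = timed(rec.FinalResult)
    timings["FinalResult"].append(dt)
    return timings


def profile_wav(model_dir, wav_path, chunk_frames=4000):
    """Hypothetical driver: profile a mono 16-bit PCM WAV file with vosk."""
    import wave
    from vosk import Model, KaldiRecognizer  # requires `pip install vosk`

    wf = wave.open(wav_path, "rb")
    rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
    # readframes returns b"" at end of file, which stops the iterator.
    chunks = iter(lambda: wf.readframes(chunk_frames), b"")
    return profile_stream(rec, chunks)
```

Comparing `max(timings["Result"])` against `max(timings["PartialResult"])` for the same file, with and without the change under test, shows whether the slow step is the final-result call.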

How much memory do you have? Try removing the rnnlm folder from the model; it should respond faster.

May 03 '22 21:05 nshmyrev

Memory is 16 GB. I removed the rnnlm folder and that delay is gone. I also no longer have to wait about 30 seconds for the model to load.

May 03 '22 22:05 Camel-RD
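
For anyone who wants to try the same thing without modifying the downloaded model, one option is to work on a copy that leaves the rnnlm folder out. This is a sketch using only the Python standard library. The function name `copy_model_without_rnnlm` and the destination path in the usage note are assumptions, not part of vosk.

```python
import shutil
from pathlib import Path


def copy_model_without_rnnlm(src, dst):
    """Copy a model directory to dst, skipping anything named 'rnnlm'.

    The original directory is left untouched. Note that ignore_patterns
    matches the name at every level of the tree, not only the top level.
    dst must not already exist.
    """
    src, dst = Path(src), Path(dst)
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns("rnnlm"))
    return dst
```

Usage would look like `copy_model_without_rnnlm("vosk-model-en-us-0.22", "vosk-model-en-us-0.22-no-rnnlm")`, after which the copy is the directory passed to the recognizer.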

OK, probably half of that is taken up by other tasks and the remainder is not enough.

May 03 '22 22:05 nshmyrev

I don't think memory is the issue; usage is around 50%.

May 03 '22 22:05 Camel-RD

OK, does it work without RNNLM? Also, can you please update to 0.3.38? It had some performance fixes.

May 24 '22 12:05 nshmyrev

It does work without RNNLM. I tested with update 0.3.38, and it's still the same as before: a call to Result() can take about 5 seconds (with RNNLM).

May 24 '22 15:05 Camel-RD

OK, thank you for testing. Let's assume RNNLM is too slow for your hardware.

May 24 '22 15:05 nshmyrev
