vosk-api

Realtime STT with large model

Open Camel-RD opened this issue 2 years ago • 9 comments

I wanted to test real-time speech recognition with the large model (vosk-model-en-us-0.22) for extra accuracy. When I use the result from PartialResult after AcceptWaveform, the delay is mostly acceptable, but a call to Result or FinalResult can take quite a long time. I understand that the large model is perhaps not designed for real-time processing, but the output of PartialResult looks usable. Does calling Result or FinalResult improve on PartialResult? Or would it be possible to use only the partial result and skip the final processing?

Camel-RD, May 03 '22 16:05
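For readers unfamiliar with the calls named in the question, here is a minimal sketch of a streaming loop in Python. It assumes the `vosk` Python package, whose `KaldiRecognizer` exposes `AcceptWaveform`, `PartialResult`, `Result`, and `FinalResult`. The helper names `transcribe_stream` and `run_on_wav`, the chunk size of 4000 frames, and the file paths are hypothetical choices for illustration, not something taken from the thread.

```python
import json


def transcribe_stream(rec, chunks):
    """Feed audio chunks to a recognizer and collect partial and final texts.

    `rec` is any object with the KaldiRecognizer-style methods
    AcceptWaveform, PartialResult, Result and FinalResult.
    """
    partials, finals = [], []
    for chunk in chunks:
        if rec.AcceptWaveform(chunk):
            # The recognizer reports the end of an utterance:
            # Result() returns the finished text for that utterance.
            finals.append(json.loads(rec.Result()).get("text", ""))
        else:
            # Still inside an utterance: only a partial hypothesis exists.
            partials.append(json.loads(rec.PartialResult()).get("partial", ""))
    # Flush whatever audio is still buffered at the end of the stream.
    finals.append(json.loads(rec.FinalResult()).get("text", ""))
    return partials, finals


def run_on_wav(wav_path, model_path, frames_per_chunk=4000):
    """Hypothetical driver: stream a 16-bit mono PCM WAV file through Vosk."""
    import wave
    from vosk import Model, KaldiRecognizer  # imported lazily; needs `pip install vosk`

    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_path), wf.getframerate())

        def chunks():
            while True:
                data = wf.readframes(frames_per_chunk)
                if not data:
                    return
                yield data

        return transcribe_stream(rec, chunks())
```

A call such as `run_on_wav("test.wav", "vosk-model-en-us-0.22")` would return the list of partial hypotheses and the list of final texts, which makes it easy to compare the two side by side.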

It is recommended to use the final result. The final result should be reasonably fast. What hardware are you running on where it is slow?

nshmyrev, May 03 '22 17:05

I'm testing on a Ryzen 5 3600, and calls to Result can take more than 4 seconds, which happens quite often. That is for the large model (vosk-model-en-us-0.22); for the smaller models (vosk-model-en-us-0.22-lgraph and vosk-model-small-en-us-0.15) the delay is small.

Camel-RD, May 03 '22 21:05

How much memory? Try removing the rnnlm folder from the model; it should respond faster.

nshmyrev, May 03 '22 21:05

Memory is 16 GB. I removed the rnnlm folder and the delay is gone. I also no longer have to wait about 30 seconds for the model to load.

Camel-RD, May 03 '22 22:05
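The thread removes the rnnlm folder outright. A reversible variant is to rename it so it can be restored later; the sketch below does that. It assumes the folder sits directly under the model directory and is named `rnnlm`, as the posts above imply. The function name `disable_rnnlm` and the `.disabled` suffix are hypothetical.

```python
from pathlib import Path


def disable_rnnlm(model_dir, suffix=".disabled"):
    """Rename <model_dir>/rnnlm so the model loads without it.

    Returns the new path, or None if there is no rnnlm folder to rename.
    """
    src = Path(model_dir) / "rnnlm"
    if not src.is_dir():
        return None
    dst = src.with_name(src.name + suffix)
    if dst.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(f"{dst} already exists")
    src.rename(dst)
    return dst
```

For example, `disable_rnnlm("vosk-model-en-us-0.22")` would leave a `rnnlm.disabled` folder in place of `rnnlm`; renaming it back restores the original model layout.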

OK, probably half of that is taken by other tasks and the remainder is not enough.

nshmyrev, May 03 '22 22:05

I don't think memory is the issue; usage is around 50%.

Camel-RD, May 03 '22 22:05

OK, does it work without RNNLM? Also, can you please update to 0.3.38? It has some performance fixes.

nshmyrev, May 24 '22 12:05

It does work without RNNLM. I tested with 0.3.38 and it's still the same as before: a call to Result() can take about 5 seconds (with RNNLM).

Camel-RD, May 24 '22 15:05
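Latency figures like the ones quoted above can be collected with a small timing wrapper, which makes runs with and without the rnnlm folder directly comparable. This is only a sketch using the Python standard library; `timed` and `summarize` are hypothetical helper names, and `rec` in the usage note stands for a Vosk `KaldiRecognizer` created elsewhere.

```python
import statistics
import time


def timed(fn, *args):
    """Call fn(*args) and return (return_value, elapsed_seconds)."""
    start = time.perf_counter()
    value = fn(*args)
    return value, time.perf_counter() - start


def summarize(latencies):
    """Reduce a non-empty list of latencies (seconds) to count, mean and worst case."""
    return {
        "count": len(latencies),
        "mean_s": statistics.fmean(latencies),
        "max_s": max(latencies),
    }
```

In a recognition loop, `text_json, secs = timed(rec.Result)` would record how long each final pass takes; appending `secs` to a list and passing it to `summarize` gives one summary per model configuration.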

OK, thank you for testing. Let's conclude that RNNLM is too slow for your hardware.

nshmyrev, May 24 '22 15:05