Is there a way to use it with a web app instead of native mic

Open abhishek-tg opened this issue 1 year ago • 6 comments

I'm building a web app and would like to adapt this so it can be used from the browser, for example from JS. Is there a sample?

abhishek-tg avatar Dec 14 '23 11:12 abhishek-tg

Currently there is only a Python web client sample.

KoljaB avatar Dec 14 '23 13:12 KoljaB

Just realized that the current client does not record and send chunks.

This would be a useful and much-needed feature. I need to think about how to integrate chunk input into the API, and will then provide a JS client.

KoljaB avatar Dec 14 '23 13:12 KoljaB

Also, it uses a PyAudio input stream, which will have to be changed to a socket queue or something similar.
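To make the "socket queue" idea concrete, here is a minimal sketch of an audio source that is fed from a queue rather than read from a microphone. It is not RealtimeSTT code: the class name `QueueAudioSource` and its `feed`/`read` methods are hypothetical, and the network layer (WebSocket, raw socket, etc.) is assumed to call `feed()` for each chunk it receives.

```python
import queue
import threading


class QueueAudioSource:
    """Hypothetical stand-in for a microphone stream, filled by a network handler."""

    def __init__(self):
        self._chunks = queue.Queue()   # thread-safe hand-off from the socket thread
        self._buffer = bytearray()     # leftover bytes between reads

    def feed(self, data: bytes) -> None:
        """Called by the (assumed) network handler for every received chunk."""
        self._chunks.put(data)

    def read(self, num_bytes: int, timeout: float = 1.0) -> bytes:
        """Block until num_bytes are available, like a blocking stream read.

        Raises queue.Empty if no chunk arrives within `timeout` seconds.
        """
        while len(self._buffer) < num_bytes:
            self._buffer.extend(self._chunks.get(timeout=timeout))
        out = bytes(self._buffer[:num_bytes])
        del self._buffer[:num_bytes]
        return out


# Simulate a client sending four 320-byte chunks from another thread.
source = QueueAudioSource()
sender = threading.Thread(
    target=lambda: [source.feed(b"\x00\x01" * 160) for _ in range(4)]
)
sender.start()
frame = source.read(1024)  # the consumer sees fixed-size frames regardless of chunk size
sender.join()
```

The consumer side keeps the same "read N bytes" shape as a microphone stream, so only the producer changes when audio arrives over the network.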

abhishek-tg avatar Dec 15 '23 09:12 abhishek-tg

Available now. Please check this example with the new version, v0.1.8.

KoljaB avatar Dec 15 '23 12:12 KoljaB

Thanks, I was able to modify it and use it frequently. However, I have a question: when serving multiple users, how should I handle the recorder thread? Would we have multiple threads, or a unique ID to distinguish the transcribed speech between users?

abhishek-tg avatar Dec 20 '23 12:12 abhishek-tg

It depends on what you want to achieve. Handling multiple user inputs in parallel will not be easy, especially if you also want realtime transcription. First, you'd need to change RealtimeSTT, because its processing is not designed for multiple incoming audio chunk feeds; you would need to create worker threads for every feed. While a user talks, the server has to run voice activity detection and transcription, which needs VRAM and puts load on the GPU. So you'd either need to load balance VAD and transcription somehow, or you'd need large amounts of VRAM and GPU power on the server to handle that.
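As a rough illustration of "one worker per feed, distinguished by a unique ID", a sketch could look like the following. Everything here is hypothetical: `Session`, `SessionRegistry`, and `fake_transcribe` are illustrative names, and `fake_transcribe` is only a placeholder for VAD plus transcription, not a RealtimeSTT call. The sketch shows the bookkeeping only; it does nothing about the VRAM and GPU load described above.

```python
import queue
import threading
import uuid


def fake_transcribe(audio: bytes) -> str:
    """Placeholder for VAD + transcription (assumption, not a real model call)."""
    return f"<{len(audio)} bytes of audio>"


class Session:
    """One audio feed: its own ID, chunk queue, and worker thread."""

    def __init__(self, on_text):
        self.id = uuid.uuid4().hex
        self.chunks = queue.Queue()
        self._on_text = on_text
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            chunk = self.chunks.get()
            if chunk is None:  # sentinel: the client disconnected
                break
            self._on_text(self.id, fake_transcribe(chunk))

    def close(self):
        self.chunks.put(None)
        self._worker.join()


class SessionRegistry:
    """Maps session IDs to sessions so results can be attributed to a user."""

    def __init__(self, on_text):
        self._sessions = {}
        self._lock = threading.Lock()
        self._on_text = on_text

    def open(self) -> str:
        session = Session(self._on_text)
        with self._lock:
            self._sessions[session.id] = session
        return session.id

    def feed(self, session_id: str, chunk: bytes) -> None:
        with self._lock:
            session = self._sessions[session_id]
        session.chunks.put(chunk)

    def close(self, session_id: str) -> None:
        with self._lock:
            session = self._sessions.pop(session_id)
        session.close()  # waits until queued chunks have been processed
```

A server would call `open()` when a client connects, `feed()` for each incoming chunk, and `close()` on disconnect; the `on_text` callback receives the session ID with every result, so output can be routed back to the right user. With a real model, the number of concurrent sessions would likely have to be capped, or transcription moved to a shared worker pool.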

KoljaB avatar Dec 20 '23 15:12 KoljaB