RealtimeSTT: Can I build a speech recognition server for 10 concurrent users?

I want to set up a server that processes the voices of 10 users simultaneously, with the speech recognition model running on the server. Can I implement a system that sends audio from 10 user PCs to the server PC, transcribes the streams in parallel, and sends each result back to the user PC it came from?

The requirement is that the server keeps track of which user each piece of audio came from and returns each transcription to the right user.

May 20 '25 09:05 sangheonEN

Yes, you can.

Jun 27 '25 14:06 AshGov07

Can you please advise me how to proceed?

Jun 30 '25 03:06 sangheonEN

You can create a separate AudioToTextRecorder instance for each client/user. This way you can handle 10 users or more; the main concern is the tradeoff in VRAM usage. As more users connect, the server's VRAM usage approaches its limit, which leads to latency issues. If you are okay with that, go ahead.
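A minimal sketch of the one-recorder-per-client idea, in case it helps. The `SessionRegistry` class, the `Recorder` interface, and the `max_sessions` cap are hypothetical names made up for illustration; they are not part of RealtimeSTT. The network layer (WebSocket, TCP, etc.) is left out: it would only need to call `connect`, `feed`, `transcribe`, and `disconnect` with a per-connection id.

```python
import threading
from typing import Callable, Dict, Protocol


class Recorder(Protocol):
    """Interface this sketch assumes a per-user recorder provides (hypothetical)."""

    def feed_audio(self, chunk: bytes) -> None: ...
    def text(self) -> str: ...
    def shutdown(self) -> None: ...


class SessionRegistry:
    """Keeps one recorder per client so audio and results are never mixed up."""

    def __init__(self, recorder_factory: Callable[[], Recorder], max_sessions: int = 10):
        self._factory = recorder_factory
        self._max_sessions = max_sessions  # crude guard against running out of VRAM
        self._sessions: Dict[str, Recorder] = {}
        self._lock = threading.Lock()

    def connect(self, client_id: str) -> None:
        """Create a dedicated recorder for a new client, unless the cap is reached."""
        with self._lock:
            if client_id in self._sessions:
                return
            if len(self._sessions) >= self._max_sessions:
                raise RuntimeError("too many concurrent sessions")
            # Note: the factory runs under the lock, which is simple but slow
            # if creating a recorder loads a model.
            self._sessions[client_id] = self._factory()

    def feed(self, client_id: str, chunk: bytes) -> None:
        """Route an audio chunk to the recorder that belongs to this client."""
        with self._lock:
            recorder = self._sessions[client_id]
        recorder.feed_audio(chunk)

    def transcribe(self, client_id: str) -> dict:
        """Return the transcription tagged with its source client id."""
        with self._lock:
            recorder = self._sessions[client_id]
        return {"client_id": client_id, "text": recorder.text()}

    def disconnect(self, client_id: str) -> None:
        """Release the client's recorder (and whatever memory it holds)."""
        with self._lock:
            recorder = self._sessions.pop(client_id, None)
        if recorder is not None:
            recorder.shutdown()
```

With RealtimeSTT, the factory would presumably be something like `lambda: AudioToTextRecorder(use_microphone=False)`, so that audio comes from `feed_audio` instead of a local microphone; please check the library's README for the exact parameters and expected audio format. Since every client gets its own recorder, memory use grows with the number of sessions, which is why the sketch refuses connections past `max_sessions`.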

Jun 30 '25 04:06 AshGov07