Transcription isn't triggered in the absence of frames
I'm using the websocket server with a remote TypeScript client. The client uses its own simplistic VAD filter to avoid sending frames with no voice. Because of this filter, final transcription isn't done as there is simply no frame to trigger it. Removing the client filter solves the issue, but it's less than ideal because I'd like to avoid continuously sending frames over the wire. Perhaps a timeout could be added to the recorder that triggers the time checks in case no frames have been received?
Maybe, yes. I'm unsure whether this adds much to RealtimeSTT. You could just send silence in this case.
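For reference, here is a minimal client-side sketch of the send-silence workaround. It is not part of RealtimeSTT: `SilenceTail`, `makeSilenceFrame`, and the `send` callback are hypothetical names, and it assumes 16-bit mono PCM frames of a fixed duration. How a frame is actually packaged for the websocket server is left to `send`. The idea is to keep forwarding voiced frames as before, but after voice stops, forward a bounded tail of all-zero frames so the server still receives input and can run its end-of-speech check.

```typescript
// Hypothetical helper; none of these names come from RealtimeSTT.
// Assumes 16-bit signed mono PCM. Adjust to whatever your server expects.

/** Build one frame of digital silence (all-zero samples). */
function makeSilenceFrame(sampleRate: number, frameMs: number): Int16Array {
  const samples = Math.round((sampleRate * frameMs) / 1000);
  return new Int16Array(samples); // typed arrays are zero-initialized
}

/**
 * Sits behind the client's VAD filter. Voiced frames pass through unchanged.
 * After voice stops, up to `tailMs` of silence is sent, then nothing.
 */
class SilenceTail {
  private remainingMs = 0;
  private readonly send: (frame: Int16Array) => void;
  private readonly sampleRate: number;
  private readonly frameMs: number;
  private readonly tailMs: number;

  constructor(
    send: (frame: Int16Array) => void, // assumed: wraps your websocket send
    sampleRate: number,
    frameMs: number,
    tailMs: number,
  ) {
    this.send = send;
    this.sampleRate = sampleRate;
    this.frameMs = frameMs;
    this.tailMs = tailMs;
  }

  /** Call once per captured frame with the VAD decision for that frame. */
  push(frame: Int16Array, isVoiced: boolean): void {
    if (isVoiced) {
      this.remainingMs = this.tailMs; // re-arm the tail on every voiced frame
      this.send(frame);
    } else if (this.remainingMs > 0) {
      this.remainingMs -= this.frameMs;
      this.send(makeSilenceFrame(this.sampleRate, this.frameMs));
    }
    // Otherwise the tail is spent: drop the frame and send nothing.
  }
}
```

`tailMs` would need to be somewhat longer than the recorder's end-of-speech silence threshold, otherwise the tail ends before the server decides the utterance is over. This keeps the bandwidth saving for long pauses while still giving the recorder frames to react to.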
Sure, but this is somewhat hacky and unexpected. I spent quite some time figuring out why transcription isn't triggering on silence only to realize that I'm too silent. I think this at least warrants a mention in the readme.