Fakhir Ali
listen_stream vs. listen: if only one of them is implemented, that one should be used.
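A minimal sketch of that dispatch, assuming a `BaseEar`-style class with a blocking `listen` and a generator-based `listen_stream` (names modeled on the library, not its exact API): detect whether the subclass overrides the streaming variant and prefer it.

```python
class BaseEar:
    """Hypothetical base class sketch; method names are assumptions."""

    def listen(self):
        # blocking implementation returning a full transcription
        raise NotImplementedError

    def listen_stream(self):
        # streaming implementation yielding partial transcriptions
        raise NotImplementedError

    def transcribe(self):
        # Use listen_stream if (and only if) the subclass overrides it,
        # otherwise fall back to the blocking listen.
        if type(self).listen_stream is not BaseEar.listen_stream:
            return "".join(self.listen_stream())
        return self.listen()


class StreamingEar(BaseEar):
    def listen_stream(self):
        yield from ["hello ", "world"]


class BlockingEar(BaseEar):
    def listen(self):
        return "hello world"
```

Checking `type(self).listen_stream is not BaseEar.listen_stream` means callers never have to know which variant a given ear implements.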
https://github.com/Finity-Alpha/OpenVoiceChat/blob/764a5bf57b524cfbd2eb84a1197126013420d405/openvoicechat/tts/base.py#L129 Here the LLM queue may hold multiple tokens, but the processing (sentence splitting, etc.) is done for every single token. Ideally it is done once for all of the tokens that...
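One way to do this, sketched against Python's standard `queue.Queue` (the helper name is hypothetical): block for the first token, then drain everything already waiting with `get_nowait`, so the expensive sentence-split step runs once per batch rather than once per token.

```python
import queue


def drain_tokens(q):
    """Collect every token currently in the queue in one pass.

    Hypothetical helper illustrating the suggestion: accumulate all
    available LLM tokens before running sentence splitting once.
    """
    tokens = [q.get()]  # block until at least one token is available
    while True:
        try:
            tokens.append(q.get_nowait())  # grab the rest without blocking
        except queue.Empty:
            break
    return "".join(tokens)


q = queue.Queue()
for t in ["Hello", " world", ". How", " are you?"]:
    q.put(t)

text = drain_tokens(q)  # one string; split into sentences once, not four times
```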
The ElevenLabs streaming latency is hardcoded to 4; there should be a parameter to change it. https://github.com/Finity-Alpha/OpenVoiceChat/blob/513b1d014876bb3e2909b3fd1044c352b2729760/openvoicechat/tts/tts_elevenlabs.py#L28
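A sketch of exposing it as a constructor argument; the class name and URL shape are assumptions, but `optimize_streaming_latency` (0-4) is the query parameter ElevenLabs' streaming endpoint documents.

```python
from urllib.parse import urlencode


class MouthElevenLabs:
    """Hedged sketch: attribute and class names are illustrative,
    not the library's exact API."""

    BASE = "https://api.elevenlabs.io/v1/text-to-speech"

    def __init__(self, voice_id, optimize_streaming_latency=4):
        # previously hardcoded; now caller-configurable (0 = quality, 4 = speed)
        self.voice_id = voice_id
        self.optimize_streaming_latency = optimize_streaming_latency

    def stream_url(self):
        params = urlencode(
            {"optimize_streaming_latency": self.optimize_streaming_latency}
        )
        return f"{self.BASE}/{self.voice_id}/stream?{params}"


mouth = MouthElevenLabs("voice123", optimize_streaming_latency=2)
```

Defaulting to the current value of 4 keeps existing behavior unchanged.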
This would further reduce perceived latency.
Send the LLM request before the silence is completely detected. For example, if the silence threshold is 2 s, send an LLM request with all of the available transcription after 1 s...
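The idea can be sketched as a small speculative-dispatch policy (all names and thresholds here are illustrative): fire a provisional LLM request at half the silence threshold, commit to its response once the full threshold elapses, and discard it if speech resumes first.

```python
SILENCE_THRESHOLD = 2.0            # seconds of silence that end a turn
SPECULATIVE_AT = SILENCE_THRESHOLD / 2


def decide(silence_so_far, request_in_flight):
    """Return the action to take given how long the user has been silent."""
    if silence_so_far >= SILENCE_THRESHOLD:
        return "commit"            # the speculative response becomes the reply
    if silence_so_far >= SPECULATIVE_AT and not request_in_flight:
        return "send_speculative"  # start the LLM call early
    return "wait"


def on_speech_resumed(request_in_flight):
    # If the user starts talking again before the full threshold,
    # any in-flight speculative request must be thrown away.
    return "cancel" if request_in_flight else "noop"
```

The trade-off is extra (sometimes wasted) LLM calls in exchange for hiding most of the model's time-to-first-token behind the tail of the silence window.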
Repeated complex jumbled code.
https://www.twilio.com/docs/voice/media-streams/websocket-messages#send-a-mark-message This is how Twilio synchronizes the audio pipeline.
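For reference, a mark message is just a small JSON payload on the media-stream websocket (shape per the Twilio docs linked above); Twilio echoes the mark back once all audio queued before it has finished playing, which is what makes synchronization possible. A minimal builder:

```python
import json


def mark_message(stream_sid, name):
    """Build a Twilio Media Streams 'mark' message.

    Twilio sends a matching 'mark' event back once every media message
    queued before this one has been played to the caller.
    """
    return json.dumps({
        "event": "mark",
        "streamSid": stream_sid,
        "mark": {"name": name},
    })


# hypothetical usage: send after the last audio chunk of a TTS response
msg = mark_message("MZXXXX", "end_of_response")
```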
Where should the buffering happen? When it is on-device, buffering happens in BaseMouth. When it is on the web, it should happen at the client. When it is on a call...
There should be an audio handler that handles both audio input and output; having a separate listener and player doesn't make sense. Also, listening and listening-for-interruption should be states instead of...
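A sketch of that unification, assuming hypothetical names (`AudioHandler`, `AudioState`): one object owns input and output, and listening / speaking / listening-for-interruption become an explicit state machine instead of separate objects.

```python
from enum import Enum, auto


class AudioState(Enum):
    IDLE = auto()
    LISTENING = auto()
    SPEAKING = auto()                    # playing TTS, mic ignored
    LISTENING_FOR_INTERRUPTION = auto()  # playing TTS while watching the mic


class AudioHandler:
    """Hedged sketch: a single handler for input and output;
    names are assumptions, not the library's API."""

    def __init__(self):
        self.state = AudioState.IDLE

    def start_listening(self):
        self.state = AudioState.LISTENING

    def start_speaking(self, watch_for_interruption=True):
        self.state = (AudioState.LISTENING_FOR_INTERRUPTION
                      if watch_for_interruption else AudioState.SPEAKING)

    def on_interruption(self):
        # a barge-in flips the handler straight back to listening
        if self.state is AudioState.LISTENING_FOR_INTERRUPTION:
            self.state = AudioState.LISTENING


h = AudioHandler()
h.start_speaking()   # -> LISTENING_FOR_INTERRUPTION
h.on_interruption()  # -> LISTENING
```

Making the modes explicit states also gives interruption handling a single place to live, rather than being split across a listener and a player.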