WhisperFusion
Indentation Bug in `trt_server.py`
In `trt_server.py`, I suspect that the highlighted lines need to be at the same indentation level as the `while` loop. Otherwise, in its current form, it makes no sense to me. Just shining some light on this.
Not really. We want to send only one response to the client: at some point we were sending all the responses we add to the `llm_queue` for all updates in the current segment from whisper-live, but then we decided to send only the one which corresponds to the transcription with `eos=True`.

That said, https://github.com/collabora/WhisperFusion/blob/main/whisper_live/trt_server.py#L340-L343 — this `if` could be at the same level as the outer `if`, and everything should be fine.
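The sender being discussed could be sketched roughly as follows. This is a hypothetical simplification, not the actual `trt_server.py` code: the function name, the `send` callback, and the `None` shutdown sentinel are all assumptions; only the `llm_queue` name and the `eos` flag come from the thread.

```python
import queue

def sender_loop(llm_queue, send):
    """Hypothetical sketch: read LLM outputs from llm_queue and forward
    only those corresponding to a finished transcription segment."""
    while True:
        item = llm_queue.get()
        if item is None:  # shutdown sentinel (an assumption, not in the repo)
            break
        # With this check at the top level of the while loop (the
        # placement discussed above), every queued item is inspected;
        # interim eos=False updates are simply skipped, never sent.
        if item["eos"]:
            send(item["output"])

q = queue.Queue()
sent = []
q.put({"output": "partial", "eos": False})
q.put({"output": "final", "eos": True})
q.put(None)
sender_loop(q, sent.append)
# sent == ["final"]
```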
Thanks for your reply! I understand the logic of only sending the responses with `eos`. But could there not be a backlog in the `llm_queue` such that it contains multiple sentences, where the first one ends with an EOS and the next one follows with its own EOS? In the current implementation, the first sentence would be lost.
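To make this concern concrete, here is a constructed scenario (not observed behavior): if two finished segments sit in the queue before the sender drains it, a drain that keeps only the latest `eos=True` item drops the first. The `drain_latest_final` helper is hypothetical, written only to mirror the behavior being questioned.

```python
import queue

def drain_latest_final(llm_queue):
    # Hypothetical drain mirroring the behavior in question: keep only
    # the most recent eos=True response currently in the queue.
    latest = None
    while not llm_queue.empty():
        item = llm_queue.get_nowait()
        if item["eos"]:
            latest = item
    return latest

# Constructed backlog: two finished segments queued before one drain.
q = queue.Queue()
q.put({"output": "first sentence", "eos": True})
q.put({"output": "second sentence", "eos": True})
# Only the second survives; the first is silently dropped.
assert drain_latest_final(q)["output"] == "second sentence"
```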
@DamianB-BitFlipper Not sure I understand what you mean by sentences; there are `llm_response`s, each of which could be multiple sentences or a single word.
> In the current implementation, the first sentence would be lost.
Can you please give an example if you have seen this?
I wouldn't expect to see this in most cases in practice, because the `llm_queue` would empty rather quickly. I am just postulating, from exploring the code and poking at it, that if the transcriber sends `[<first sentence, eos=True>, <second sentence, eos=True>]`, then, the way the code is written, the first sentence is lost.

I am aware that the transcriber does not put `eos=True` at the end of sentences, but rather at prolonged pauses of non-voice input; I am using sentences here purely as an example.
> I am just postulating, from exploring the code and poking at it, that the transcriber sends: `[<first sentence, eos=True>, <second sentence here, eos=True>]`, the way the code is written, the first sentence is lost.
@DamianB-BitFlipper Okay, so we should never reach this state for a short exchange conversation, i.e. we transcribe until `eos=True`, and at that time the `llm_queue` should be

`[{output1, eos=False}, ..., {outputn, eos=True}]`

We only care about `outputn` at this moment, because that is the most recent `llm_output` corresponding to the most updated transcription. Not sure why you would want `output1`.
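In the expected short-exchange case described here, the queue holds only interim updates for the current segment followed by a single final output, so keeping just the last `eos=True` entry loses nothing. A self-contained illustration (the `drain_latest_final` helper is hypothetical; `llm_queue`, `output1`/`outputn`, and `eos` follow the thread):

```python
import queue

def drain_latest_final(llm_queue):
    # Keep only the most recent eos=True response; interim eos=False
    # updates are superseded by later ones and safely discarded.
    latest = None
    while not llm_queue.empty():
        item = llm_queue.get_nowait()
        if item["eos"]:
            latest = item
    return latest

# Expected state for a short exchange: interim updates for the same
# segment, then a single final output.
q = queue.Queue()
q.put({"output": "output1", "eos": False})
q.put({"output": "output2", "eos": False})
q.put({"output": "outputn", "eos": True})
# Only outputn matters; discarding the interim updates loses nothing.
assert drain_latest_final(q)["output"] == "outputn"
```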