Fixing TavusTransport with some TTS services.
Codecov Report
:x: Patch coverage is 0% with 46 lines in your changes missing coverage. Please review.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/pipecat/services/tavus/video.py | 0.00% | 26 Missing :warning: |
| src/pipecat/transports/services/tavus.py | 0.00% | 20 Missing :warning: |
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/pipecat/transports/services/tavus.py | 0.00% <0.00%> (ø) | |
| src/pipecat/services/tavus/video.py | 0.00% <0.00%> (ø) | |
@filipi87 Can you describe the issue we are fixing?
Hi @aconchillo,
The issue was that we used the `TTSStartedFrame` to create the inference ID that we send to Tavus, which we call `_current_idx_str` in both `TavusTransport` and `TavusVideoService`.
The audio frames, `TTSStartedFrame`, and `TTSStoppedFrame` were handled in different queues, so `_current_idx_str` was sometimes updated before Tavus had processed all the audio. As a result, only part of the audio was spoken, typically the beginning of each sentence.
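To illustrate the ordering problem described above, here is a minimal, self-contained sketch. It is not the Pipecat or Tavus code: `Frame`, `make_frames`, `send_with_split_queues`, `send_with_single_queue`, and the `inference-N` ID format are all hypothetical names made up for this example. It only shows how an ID updated from a start frame can get ahead of audio that is still waiting in a separate queue, and how handling everything in one ordered stream avoids the mismatch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Frame:
    kind: str       # "start", "audio", or "stop" (hypothetical frame kinds)
    utterance: int  # index of the utterance this frame belongs to


def make_frames(utterances: int, chunks: int) -> list[Frame]:
    """Build start / audio... / stop sequences for several utterances."""
    frames = []
    for u in range(utterances):
        frames.append(Frame("start", u))
        frames.extend(Frame("audio", u) for _ in range(chunks))
        frames.append(Frame("stop", u))
    return frames


def send_with_split_queues(frames: list[Frame]) -> list[tuple[str, int]]:
    """Start frames update the ID immediately; audio is drained later."""
    current_id = None
    audio_queue: deque[Frame] = deque()
    for frame in frames:
        if frame.kind == "start":
            current_id = f"inference-{frame.utterance}"
        elif frame.kind == "audio":
            audio_queue.append(frame)
    # The audio sender runs late and tags every chunk with the latest ID.
    return [(current_id, frame.utterance) for frame in audio_queue]


def send_with_single_queue(frames: list[Frame]) -> list[tuple[str, int]]:
    """All frames are handled in order, so the ID matches the audio."""
    current_id = None
    sent = []
    for frame in frames:
        if frame.kind == "start":
            current_id = f"inference-{frame.utterance}"
        elif frame.kind == "audio":
            sent.append((current_id, frame.utterance))
    return sent


def mismatches(sent: list[tuple[str, int]]) -> int:
    """Count audio chunks tagged with another utterance's ID."""
    return sum(1 for tag, u in sent if tag != f"inference-{u}")


if __name__ == "__main__":
    frames = make_frames(utterances=2, chunks=3)
    print("split queues:", mismatches(send_with_split_queues(frames)))
    print("single queue:", mismatches(send_with_single_queue(frames)))
```

In the split-queue variant, the audio of the first utterance ends up tagged with the second utterance's ID, which is the kind of mismatch that could explain audio being dropped partway through a sentence.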
Another issue involved how we calculated the wait time, which sometimes caused the replica to speak the first utterance but then remain muted for an extended period.
Both issues are easily reproducible when using `DeepgramTTS` or `OpenAITTS`.
OK! Thank you!