[Bug]: Streaming conversation gets stuck when using InputStreamingSynthesizer
Brief Description
I am trying to use ElevenLabsWSSynthesizer in a streaming conversation, but the conversation gets stuck. I understand from the code that it's marked as "experimental", but I expected the following test to pass.
Can anyone help me understand why this fails?
Any help greatly appreciated!
LLM
None
Transcription Services
None
Synthesis Services
None
Telephony Services
None
Conversation Type and Platform
This is a test environment for real-time streaming conversation, using a dummy transcriber and a dummy "input streaming" synthesizer.
Steps to Reproduce
- Apply the attached patch file to get the new test.
- Run:
  poetry run pytest -s -v tests/streaming/test_streaming_conversation.py::test_streaming_conversation_pipeline_streaming_synthesizer
- Observe the assertion error after the conversation times out.
Expected Behavior
The conversation should not time out and the test should pass.
Screenshots
The following is the output after the reproduction steps:
tests/streaming/test_streaming_conversation.py::test_streaming_conversation_pipeline_streaming_synthesizer FAILED
========================================================================================== FAILURES ==========================================================================================
_________________________________________________________________ test_streaming_conversation_pipeline_streaming_synthesizer _________________________________________________________________
mocker = <pytest_mock.plugin.MockerFixture object at 0x75efd19189e0>

    @pytest.mark.asyncio
    async def test_streaming_conversation_pipeline_streaming_synthesizer(
        mocker: MockerFixture,
    ):
        """
        Test conversation pipeline using InputStreamingSynthesizer
        """
        output_device = DummyOutputDevice(sampling_rate=48000, audio_encoding=AudioEncoding.LINEAR16)
        streaming_conversation = StreamingConversation(
            output_device=output_device,
            transcriber=TestAsyncTranscriber(
                TestTranscriberConfig(
                    sampling_rate=48000,
                    audio_encoding=AudioEncoding.LINEAR16,
                    chunk_size=480,
                )
            ),
            agent=EchoAgent(
                EchoAgentConfig(initial_message=BaseMessage(text="Hi there")),
            ),
            synthesizer=TestSynthesizer(TestSynthesizerConfig.from_output_device(output_device)),
        )
        await streaming_conversation.start()
        await streaming_conversation.initial_message_tracker.wait()
        initial_message_audio_chunk = await output_device.dummy_playback_queue.get()
        assert initial_message_audio_chunk.data == b"Hi there"
        await asyncio.sleep(2)
        streaming_conversation.receive_audio(b"who are you?")
        first_response_audio_chunk = await output_device.dummy_playback_queue.get()
>       assert first_response_audio_chunk.data == b"who are you?"
E       AssertionError: assert b'Are you there?' == b'who are you?'
E
E         At index 0 diff: b'A' != b'w'
E
E         Full diff:
E         - (b'who are you?')
E         + (b'Are you there?')

tests/streaming/test_streaming_conversation.py:647: AssertionError
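The assertion only looks at the first chunk taken from the playback queue after the audio is sent, so the output above does not show whether the expected b"who are you?" echo ever arrives behind the b'Are you there?' chunk. One way to check would be to collect everything that reaches the queue for a short window. This is a minimal sketch, assuming dummy_playback_queue behaves like an asyncio.Queue; collect_chunks is a hypothetical helper written for this report, not part of the library.

```python
import asyncio
from typing import Any, List


async def collect_chunks(queue: asyncio.Queue, window: float = 1.0) -> List[Any]:
    """Collect items from `queue` until no new item arrives for `window` seconds."""
    items: List[Any] = []
    while True:
        try:
            # Wait for the next item, but give up once the queue stays
            # empty for the whole window.
            items.append(await asyncio.wait_for(queue.get(), timeout=window))
        except asyncio.TimeoutError:
            return items
```

In the test, this could replace the single get() call after receive_audio, for example chunks = await collect_chunks(output_device.dummy_playback_queue, window=2.0), followed by printing [chunk.data for chunk in chunks] to see every chunk that was produced and in what order.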
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity. Thank you for your contributions.