TTSSpeakFrame can be cancelled by a subsequent EndFrame
pipecat version
0.0.67
Python version
3.12
Operating System
macOS 14.6.1
Issue description
If we have a transport.input() in our pipeline before the TTS service, and we queue both a TTSSpeakFrame and an EndFrame, the transport closes before the TTS audio has had a chance to play.
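For reference, this is roughly the frame sequence that triggers the problem. The sketch below is loosely based on the idle handler in examples/foundational/17-detect-user-idle.py; the callback signature, the retry threshold, and names such as `task` and `user_idle` are illustrative rather than copied from the example:

```python
from pipecat.frames.frames import EndFrame, TTSSpeakFrame

async def handle_user_idle(user_idle, retry_count: int) -> bool:
    """Idle callback: after the final retry, say goodbye and end the pipeline."""
    if retry_count >= 3:
        # The goodbye travels downstream through the TTS service and the
        # output transport as generated audio...
        await user_idle.push_frame(
            TTSSpeakFrame("It seems like you're busy right now. Have a nice day!")
        )
        # ...but the EndFrame queued on the task reaches the transport and
        # closes the connection before that audio has finished playing.
        await task.queue_frame(EndFrame())  # `task` is the PipelineTask running the pipeline
        return False  # stop further idle retries
    return True
```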
Reproduction steps
The issue can be replicated by running examples/foundational/17-detect-user-idle.py. From the logs, we can see that the SmallWebRTC transport processes the EndFrame and closes the connection before the audio is played through transport.output():
2025-05-09 20:35:55.127 | DEBUG | pipecat.services.deepgram.tts:run_tts:53 - DeepgramTTSService#0: Generating TTS [It seems like you're busy right now. Have a nice day!]
2025-05-09 20:35:55.143 | DEBUG | pipecat.services.ai_service:process_frame:97 - processing end frame
2025-05-09 20:35:55.143 | DEBUG | pipecat.services.deepgram.stt:_disconnect:167 - Disconnecting from Deepgram
2025-05-09 20:35:55.143 | INFO | pipecat.transports.network.small_webrtc:disconnect:313 - Disconnecting to Small WebRTC
2025-05-09 20:35:55.144 | DEBUG | pipecat.transports.network.webrtc_connection:on_iceconnectionstatechange:187 - ICE connection state is closed, connection is closed
2025-05-09 20:35:55.144 | DEBUG | pipecat.transports.network.webrtc_connection:_handle_new_connection_state:313 - Connection state changed to: closed
2025-05-09 20:35:55.144 | INFO | run:handle_disconnected:121 - Discarding peer connection for pc_id: SmallWebRTCConnection#0
2025-05-09 20:35:55.144 | DEBUG | pipecat.transports.network.small_webrtc:on_closed:176 - Client connection closed.
2025-05-09 20:35:55.144 | INFO | 17-detect-user-idle:on_client_closed:126 - Client closed connection
2025-05-09 20:35:55.145 | DEBUG | pipecat.pipeline.task:cancel:288 - Canceling pipeline task PipelineTask#0
2025-05-09 20:35:55.145 | INFO | 17-detect-user-idle:on_client_disconnected:122 - Client disconnected
Expected behavior
The audio generated by TTS should finish playing before the transport closes.
Actual behavior
The transport closes before the TTS has finished playing the audio.
Logs
(pipecat) (base) abhishek$ python examples/foundational/17-detect-user-idle.py
2025-05-09 20:35:12.658 | INFO | pipecat:<module>:13 - ᓚᘏᗢ Pipecat 0.0.68.dev6 ᓚᘏᗢ
Looking for dist directory at: /Users/abhishekkumar/miniconda3/envs/pipecat/lib/python3.13/site-packages/pipecat_ai_small_webrtc_prebuilt/client/dist
2025-05-09 20:35:14.396 | INFO | run:main:191 - Successfully loaded bot from /Users/abhishekkumar/Projects/pipecat/examples/foundational/17-detect-user-idle.py
2025-05-09 20:35:14.396 | INFO | run:main:194 - Detected WebRTC-compatible bot, starting web server...
INFO: Started server process [45726]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:7860 (Press CTRL+C to quit)
INFO: ::1:53649 - "GET /client/ HTTP/1.1" 304 Not Modified
2025-05-09 20:35:20.177 | DEBUG | pipecat.transports.network.webrtc_connection:_initialize:136 - Initializing new peer connection
2025-05-09 20:35:20.193 | DEBUG | pipecat.transports.network.webrtc_connection:_create_answer:214 - Creating answer
2025-05-09 20:35:20.193 | DEBUG | pipecat.transports.network.webrtc_connection:on_track:197 - Track audio received
2025-05-09 20:35:20.193 | DEBUG | pipecat.transports.network.webrtc_connection:on_track:197 - Track video received
2025-05-09 20:35:20.194 | DEBUG | pipecat.transports.network.webrtc_connection:on_icegatheringstatechange:193 - ICE gathering state is gathering
2025-05-09 20:35:20.213 | DEBUG | pipecat.transports.network.webrtc_connection:on_icegatheringstatechange:193 - ICE gathering state is complete
2025-05-09 20:35:20.213 | DEBUG | pipecat.transports.network.webrtc_connection:_create_answer:217 - Setting the answer after the local description is created
INFO: ::1:53649 - "POST /api/offer HTTP/1.1" 200 OK
2025-05-09 20:35:20.213 | INFO | 17-detect-user-idle:run_bot:31 - Starting bot
2025-05-09 20:35:20.213 | DEBUG | pipecat.audio.vad.silero:__init__:111 - Loading Silero VAD model...
2025-05-09 20:35:20.297 | DEBUG | pipecat.audio.vad.silero:__init__:133 - Loaded Silero VAD
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking PipelineSource#0 -> SmallWebRTCInputTransport#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking SmallWebRTCInputTransport#0 -> DeepgramSTTService#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking DeepgramSTTService#0 -> UserIdleProcessor#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking UserIdleProcessor#0 -> OpenAIUserContextAggregator#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking OpenAIUserContextAggregator#0 -> OpenAILLMService#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking OpenAILLMService#0 -> DeepgramTTSService#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking DeepgramTTSService#0 -> SmallWebRTCOutputTransport#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking SmallWebRTCOutputTransport#0 -> OpenAIAssistantContextAggregator#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking OpenAIAssistantContextAggregator#0 -> PipelineSink#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking PipelineTaskSource#0 -> Pipeline#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.processors.frame_processor:link:178 - Linking Pipeline#0 -> PipelineTaskSink#0
2025-05-09 20:35:20.333 | DEBUG | pipecat.pipeline.runner:run:38 - Runner PipelineRunner#0 started running PipelineTask#0
2025-05-09 20:35:20.334 | DEBUG | pipecat.services.deepgram.stt:_connect:143 - Connecting to Deepgram
2025-05-09 20:35:20.350 | DEBUG | pipecat.transports.network.webrtc_connection:on_iceconnectionstatechange:187 - ICE connection state is checking, connection is connecting
2025-05-09 20:35:20.350 | DEBUG | pipecat.transports.network.webrtc_connection:_handle_new_connection_state:313 - Connection state changed to: connecting
2025-05-09 20:35:20.361 | DEBUG | pipecat.transports.network.webrtc_connection:on_iceconnectionstatechange:187 - ICE connection state is completed, connection is connecting
2025-05-09 20:35:20.365 | DEBUG | pipecat.transports.network.webrtc_connection:_handle_signalling_message:387 - Signalling message received: {'type': 'trackStatus', 'receiver_index': 0, 'enabled': True}
2025-05-09 20:35:20.365 | DEBUG | pipecat.transports.network.webrtc_connection:_handle_signalling_message:387 - Signalling message received: {'type': 'trackStatus', 'receiver_index': 1, 'enabled': False}
2025-05-09 20:35:20.365 | DEBUG | pipecat.transports.network.small_webrtc:push_app_message:453 - Received app message inside SmallWebRTCInputTransport {'label': 'rtvi-ai', 'type': 'client-ready', 'data': {}, 'id': '9135a26d'}
2025-05-09 20:35:20.365 | ERROR | pipecat.processors.frame_processor:_check_ready:330 - UserIdleProcessor#0 not properly initialized, missing super().process_frame(frame, direction)?
2025-05-09 20:35:21.285 | INFO | pipecat.transports.network.small_webrtc:connect:308 - Connecting to Small WebRTC
2025-05-09 20:35:21.285 | DEBUG | pipecat.transports.network.webrtc_connection:discard_old_frames:75 - Discarding old frames
2025-05-09 20:35:21.286 | INFO | pipecat.audio.vad.vad_analyzer:set_params:74 - Setting VAD params to: confidence=0.7 start_secs=0.2 stop_secs=0.8 min_volume=0.6
2025-05-09 20:35:21.286 | DEBUG | pipecat.pipeline.task:_process_push_queue:449 - in running processing frame
2025-05-09 20:35:21.286 | DEBUG | pipecat.transports.network.small_webrtc:on_connected:166 - Peer connection established.
2025-05-09 20:35:21.286 | DEBUG | pipecat.transports.network.webrtc_connection:replace_audio_track:267 - Replacing audio track audio
2025-05-09 20:35:21.291 | INFO | 17-detect-user-idle:on_client_connected:115 - Client connected
2025-05-09 20:35:21.291 | DEBUG | pipecat.pipeline.task:_process_push_queue:449 - in running processing frame
2025-05-09 20:35:21.292 | DEBUG | pipecat.services.openai.base_llm:_stream_chat_completions:156 - OpenAILLMService#0: Generating chat [[{"role": "system", "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way."}, {"role": "system", "content": "Please introduce yourself to the user."}]]
2025-05-09 20:35:21.302 | INFO | run:offer:111 - Reusing existing connection for pc_id: SmallWebRTCConnection#0
2025-05-09 20:35:21.303 | DEBUG | pipecat.transports.network.webrtc_connection:renegotiate:236 - Renegotiating SmallWebRTCConnection#0
2025-05-09 20:35:21.307 | DEBUG | pipecat.transports.network.webrtc_connection:_create_answer:214 - Creating answer
2025-05-09 20:35:21.307 | DEBUG | pipecat.transports.network.webrtc_connection:_create_answer:217 - Setting the answer after the local description is created
INFO: ::1:53649 - "POST /api/offer HTTP/1.1" 200 OK
2025-05-09 20:35:22.216 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_ttfb_metrics:50 - OpenAILLMService#0 TTFB: 0.923882007598877
2025-05-09 20:35:22.255 | DEBUG | pipecat.services.deepgram.tts:run_tts:53 - DeepgramTTSService#0: Generating TTS [Hello.]
2025-05-09 20:35:22.862 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - OpenAILLMService#0 processing time: 1.5700469017028809
2025-05-09 20:35:23.544 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_ttfb_metrics:50 - DeepgramTTSService#0 TTFB: 1.2897131443023682
2025-05-09 20:35:23.545 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - DeepgramTTSService#0 processing time: 1.2899930477142334
2025-05-09 20:35:23.545 | DEBUG | pipecat.services.deepgram.tts:run_tts:53 - DeepgramTTSService#0: Generating TTS [ I am your friendly AI assistant, ready to answer questions, explain ideas, or help you solve problems.]
2025-05-09 20:35:23.561 | DEBUG | pipecat.transports.base_output:_bot_started_speaking:385 - Bot started speaking
2025-05-09 20:35:24.617 | DEBUG | pipecat.transports.base_output:_bot_stopped_speaking:400 - Bot stopped speaking
2025-05-09 20:35:26.713 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - DeepgramTTSService#0 processing time: 3.1680309772491455
2025-05-09 20:35:26.713 | DEBUG | pipecat.services.deepgram.tts:run_tts:53 - DeepgramTTSService#0: Generating TTS [ Just let me know what you need.]
2025-05-09 20:35:26.731 | DEBUG | pipecat.transports.base_output:_bot_started_speaking:385 - Bot started speaking
2025-05-09 20:35:28.259 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - DeepgramTTSService#0 processing time: 1.5458431243896484
2025-05-09 20:35:28.261 | DEBUG | pipecat.transports.base_input:_handle_user_interruption:214 - User started speaking
2025-05-09 20:35:28.261 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_ttfb_metrics:50 - DeepgramSTTService#0 TTFB: 0.0002357959747314453
2025-05-09 20:35:28.261 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - DeepgramSTTService#0 processing time: 0.0002942085266113281
2025-05-09 20:35:28.262 | DEBUG | pipecat.transports.base_output:_bot_stopped_speaking:400 - Bot stopped speaking
2025-05-09 20:35:29.060 | DEBUG | pipecat.transports.base_input:_handle_user_interruption:224 - User stopped speaking
2025-05-09 20:35:34.061 | DEBUG | pipecat.services.openai.base_llm:_stream_chat_completions:156 - OpenAILLMService#0: Generating chat [[{"role": "system", "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.", "name": "system"}, {"role": "system", "content": "Please introduce yourself to the user.", "name": "system"}, {"role": "assistant", "content": "Hello.", "name": "assistant"}, {"role": "system", "content": "The user has been quiet. Politely and briefly ask if they're still there.", "name": "system"}]]
2025-05-09 20:35:34.943 | DEBUG | pipecat.services.deepgram.tts:run_tts:53 - DeepgramTTSService#0: Generating TTS [Hi, just checking in.]
2025-05-09 20:35:34.987 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - OpenAILLMService#0 processing time: 0.9259238243103027
2025-05-09 20:35:36.522 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - DeepgramTTSService#0 processing time: 1.578566312789917
2025-05-09 20:35:36.522 | DEBUG | pipecat.services.deepgram.tts:run_tts:53 - DeepgramTTSService#0: Generating TTS [ Are you still there?]
2025-05-09 20:35:36.540 | DEBUG | pipecat.transports.base_output:_bot_started_speaking:385 - Bot started speaking
2025-05-09 20:35:37.853 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - DeepgramTTSService#0 processing time: 1.3309423923492432
2025-05-09 20:35:39.797 | DEBUG | pipecat.transports.base_output:_bot_stopped_speaking:400 - Bot stopped speaking
2025-05-09 20:35:44.326 | DEBUG | pipecat.services.openai.base_llm:_stream_chat_completions:156 - OpenAILLMService#0: Generating chat [[{"role": "system", "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.", "name": "system"}, {"role": "system", "content": "Please introduce yourself to the user.", "name": "system"}, {"role": "assistant", "content": "Hello.", "name": "assistant"}, {"role": "system", "content": "The user has been quiet. Politely and briefly ask if they're still there.", "name": "system"}, {"role": "assistant", "content": "Hi, just checking in. Are you still there?", "name": "assistant"}, {"role": "system", "content": "The user is still inactive. Ask if they'd like to continue our conversation.", "name": "system"}]]
2025-05-09 20:35:45.042 | DEBUG | pipecat.services.deepgram.tts:run_tts:53 - DeepgramTTSService#0: Generating TTS [Would you like to continue our conversation?]
2025-05-09 20:35:45.192 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - OpenAILLMService#0 processing time: 0.865253210067749
2025-05-09 20:35:46.438 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - DeepgramTTSService#0 processing time: 1.3959412574768066
2025-05-09 20:35:46.438 | DEBUG | pipecat.services.deepgram.tts:run_tts:53 - DeepgramTTSService#0: Generating TTS [ I am here whenever you are ready.]
2025-05-09 20:35:46.455 | DEBUG | pipecat.transports.base_output:_bot_started_speaking:385 - Bot started speaking
2025-05-09 20:35:47.840 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - DeepgramTTSService#0 processing time: 1.402176856994629
2025-05-09 20:35:50.637 | DEBUG | pipecat.transports.base_output:_bot_stopped_speaking:400 - Bot stopped speaking
2025-05-09 20:35:55.127 | DEBUG | pipecat.services.deepgram.tts:run_tts:53 - DeepgramTTSService#0: Generating TTS [It seems like you're busy right now. Have a nice day!]
2025-05-09 20:35:55.143 | DEBUG | pipecat.services.ai_service:process_frame:97 - processing end frame
2025-05-09 20:35:55.143 | DEBUG | pipecat.services.deepgram.stt:_disconnect:167 - Disconnecting from Deepgram
2025-05-09 20:35:55.143 | INFO | pipecat.transports.network.small_webrtc:disconnect:313 - Disconnecting to Small WebRTC
2025-05-09 20:35:55.144 | DEBUG | pipecat.transports.network.webrtc_connection:on_iceconnectionstatechange:187 - ICE connection state is closed, connection is closed
2025-05-09 20:35:55.144 | DEBUG | pipecat.transports.network.webrtc_connection:_handle_new_connection_state:313 - Connection state changed to: closed
2025-05-09 20:35:55.144 | INFO | run:handle_disconnected:121 - Discarding peer connection for pc_id: SmallWebRTCConnection#0
2025-05-09 20:35:55.144 | DEBUG | pipecat.transports.network.small_webrtc:on_closed:176 - Client connection closed.
2025-05-09 20:35:55.144 | INFO | 17-detect-user-idle:on_client_closed:126 - Client closed connection
2025-05-09 20:35:55.145 | DEBUG | pipecat.pipeline.task:cancel:288 - Canceling pipeline task PipelineTask#0
2025-05-09 20:35:55.145 | INFO | 17-detect-user-idle:on_client_disconnected:122 - Client disconnected
tasks cancelled error:
ERROR:deepgram.clients.common.v1.abstract_async_websocket:tasks cancelled error:
2025-05-09 20:35:55.145 | DEBUG | pipecat.services.ai_service:process_frame:97 - processing end frame
2025-05-09 20:35:57.086 | DEBUG | pipecat.processors.metrics.frame_processor_metrics:stop_processing_metrics:65 - DeepgramTTSService#0 processing time: 1.9588260650634766
2025-05-09 20:35:57.086 | DEBUG | pipecat.services.ai_service:process_frame:97 - processing end frame
2025-05-09 20:35:57.086 | DEBUG | pipecat.services.tts_service:stop:136 - processing end frame and stopping
2025-05-09 20:35:57.087 | DEBUG | pipecat.transports.base_output:_bot_started_speaking:385 - Bot started speaking
2025-05-09 20:40:57.088 | WARNING | pipecat.pipeline.task:_idle_timeout_detected:580 - Idle pipeline detected, cancelling pipeline task...
2025-05-09 20:40:57.088 | DEBUG | pipecat.pipeline.task:cancel:288 - Canceling pipeline task PipelineTask#0
2025-05-09 20:40:57.089 | DEBUG | pipecat.services.deepgram.stt:_disconnect:167 - Disconnecting from Deepgram
2025-05-09 20:40:57.090 | DEBUG | pipecat.pipeline.runner:run:55 - Runner PipelineRunner#0 finished running PipelineTask#0
Having the same issue. Any solution?
Does this also happen on other transports, or just SmallWebRTC? I think this behavior only relates to how the TTS is implemented.
I can repro this. It's unrelated to the transport. We'll take a look.
Hello @aconchillo - I'm happy to help with this task and create a first draft. Do you have any ideas on how to approach it that would help me with the implementation? Thanks!