
SmallWebRTC: unbounded queuing with disabled video track

Open martinxsliu opened this issue 1 month ago • 1 comments

pipecat version

0.0.94

Python version

3.12

Operating System

macOS

Issue description

Hi team, first of all, thanks for the great framework!

I am reviewing the Python SmallWebRTC transport and I think I may have found a potential memory leak that occurs when video input is in use but the local video track has been disabled.

Looking at the SmallWebRTCTrack class (link):

  1. SmallWebRTCTrack.recv is called in a loop when the input transport starts.
    1. SmallWebRTCTrack.recv enables its RTCRtpReceiver.
    2. The receiver's _handle_rtp_packet method (link) starts processing packets and places processed frames into a decoder queue.
    3. The receiver's decoder thread decodes the frames and places them into aiortc's RemoteStreamTrack's internal asyncio.Queue. This queue is unlimited in size.
    4. Calling RemoteStreamTrack.recv will pull an item from its queue.
  2. SmallWebRTCTrack.recv also starts the idle watcher.
    1. Each SmallWebRTCTrack.recv call updates the last recv timestamp.
    2. The idle watcher checks the last recv timestamp and purges the remote track's queue only if that timestamp is old. This means that as long as video is streaming over the network, the idle watcher should not fire.
  3. Finally, if the SmallWebRTCTrack is for video and is disabled, it exits early. Otherwise, it calls RemoteStreamTrack.recv.
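
To make the timing argument in the steps above concrete, here is a minimal model of the idle watcher in plain Python. It does not use pipecat or aiortc; `IdleWatcherModel`, `on_frame_decoded`, `on_recv`, `watcher_tick`, and the 2-second timeout are all hypothetical names and values chosen for illustration. It also assumes the timestamp update happens before the early exit, which is how I read the code.

```python
class IdleWatcherModel:
    """Toy model of the recv timestamp / idle watcher interaction (not pipecat code)."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout  # assumed value, for illustration only
        self.last_recv = 0.0
        self.backlog = 0  # stand-in for the remote track's queue size

    def on_frame_decoded(self) -> None:
        # The decoder thread enqueues one decoded frame.
        self.backlog += 1

    def on_recv(self, now: float, enabled: bool) -> None:
        # Assumption: every recv call refreshes the timestamp, even when disabled.
        self.last_recv = now
        if enabled and self.backlog:
            self.backlog -= 1  # only an enabled track pulls from the queue

    def watcher_tick(self, now: float) -> bool:
        # The watcher purges only when recv has not been called recently.
        if now - self.last_recv > self.idle_timeout:
            self.backlog = 0
            return True
        return False


def simulate(steps: int, enabled: bool, idle_timeout: float = 2.0):
    """Run one decode, one recv, and one watcher check per step."""
    model = IdleWatcherModel(idle_timeout)
    purges = 0
    for t in range(steps):
        now = float(t)
        model.on_frame_decoded()
        model.on_recv(now, enabled)
        if model.watcher_tick(now):
            purges += 1
    return model.backlog, purges
```

In this model, `simulate(10, enabled=False)` leaves all ten frames queued with zero purges, because the recv loop keeps the timestamp fresh while never consuming anything.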

Therefore, if the local video track is disabled, then the RTP receiver will continuously put decoded video frames into the remote track's unlimited queue, causing unbounded memory growth.
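
The growth can be illustrated without aiortc by using a plain unbounded `asyncio.Queue`. This is only a sketch of the pattern I believe is happening: `ToyRemoteTrack` and `ToyWrapperTrack` are hypothetical stand-ins for RemoteStreamTrack and SmallWebRTCTrack, and the producer loop stands in for the receiver's decoder thread.

```python
import asyncio


class ToyRemoteTrack:
    """Stand-in for a remote track backed by an unbounded asyncio.Queue."""

    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue()  # no maxsize, so it never blocks producers

    async def recv(self):
        return await self.queue.get()


class ToyWrapperTrack:
    """Hypothetical wrapper that returns early when the track is disabled."""

    def __init__(self, track: ToyRemoteTrack, enabled: bool = True) -> None:
        self._track = track
        self._enabled = enabled

    async def recv(self):
        if not self._enabled:
            await asyncio.sleep(0)  # yield to the event loop, consume nothing
            return None
        return await self._track.recv()


async def simulate(enabled: bool, frames: int = 100) -> int:
    remote = ToyRemoteTrack()
    wrapper = ToyWrapperTrack(remote, enabled)
    for i in range(frames):
        remote.queue.put_nowait(f"frame-{i}")  # decoder keeps producing
        await wrapper.recv()  # input transport keeps calling recv
    return remote.queue.qsize()  # frames left behind in the queue


def run_simulation(enabled: bool, frames: int = 100) -> int:
    return asyncio.run(simulate(enabled, frames))
```

With the wrapper enabled the queue ends empty; with it disabled, every produced frame stays in the queue, so the backlog grows linearly with the number of frames received.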

Perhaps an alternative implementation is to always call recv from the remote track and optionally discard the frame afterwards?

        # Always pull from the remote track's queue to prevent unbounded growth.
        frame = await self._track.recv()
        if not self._enabled:
            return None
        return frame
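
For reference, here is a self-contained sketch of that drain-and-discard idea, runnable without pipecat or aiortc. `DrainingTrack` and the `asyncio.Queue` source are hypothetical stand-ins, not the real classes, and this does not account for anything else the real recv method may do.

```python
import asyncio


class DrainingTrack:
    """Hypothetical wrapper: always consumes from the source, discards when disabled."""

    def __init__(self, source: asyncio.Queue, enabled: bool = True) -> None:
        self._source = source
        self._enabled = enabled

    async def recv(self):
        # Always pull one frame so the source queue cannot build a backlog.
        frame = await self._source.get()
        if not self._enabled:
            return None  # frame is dropped and can be garbage collected
        return frame


async def demo(enabled: bool, frames: int = 50):
    source: asyncio.Queue = asyncio.Queue()
    track = DrainingTrack(source, enabled)
    delivered = []
    for i in range(frames):
        source.put_nowait(i)  # stand-in for the decoder producing a frame
        delivered.append(await track.recv())
    return source.qsize(), delivered


def run_demo(enabled: bool, frames: int = 50):
    return asyncio.run(demo(enabled, frames))
```

One trade-off of this approach is that a disabled track now waits on the source instead of returning immediately, so the caller's loop is paced by incoming frames rather than spinning.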

Aside: is there a reason SmallWebRTCTrack.recv has this conditional check for video and not for audio?

Reproduction steps

  1. Establish a WebRTC connection using the SmallWebRTC transport with video input enabled.
  2. Disable the local video SmallWebRTCTrack.

Expected behavior

No unbounded memory growth.

Actual behavior

Unbounded memory growth.

Logs


martinxsliu avatar Nov 16 '25 19:11 martinxsliu

Thanks for the detailed report. 🙇

Tagging @filipi87 to take a look when he gets a moment.

markbackman avatar Nov 16 '25 22:11 markbackman