Huge memory leaks produced by publishing/unpublishing camera tracks

Open VatamanuBogdan opened this issue 1 year ago • 3 comments

Describe the bug

I encountered huge memory leaks (around 50 MB) when I attempted to publish -> unpublish camera tracks several times.

SDK Version

2.0.11

iOS Version

Tested on iOS 17 and iOS 15

Xcode Version

Version 15.4 (15F31d)
Swift version: Apple Swift version 5.10

Steps to Reproduce

Demo Application

I created a demo application (two simple screens) where the microphone and camera are enabled/disabled by publishing/unpublishing new tracks. I used this approach and not the canonical one (with track unmute/mute) to highlight the memory leaks problem. The problem can be replicated by turning the camera on/off multiple times. On every camera switch the memory will increase significantly.

Demo Repository: https://github.com/VatamanuBogdan/livekit.playground/tree/master Commit: d1c2b043d375a44a63e0df373e531e9dc02cefcd

Steps:

  1. Log in to your room using the form on the Login Screen (you can hardcode the room URL and token by modifying the defaultServerURL and defaultToken constants in LiveKit-Playground/Utils/Constants.swift).
  2. After you log in successfully, the Conference Screen appears. Enable and then disable the camera using the Camera toggle button.
  3. On every enable -> disable repetition, memory usage increases drastically.

The entire camera-enabling logic is placed inside ConferenceScreenViewModel.setCamera method.

Code

  1. Initialize a room with the following connect and room options

ConnectOptions

let connectOptions = ConnectOptions(autoSubscribe: false, protocolVersion: .v9)

RoomOptions

let cameraCaptureOptions = CameraCaptureOptions()
let audioCaptureOptions = AudioCaptureOptions()
let videoPublishOptions = VideoPublishOptions(encoding: VideoEncoding(maxBitrate: 2_000_000, maxFps: 30),
                                              simulcast: false,
                                              preferredCodec: VideoCodec.vp8)
let audioPublishOptions = AudioPublishOptions()

let roomOptions = RoomOptions(
    defaultCameraCaptureOptions: cameraCaptureOptions,
    defaultAudioCaptureOptions: audioCaptureOptions,
    defaultVideoPublishOptions: videoPublishOptions,
    defaultAudioPublishOptions: audioPublishOptions,
    adaptiveStream: false,
    dynacast: false,
    reportRemoteTrackStatistics: true
)
  2. Connect to room
let room = Room()
// connect(url:token:connectOptions:roomOptions:) does not return a value
try await room.connect(url: <Server URL>,
                       token: <Token>,
                       connectOptions: connectOptions,
                       roomOptions: roomOptions)
  3. Create and publish a new camera track
let videoTrack = LocalVideoTrack.createCameraTrack()
try await room.localParticipant.publish(videoTrack: videoTrack)
  4. Unpublish camera track
// A conditional cast avoids a crash when no camera publication exists
guard let cameraPublication = room.localParticipant.firstCameraPublication as? LocalTrackPublication else { return }
try await room.localParticipant.unpublish(publication: cameraPublication)
  5. Repeat steps 3 and 4 several times and you will see that memory increases with every repetition

An important thing to notice is that the following room option changes increased the amount of leaked memory from 3-4 MB to 50 MB:
https://github.com/VatamanuBogdan/livekit.playground/commit/d16c9375ef187dda15852dc51602ded5e5b28915

Expected behavior

After the camera track is unpublished, memory usage should decrease, but it remains the same.

Screenshots

Application Screens

[Screenshots: Login Screen, Conference Screen]

Memory Usage

[Screenshot: memory usage graph]

Each memory increase corresponds to the camera being enabled.
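
For anyone who wants to reproduce the graph numerically rather than reading it off Xcode's memory gauge, here is a minimal sketch for sampling the process footprint after each camera toggle. It assumes an Apple platform (it imports Darwin) and relies on the Mach task_info call. The function name currentMemoryFootprint is illustrative only and is not part of the demo app or the LiveKit SDK.

```swift
import Darwin

/// Returns the physical memory footprint of the current process in bytes,
/// or nil if the kernel call fails.
func currentMemoryFootprint() -> UInt64? {
    var info = task_vm_info_data_t()
    // task_info expects the buffer size expressed in integer_t units.
    var count = mach_msg_type_number_t(
        MemoryLayout<task_vm_info_data_t>.size / MemoryLayout<integer_t>.size
    )
    let result = withUnsafeMutablePointer(to: &info) { pointer in
        pointer.withMemoryRebound(to: integer_t.self, capacity: Int(count)) { rebound in
            task_info(mach_task_self_, task_flavor_t(TASK_VM_INFO), rebound, &count)
        }
    }
    guard result == KERN_SUCCESS else { return nil }
    return UInt64(info.phys_footprint)
}
```

Logging this value before publishing and again after unpublishing gives a per-cycle delta that can be compared across SDK versions.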

VatamanuBogdan avatar Jul 02 '24 14:07 VatamanuBogdan

Thanks for the detailed report, will investigate this.

hiroshihorie avatar Jul 05 '24 10:07 hiroshihorie

Hello, I do see huge memory leaks. I did a quick investigation, but I can't find their source at the moment.

hiroshihorie avatar Aug 29 '24 07:08 hiroshihorie

Hello @hiroshihorie! I tried to investigate the source of the memory leaks. I did not find a fix, but I noticed something interesting.

The memory leaks are caused by the LKRTCRtpTransceiver media channels, which are not closed until the LKRTCPeerConnection itself is closed. Across several publish -> unpublish operations, the transceivers created for each published track are not removed from the LKRTCPeerConnection by the Transport.remove(track:) method.

You can check this using my demo application (the one linked in the first comment): switch the video on -> off multiple times, then watch the number of senders when you disconnect from the Room. You will see that the sender count is equal to the number of publish operations you performed. This happens because the Transport.remove(track:) method only removes the track and switches the transceiver to the inactive state (the transceivers remain attached to the LKRTCPeerConnection and their media channels are still open).
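
To make the accumulation concrete, here is a toy model of the behaviour described above. It is only a sketch: ToyTransceiver, ToyPeerConnection and transceiverCount(afterCycles:) are hypothetical stand-ins written for illustration, not LiveKit or WebRTC types, and no real media channels are involved.

```swift
// Toy model only: these types are hypothetical stand-ins, not LiveKit or WebRTC classes.
enum ToyDirection { case sendOnly, inactive }

final class ToyTransceiver {
    var direction: ToyDirection = .sendOnly
    var trackID: String?
    init(trackID: String) { self.trackID = trackID }
}

final class ToyPeerConnection {
    private(set) var transceivers: [ToyTransceiver] = []

    // Every publish allocates a fresh transceiver.
    func addTrack(_ trackID: String) -> ToyTransceiver {
        let transceiver = ToyTransceiver(trackID: trackID)
        transceivers.append(transceiver)
        return transceiver
    }

    // Mirrors the reported behaviour: the track is detached and the
    // transceiver goes inactive, but it stays attached to the connection.
    func removeTrack(_ transceiver: ToyTransceiver) {
        transceiver.trackID = nil
        transceiver.direction = .inactive
    }
}

// Runs publish -> unpublish `cycles` times and reports how many transceivers remain.
func transceiverCount(afterCycles cycles: Int) -> Int {
    let pc = ToyPeerConnection()
    for index in 0..<cycles {
        let transceiver = pc.addTrack("camera-\(index)")
        pc.removeTrack(transceiver)
    }
    return pc.transceivers.count
}
```

In this model the count grows by one per cycle, which matches the observation that the sender count equals the number of publish operations.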

I tried to solve the problem by stopping the transceiver when the track is removed.

extension Transport {
    ...

    func remove(track sender: LKRTCRtpSender) throws {
        // Find the transceiver that owns this sender
        guard let transceiver = _pc.transceivers.first(where: { $0.sender == sender }) else {
            throw LiveKitError(.webRTC, message: "Invalid track")
        }

        // Experimental change: detach the track and stop the transceiver
        transceiver.sender.track = nil
        transceiver.stopInternal()

        guard _pc.removeTrack(sender) else {
            throw LiveKitError(.webRTC, message: "Failed to remove track")
        }
    }
    ...
}

With this change, the transceiver is removed from the peer connection and the memory leaks disappear. The problem is that a race condition appears and the application process is terminated by a SIGABRT signal, because the transceiver is destroyed before its media channel is closed.

[Screenshot: SIGABRT crash]

Based on the official WebRTC implementation and on what I have read online, keeping transceivers attached to the peer connection is intentional under Unified Plan.

See:

  • discussion: https://groups.google.com/g/discuss-webrtc/c/WDsGuVucBjQ
  • Implementation: https://webrtc.googlesource.com/src/+/refs/heads/main/pc/peer_connection.cc#1025

I tested this, and the same problem also occurs on the Android client.

A quick workaround that could be applied right now would be to reuse the inactive transceivers.
This way the amount of leaked memory would decrease.
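
The reuse idea can be sketched with a toy model as well. This covers only the bookkeeping side (pick an inactive slot before allocating a new one). ToySlot, ReusingPeerConnection and slotCount(afterCycles:) are hypothetical names for illustration, not LiveKit or WebRTC APIs, and the sketch does not cover how a real transceiver would be reactivated or renegotiated.

```swift
// Toy model only: hypothetical types, not LiveKit or WebRTC classes.
final class ToySlot {
    var trackID: String?
    var isActive: Bool { trackID != nil }
}

final class ReusingPeerConnection {
    private(set) var slots: [ToySlot] = []

    // Prefer an inactive slot; allocate a new one only when none is free.
    func addTrack(_ trackID: String) -> ToySlot {
        let slot: ToySlot
        if let free = slots.first(where: { !$0.isActive }) {
            slot = free
        } else {
            slot = ToySlot()
            slots.append(slot)
        }
        slot.trackID = trackID
        return slot
    }

    // Detach the track but keep the slot around for later reuse.
    func removeTrack(_ slot: ToySlot) {
        slot.trackID = nil
    }
}

// Runs publish -> unpublish `cycles` times and reports how many slots were allocated.
func slotCount(afterCycles cycles: Int) -> Int {
    let pc = ReusingPeerConnection()
    for index in 0..<cycles {
        let slot = pc.addTrack("camera-\(index)")
        pc.removeTrack(slot)
    }
    return pc.slots.count
}
```

With reuse, repeated toggling of a single camera track keeps the slot count at one instead of growing with every cycle.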

VatamanuBogdan avatar Oct 08 '24 11:10 VatamanuBogdan

Hello @VatamanuBogdan, did you find any workaround for this? I noticed that even after we disconnect from the room, some of the memory is still retained.

Negi-Rohit avatar Jan 20 '25 09:01 Negi-Rohit

Hi @Negi-Rohit! Unfortunately, I didn't find any workaround.

VatamanuBogdan avatar Jan 20 '25 15:01 VatamanuBogdan

We are hitting this as well. It's especially problematic since we reuse LocalVideoTrack between multiple rooms and we can't use mute/unmute since it is not reliable when publishing to multiple rooms.

IsaiahJTurner avatar Jun 28 '25 04:06 IsaiahJTurner

Yes, it still replicates every time. The issue on the Android side was recently closed by AI: https://github.com/livekit/client-sdk-android/issues/521 🤦

adrian-niculescu avatar Jul 30 '25 09:07 adrian-niculescu

Thanks @VatamanuBogdan for your detailed investigation.

I'll prioritize fixing this issue.

We are also planning to release an SDK version with WebRTC M137 soon.

hiroshihorie avatar Jul 31 '25 12:07 hiroshihorie

Awesome news! We thank you!

adrian-niculescu avatar Jul 31 '25 12:07 adrian-niculescu

I can confirm this is still reproducible with m137, will take a deeper look 👍

pblazej avatar Aug 06 '25 11:08 pblazej

This is a more or less minimal fix (acceptable under both Plan B and Unified Plan semantics): https://github.com/livekit/client-sdk-swift/pull/751/files

However, @VatamanuBogdan I'm unable to reproduce your race condition - can you provide a full stack trace (based on the PR)?

pblazej avatar Aug 06 '25 13:08 pblazej

Thank you for addressing this! I see the implemented solution is specific to the iOS SDK. Can someone please reopen the Android issue: https://github.com/livekit/client-sdk-android/issues/521?

adrian-niculescu avatar Aug 07 '25 10:08 adrian-niculescu

Reopening the issue: we decided to restrict the workaround to video tracks (where it makes the difference) due to a race condition (?) in the audio stack. We received production feedback about a crash similar to the one mentioned above ⏫. We also added a small WebRTC patch to prevent it internally.

Doing naive reuse is a serious architectural change (on the WebRTC side), so it is probably not an option with multiple peer connections.

2.7.2 contains the revert commit.

pblazej avatar Sep 03 '25 06:09 pblazej