client-sdk-swift
Huge memory leaks produced by publishing/unpublishing camera tracks
Describe the bug
I encountered large memory leaks (around 50 MB) when publishing and then unpublishing camera tracks several times.
SDK Version
2.0.11
iOS Version
Tested on iOS 17 and iOS 15
Xcode Version
Version 15.4 (15F31d)
Swift version: Apple Swift version 5.10
Steps to Reproduce
Demo Application
I created a demo application (two simple screens) in which the microphone and camera are enabled/disabled by publishing/unpublishing new tracks. I used this approach rather than the canonical one (track unmute/mute) to highlight the memory leak problem. The problem can be replicated by turning the camera on/off multiple times. Every time the camera is enabled, memory usage increases significantly.
Demo Repository: https://github.com/VatamanuBogdan/livekit.playground/tree/master Commit: d1c2b043d375a44a63e0df373e531e9dc02cefcd
Steps:
- Log in to your room using the form on the Login Screen (you can hardcode the room URL and token by modifying the defaultServerURL and defaultToken constants in LiveKit-Playground/Utils/Constants.swift)
- After you log in successfully, the Conference Screen appears; enable and then disable the camera using the Camera toggle button
- On every enable -> disable repetition you will see memory increase drastically (see the Memory Usage screenshot under Screenshots)
The entire camera-enabling logic is placed inside ConferenceScreenViewModel.setCamera method.
Code
- Initialize a room with the following connect and room options
ConnectOptions
let connectOptions = ConnectOptions(autoSubscribe: false, protocolVersion: .v9)
RoomOptions
let cameraCaptureOptions = CameraCaptureOptions()
let audioCaptureOptions = AudioCaptureOptions()
let videoPublishOptions = VideoPublishOptions(encoding: VideoEncoding(maxBitrate: 2_000_000, maxFps: 30),
                                              simulcast: false,
                                              preferredCodec: VideoCodec.vp8)
let audioPublishOptions = AudioPublishOptions()
let roomOptions = RoomOptions(
    defaultCameraCaptureOptions: cameraCaptureOptions,
    defaultAudioCaptureOptions: audioCaptureOptions,
    defaultVideoPublishOptions: videoPublishOptions,
    defaultAudioPublishOptions: audioPublishOptions,
    adaptiveStream: false,
    dynacast: false,
    reportRemoteTrackStatistics: true
)
- Connect to room
// In SDK 2.x connect(...) does not return the room, so create the Room first.
let room = Room()
try await room.connect(url: <Server URL>,
                       token: <Token>,
                       connectOptions: connectOptions,
                       roomOptions: roomOptions)
- Create and publish a new camera track
let videoTrack = LocalVideoTrack.createCameraTrack()
try await room.localParticipant.publish(videoTrack: videoTrack)
- Unpublish camera track
let cameraPublication = room.localParticipant.firstCameraPublication as! LocalTrackPublication
try await room.localParticipant.unpublish(publication: cameraPublication)
- Repeat the publish and unpublish steps several times and you will see that memory increases with every repetition
An important detail: the room option changes in the following commit increased the amount of leaked memory from 3 - 4 MB to 50 MB: https://github.com/VatamanuBogdan/livekit.playground/commit/d16c9375ef187dda15852dc51602ded5e5b28915
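The publish -> unpublish repetition above can be scripted instead of toggled by hand. The following is only a rough sketch: residentMemoryBytes and stressCameraPublishing are hypothetical helper names that are not part of the SDK or the demo app, it assumes publish(videoTrack:) returns the publication as in SDK 2.x, and the resident size reported by Mach will not match the Xcode memory gauge exactly.

```swift
import Foundation
import LiveKit

/// Hypothetical helper: resident memory of the current process in bytes,
/// or nil if the Mach call fails (Apple platforms only).
func residentMemoryBytes() -> UInt64? {
    var info = mach_task_basic_info()
    var count = mach_msg_type_number_t(
        MemoryLayout<mach_task_basic_info>.size / MemoryLayout<natural_t>.size)
    let result = withUnsafeMutablePointer(to: &info) { pointer in
        pointer.withMemoryRebound(to: integer_t.self, capacity: Int(count)) {
            task_info(mach_task_self_, task_flavor_t(MACH_TASK_BASIC_INFO), $0, &count)
        }
    }
    return result == KERN_SUCCESS ? info.resident_size : nil
}

/// Hypothetical helper: publishes and unpublishes a fresh camera track
/// `cycles` times on an already connected room, logging memory after each cycle.
func stressCameraPublishing(room: Room, cycles: Int) async throws {
    guard cycles > 0 else { return }
    for cycle in 1...cycles {
        let videoTrack = LocalVideoTrack.createCameraTrack()
        // Assumption: publish(videoTrack:) returns the LocalTrackPublication.
        let publication = try await room.localParticipant.publish(videoTrack: videoTrack)
        try await room.localParticipant.unpublish(publication: publication)

        if let bytes = residentMemoryBytes() {
            let megabytes = Double(bytes) / 1_048_576
            print("cycle \(cycle): resident memory \(String(format: "%.1f", megabytes)) MB")
        } else {
            print("cycle \(cycle): resident memory unavailable")
        }
    }
}
```

Calling try await stressCameraPublishing(room: room, cycles: 10) after connecting should print one line per cycle; if the leak is present, the reported value keeps climbing instead of returning to its starting level.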
Expected behavior
After the camera track is unpublished, memory usage should decrease; instead it stays at the increased level.
Screenshots
Application Screens
| Login Screen | Conference Screen |
|---|---|
Memory Usage
Each increase in memory usage corresponds to one camera enable.
Thanks for the detailed report, will investigate this.
Hello, I do see huge memory leaks. I did a quick investigation but can't find the source at the moment.
Hello @hiroshihorie! I tried to investigate the source of the memory leaks. I did not find a fix, but I noticed something interesting.
The memory leaks are caused by LKRTCRtpTransceiver media channels that are not closed until the LKRTCPeerConnection itself is closed. Across repeated publish -> unpublish operations, the transceiver created for each published track is not removed from the LKRTCPeerConnection by the Transport.remove(track:) method.
You can check this using my demo application (the one linked in the first comment): switch the video on -> off multiple times, then look at the number of senders when you disconnect from the Room. You will see that the sender count equals the number of publish operations you performed.
This happens because Transport.remove(track:) only removes the track and switches the transceiver to the inactive state (the transceivers remain attached to the LKRTCPeerConnection and their media channels stay open).
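To make the accumulation concrete, here is a toy model in plain Swift. It does not use the WebRTC API: ToyTransceiver and ToyPeerConnection are hypothetical types that only mimic the behaviour described above, where removing a track leaves an inactive transceiver attached to the connection.

```swift
// Toy model only: these types are hypothetical and are not WebRTC classes.
final class ToyTransceiver {
    var track: String?
    var isActive = true

    init(track: String) {
        self.track = track
    }
}

final class ToyPeerConnection {
    private(set) var transceivers: [ToyTransceiver] = []

    /// Every added track gets a brand-new transceiver.
    func addTrack(_ track: String) -> ToyTransceiver {
        let transceiver = ToyTransceiver(track: track)
        transceivers.append(transceiver)
        return transceiver
    }

    /// Detaches the track and marks the transceiver inactive,
    /// but leaves it in the transceiver list.
    func removeTrack(_ transceiver: ToyTransceiver) {
        transceiver.track = nil
        transceiver.isActive = false
    }
}

let pc = ToyPeerConnection()
for cycle in 1...5 {
    let transceiver = pc.addTrack("camera-\(cycle)")
    pc.removeTrack(transceiver)
}

let inactiveCount = pc.transceivers.filter { !$0.isActive }.count
print("transceivers: \(pc.transceivers.count), inactive: \(inactiveCount)")
// prints "transceivers: 5, inactive: 5"
```

In the model, five publish -> unpublish cycles leave five inactive transceivers behind, which matches the sender count observed at disconnect.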
I tried to solve the problem by stopping the transceiver when the track is removed.
extension Transport {
    // ...
    func remove(track sender: LKRTCRtpSender) throws {
        // Find the transceiver that owns this sender.
        guard let transceiver = _pc.transceivers.first(where: { $0.sender == sender }) else {
            throw LiveKitError(.webRTC, message: "Invalid track")
        }
        // Detach the track and stop the transceiver so that its media channel is closed.
        transceiver.sender.track = nil
        transceiver.stopInternal()
        guard _pc.removeTrack(sender) else {
            throw LiveKitError(.webRTC, message: "Failed to remove track")
        }
    }
    // ...
}
With this change the transceiver is removed from the peer connection and there are no more memory leaks.
The problem is that a race condition appears and the application process is interrupted by a SIGABRT signal, because the transceiver is destroyed before its media channel is closed.
Judging by the official WebRTC implementation and by discussions I found online, keeping transceivers on the peer connection is intentional under Unified Plan.
See:
- discussion: https://groups.google.com/g/discuss-webrtc/c/WDsGuVucBjQ
- Implementation: https://webrtc.googlesource.com/src/+/refs/heads/main/pc/peer_connection.cc#1025
I tested the Android client and the same problem occurs there.
A quick workaround that could be applied right now would be to reuse the inactive transceivers.
This way the amount of leaked memory would decrease.
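The reuse idea can be sketched with a toy model in plain Swift. This is not WebRTC code and not a proposed patch: ModelTransceiver, ModelPeerConnection and transceiverCount are hypothetical names, and the model ignores renegotiation, codecs and media kinds, which a real implementation would have to handle.

```swift
// Toy model only: hypothetical types, not the WebRTC API.
final class ModelTransceiver {
    var track: String?
    var isActive: Bool { track != nil }
}

final class ModelPeerConnection {
    private(set) var transceivers: [ModelTransceiver] = []

    /// Attaches a track, optionally reusing the first inactive transceiver
    /// instead of creating a new one.
    func attach(_ track: String, reuseInactive: Bool) -> ModelTransceiver {
        if reuseInactive, let idle = transceivers.first(where: { !$0.isActive }) {
            idle.track = track
            return idle
        }
        let fresh = ModelTransceiver()
        fresh.track = track
        transceivers.append(fresh)
        return fresh
    }

    /// Detaches the track; the transceiver stays attached but inactive.
    func detach(_ transceiver: ModelTransceiver) {
        transceiver.track = nil
    }
}

/// Runs `cycles` attach -> detach rounds and returns how many transceivers remain.
func transceiverCount(afterCycles cycles: Int, reuseInactive: Bool) -> Int {
    let pc = ModelPeerConnection()
    for cycle in 0..<max(cycles, 0) {
        let transceiver = pc.attach("camera-\(cycle)", reuseInactive: reuseInactive)
        pc.detach(transceiver)
    }
    return pc.transceivers.count
}

print(transceiverCount(afterCycles: 10, reuseInactive: false)) // prints "10"
print(transceiverCount(afterCycles: 10, reuseInactive: true))  // prints "1"
```

With reuse, the transceiver count in the model stays at one regardless of how many times the camera is toggled; without it, the count grows by one per cycle.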
Hello @VatamanuBogdan, did you find any workaround for this? I noticed that even after we disconnect from the room, some of the memory is still retained.
Hi @Negi-Rohit! Unfortunately, I didn't find any workaround.
We are hitting this as well. It's especially problematic since we reuse LocalVideoTrack between multiple rooms and we can't use mute/unmute since it is not reliable when publishing to multiple rooms.
Yes, it still replicates every time. The issue on the Android side was recently closed by AI: https://github.com/livekit/client-sdk-android/issues/521 🤦
Thanks @VatamanuBogdan for your detailed investigation.
I'll prioritize fixing this issue.
We are also planning to release an SDK version with WebRTC M137 soon.
Awesome news! We thank you!
I can confirm this is still reproducible with m137, will take a deeper look 👍
This is a more or less minimal fix (acceptable under both Plan B and Unified Plan semantics): https://github.com/livekit/client-sdk-swift/pull/751/files
However, @VatamanuBogdan, I'm unable to reproduce your race condition. Can you provide a full stack trace (based on the PR)?
Thank you for addressing this! I see the implemented solution is specific to the iOS SDK. Can someone please reopen the Android issue: https://github.com/livekit/client-sdk-android/issues/521?
Reopening the issue, as we decided to restrict the workaround to video tracks (where it makes the difference) due to a race condition (?) in the audio stack. We had some production feedback about a crash similar to the one mentioned above ⏫. We also added a small WebRTC patch to prevent it internally.
Naive transceiver reuse would be a serious architectural change (on the WebRTC side), so it is probably not an option with multiple peer connections.
2.7.2 contains the revert commit.