
H.264 codec negotiation between Firefox and livekit-server >= v1.5.1 leads to reconnection loop and failed publication

Open Philzen opened this issue 10 months ago • 3 comments

This took us a couple of days to pin down, but alas, here you go with the complete investigation results:

Description

As per the title: when Firefox joins a room on livekit-server v1.5.1+ with the server-side video codec restricted to H.264, it fails to publish any camera track. The client then goes into a series of reconnection attempts, repeatedly logging "cannot send signal request before connected, type: trickle" in the console.
We could reproduce this on https://meet.livekit.io/ using a token from our server, which rules out our frontend implementation as the cause.
We also recently migrated the client & server SDKs in our application from v1 to v2 and can confirm that this does not make any difference. The issue turns out to depend only on the livekit-server configuration and version.

A few weeks ago we added the following section to our livekit-server configuration:

room:
  enabled_codecs:
  - mime: video/h264 
  - mime: audio/opus

The intention was to narrow the available video codecs down to H.264, which is the required output format for our application anyway (we also hoped this would spare egress the transcoding effort). This should be fine since, last time we checked, all clients were able to publish using this codec.

The livekit-server config.yaml sample is rather unclear about what the defaults are here: https://github.com/livekit/livekit/blob/master/config-sample.yaml#L180-L182. From looking at it we assumed H.264 was not even enabled by default (after some testing we now understand that it is).

While testing a newer iteration of our application, we realized that egress recordings were audio-only. When joining the room without an audio track enabled, Firefox wouldn't connect to the room at all, failing with "WebRTC: ICE failed, add a TURN server and see about:webrtc for more details". This error still shows up in the current screenshots after a couple of failed reconnect attempts (and is rather misleading, as we do have TURN configured on a separate subdomain behind CaddyL4).
We then left no stone unturned and eventually reverted a change to the room initialization flow, which we had just refactored to resemble the meet.livekit.io example code, where cam & mic are enabled on RoomEvent.SignalConnected. After reverting this (cam & mic are now enabled only after room.connect()), joining rooms seemed fine again, but the backend still did not see any video publication.
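
For reference, the reverted flow (connect first, enable devices afterwards) can be sketched roughly as follows. This is a minimal sketch and not our application code: `RoomLike` and `joinThenPublish` are hypothetical names, and the interface is a structural stand-in so the snippet runs without the SDK. `connect(url, token)` and `localParticipant.enableCameraAndMicrophone()` mirror the methods of the same names in livekit-client.

```typescript
// Structural stand-in for the parts of livekit-client's Room used here
// (hypothetical interface, defined only so the sketch runs without the SDK).
interface RoomLike {
  connect(url: string, token: string): Promise<void>;
  localParticipant: {
    enableCameraAndMicrophone(): Promise<void>;
  };
}

// Connect first, publish afterwards: cam & mic are only enabled once
// connect() has resolved, not on RoomEvent.SignalConnected.
async function joinThenPublish(room: RoomLike, url: string, token: string): Promise<void> {
  await room.connect(url, token);
  await room.localParticipant.enableCameraAndMicrophone();
}
```

A livekit-client `Room` instance should be usable wherever `RoomLike` is expected, since only these two methods are called.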

After some more investigation we realized this is a Firefox-only issue that also depends on the livekit-server version:

LK egress Firefox Chromium
1.4.3 1.7.13 ¹ :heavy_check_mark: | :heavy_check_mark: :heavy_check_mark: | :heavy_check_mark:
~1.4.3~ ~1.9.0 ³~ ( :heavy_check_mark: | :x: ²) ( :heavy_check_mark: | :x: ² )
1.4.5 1.7.5 :heavy_check_mark: | :x: 5 :heavy_check_mark: | :x: 5
1.4.5 1.7.13 ¹ :heavy_check_mark: | :heavy_check_mark: 6 :heavy_check_mark: | :heavy_check_mark: 6
1.5.0 1.7.5 :heavy_check_mark: | :heavy_check_mark: :heavy_check_mark: | :heavy_check_mark:
1.5.0 1.7.13 ¹ :heavy_check_mark: | :heavy_check_mark: :heavy_check_mark: | :heavy_check_mark:
~1.5.0~ ~1.9.0 ³~ ( :heavy_check_mark: | :x: ² ) ( :heavy_check_mark: | :x: ² )
1.5.1 1.7.5 :x: | :x: 4 :heavy_check_mark: | :heavy_check_mark:
1.7.2 1.7.5 :x: | :x: 4 :heavy_check_mark: | :heavy_check_mark:
1.8.3 1.9.0 ³ :x: | :x: 4 :heavy_check_mark: | :heavy_check_mark:
  1. Egress service needs write access for user livekit-cli
  2. Egress >= 1.8 requires at least livekit-server 1.5.1
  3. Egress service needs write access for user 1001 (user in docker container)
  4. Mic-only join & recording (audio-only) works, but the camera publication problem causes a reconnection loop, after which the audio track cannot be opened either → egress therefore finds nothing to record (see screenshot)
  5. Egress generates a 0kB file and no accompanying .json
  6. Egress sometimes appears to be creating a 0kB file, but after recording ended flushes the content after a while and the recording seems OK

One interesting detail is that before version 1.5.1, enabled_codecs does not seem to have had any actual effect. Both Chrome and Firefox publish VP8 instead of H.264, with both livekit-server and egress accepting and processing it fine.
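
Which codec a browser actually sends can be read from the standard WebRTC statistics instead of being inferred from the server config. The following is a sketch under assumptions: `StatsEntry` and `outboundVideoCodecs` are hypothetical names, the input is the list of entries from an `RTCPeerConnection.getStats()` report, and the browser is assumed to expose `codec` entries in that report. How to reach the underlying peer connection from livekit-client is not covered here.

```typescript
// Minimal shape of the WebRTC stats entries used below (a subset of the
// fields found on "outbound-rtp" and "codec" stats objects).
interface StatsEntry {
  id: string;
  type: string;
  kind?: string;
  codecId?: string;
  mimeType?: string;
}

// Returns the MIME type (e.g. "video/VP8") of every outbound video stream
// whose codec entry is present in the report.
function outboundVideoCodecs(entries: StatsEntry[]): string[] {
  // Index codec entries by their stats id.
  const mimeById: Record<string, string> = {};
  for (const e of entries) {
    if (e.type === "codec" && e.mimeType) mimeById[e.id] = e.mimeType;
  }
  // Resolve each outbound video stream to its codec's MIME type.
  const result: string[] = [];
  for (const e of entries) {
    if (e.type !== "outbound-rtp" || e.kind !== "video" || !e.codecId) continue;
    const mime = mimeById[e.codecId];
    if (mime) result.push(mime);
  }
  return result;
}
```

The entries can be collected with `report.forEach((s) => entries.push(s))` on the `RTCStatsReport`. A result of `["video/VP8"]` would correspond to the pre-1.5.1 behaviour described above.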

Environment

Server

  • Version: >= 1.5.1
  • Environment: running in Docker 24.0.7 on Linux 5.4.0 (Ubuntu 20.04.6 LTS); also confirmed on another server that runs v1.8.3 in Kubernetes pods

Client

Confirmed with all these versions:

  • livekit-server-sdk 1.2.7 + livekit-client 1.15.13
  • livekit-server-sdk 2.9.7 + livekit-client 2.8.1
  • Firefox 134.0.2
  • Firefox 135.0
  • OS: 6.12.11-1-MANJARO

To Reproduce

Steps to reproduce the behavior:

  1. livekit-server >= 1.5.1 with room.enabled_codecs configured with mime: video/h264 as the only video codec
  2. join the room using Firefox and try to publish a video
  3. See error

Screenshots

On trying to publish video in Firefox:

Image

A similar trace could be produced on meet.livekit.io:

meet.livekit.io Firefox Console screenshots

Image Image Image Image Image

Below is a comparison of what the video publication in room.localParticipant.tracks shows on Chrome vs. Firefox for different server versions:

livekit-server 1.4.5

Firefox:

Image

Chrome:

Image

livekit-server 1.5.0

Firefox:

Image

Chrome:

Image

livekit-server 1.8.3

Firefox:

Image

Chrome:

Image

Philzen, Feb 05 '25 23:02

It looks like the codec fell back to VP8 because of an H.264 profile mismatch. Can you save the SDP from Firefox's about:webrtc tab?
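
To compare what each side negotiates, the H.264 entries can be pulled out of the saved SDP text. This is a minimal sketch, assuming the SDP has been copied out of about:webrtc as plain text; `H264Payload` and `h264Payloads` are made-up names, and the function only inspects `a=rtpmap` and `a=fmtp` lines without distinguishing between media sections.

```typescript
// One H.264 payload type found in an SDP blob.
interface H264Payload {
  payloadType: number;
  fmtp: string; // raw a=fmtp parameters, "" if the payload has none
}

// Lists every payload type mapped to H264 together with its fmtp parameters
// (profile-level-id, packetization-mode, ...).
function h264Payloads(sdp: string): H264Payload[] {
  const lines = sdp.split(/\r?\n/).map((l) => l.trim());
  const result: H264Payload[] = [];
  // First pass: find payload types whose rtpmap names H264.
  for (const line of lines) {
    const m = /^a=rtpmap:(\d+) H264\/90000/i.exec(line);
    if (m) result.push({ payloadType: Number(m[1]), fmtp: "" });
  }
  // Second pass: attach the matching a=fmtp line, if any.
  for (const p of result) {
    const prefix = "a=fmtp:" + p.payloadType + " ";
    for (const line of lines) {
      if (line.indexOf(prefix) === 0) p.fmtp = line.slice(prefix.length).trim();
    }
  }
  return result;
}
```

Running this on both the local and the remote SDP makes it easy to see whether the `profile-level-id` values line up.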

cnderrauber, Feb 06 '25 02:02

> It looks like the codec fell back to VP8 because of an H.264 profile mismatch

There shouldn't be a reason to fall back, right? H.264 should be supported.

davidzhao, Feb 06 '25 05:02

> > It looks like the codec fell back to VP8 because of an H.264 profile mismatch
>
> There shouldn't be a reason to fall back, right? H.264 should be supported.

If H.264 is only partially matched (the offered profile level is not among those on the SFU side), then VP8 is placed before H.264, since VP8 is an exact match.
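
The ordering effect can be illustrated with a small sketch. It is not livekit-server's actual implementation: `CodecSpec`, `canonicalFmtp`, `matchKind` and `orderByMatch` are hypothetical names, and treating "exact" as "identical fmtp parameters" is a simplifying assumption. The profile-level-id values mentioned afterwards are arbitrary examples, not what Firefox or the SFU really exchange.

```typescript
// Hypothetical codec description: a MIME type plus its raw fmtp parameters.
interface CodecSpec {
  mime: string; // e.g. "video/H264"
  fmtp: string; // e.g. "profile-level-id=42e01f;packetization-mode=1"
}

type MatchKind = "exact" | "partial" | "none";

// Canonical form of an fmtp string so parameter order and case do not matter.
function canonicalFmtp(fmtp: string): string {
  return fmtp
    .split(";")
    .map((p) => p.trim().toLowerCase())
    .filter((p) => p.length > 0)
    .sort()
    .join(";");
}

// "exact": same MIME type and same fmtp; "partial": same MIME type only.
function matchKind(offered: CodecSpec, supported: CodecSpec[]): MatchKind {
  let best: MatchKind = "none";
  for (const s of supported) {
    if (s.mime.toLowerCase() !== offered.mime.toLowerCase()) continue;
    if (canonicalFmtp(s.fmtp) === canonicalFmtp(offered.fmtp)) return "exact";
    best = "partial";
  }
  return best;
}

// Exact matches first, partial matches after; unmatched codecs are dropped.
function orderByMatch(offered: CodecSpec[], supported: CodecSpec[]): CodecSpec[] {
  const exact = offered.filter((c) => matchKind(c, supported) === "exact");
  const partial = offered.filter((c) => matchKind(c, supported) === "partial");
  return exact.concat(partial);
}
```

With an offer of H.264 (`profile-level-id=42e01f`) followed by VP8, and a supported list containing H.264 with `profile-level-id=640032` plus VP8, `orderByMatch` returns VP8 first and H.264 second, even though H.264 was offered first.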

cnderrauber, Feb 20 '25 14:02