mediamtx
H264 RTSP from Matrox Mura IPX card plays garbled frames in browser WebRTC when H264 frames have multiple slices
Which version are you using?
v1.4.0
Which operating system are you using?
- [X] Linux amd64 standard
- [ ] Linux amd64 Docker
- [ ] Linux arm64 standard
- [ ] Linux arm64 Docker
- [ ] Linux arm7 standard
- [ ] Linux arm7 Docker
- [ ] Linux arm6 standard
- [ ] Linux arm6 Docker
- [X] Windows amd64 standard
- [ ] Windows amd64 Docker (WSL backend)
- [ ] macOS amd64 standard
- [ ] macOS amd64 Docker
- [ ] Other (please describe)
Describe the issue
We are using a Matrox MURA-IPX-I4EHF card, which captures local screen display content and exposes an RTSP server endpoint that streams the video as a single H264 video track.
Depending on the exact resolution and orientation (e.g., when the display surface is in a vertical orientation and streams at 854x1920), the Matrox encoder encodes frames using more than one slice per frame (two IDR slices in an IDR keyframe, two non-IDR slices in a P frame).
In these specific situations, playback in the browser is garbled, and Chrome's WebRTC internals report twice the actual frame rate of the stream (e.g., 60 framesPerSecond when the source stream is 30 FPS).
I have done what investigation I can to try to identify the inflection point where this behavior emerges to help write this ticket.
To illustrate, here are some debug prints I added to track the processing of RTP packets on the first frames of playback (in formatprocessor/h264.go and gortsplib's rtph264/encoder.go; the context is probably clear enough):
2023/12/19 08:54:20 INF [WebRTC] [session 7fb00a20] is reading from path 'device', 1 track (H264)
2023/12/19 08:54:20 ProcessRTPPacket, pkt yields 6 aus before remux
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit receiving 6 au
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit nalu typ SPS
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit nalu typ PPS
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit nalu typ SEI
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit nalu typ SEI
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit nalu typ SEI
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit nalu typ IDR
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit adding SPS/PPS on IDR
2023/12/19 08:54:20 ProcessRTPPacket queues 6 AU after remux
2023/12/19 08:54:20 ProcessRTPPacket finished, routing as-as
2023/12/19 08:54:20 gortsplib rtph264.Encode encoding 6 au
2023/12/19 08:54:20 ProcessRTPPacket, pkt yields 1 aus before remux
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit receiving 1 au
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit nalu typ IDR
2023/12/19 08:54:20 formatProcessor.remuxAccessUnit adding SPS/PPS on IDR
2023/12/19 08:54:20 ProcessRTPPacket queues 3 AU after remux
2023/12/19 08:54:20 ProcessRTPPacket finished, routing as-as
2023/12/19 08:54:20 gortsplib rtph264.Encode encoding 3 au
I am not conversant enough in RTP standards to comment further on whether this is correct or incorrect by any spec.
I have experimented with generating H264 streams with multiple slices per frame using FFmpeg with libx264. These play fine through MediaMTX because each pair of slices (IDR or non-IDR) is handled in one call to ProcessRTPPacket/remuxAccessUnit, and the slices are in turn passed back out as filteredNALUs and written to the WebRTC writer together in a single pass.
I'm hopeful that attaching a PCAP will provide enough data to comment further.
Describe how to replicate the issue
- Start the server with RTSP and WebRTC enabled.
- Configure the Matrox card for an IP Stream Out, with RTSP.
- Define a path in MediaMTX with source set to the rtsp:// URL of the Matrox card output.
- Access the MediaMTX WebRTC playback page for that path in Chrome (Firefox also plays poorly, but the problem is more visible in Chrome).
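The path definition from the third step could look like the fragment below. This is only a sketch: the path name `device` is taken from the log output above, and the RTSP URL is a placeholder, since the actual address of the Matrox card output is not given in this report.

```yaml
paths:
  device:
    # Placeholder URL (assumption): replace with the rtsp:// URL of the Matrox card output.
    source: rtsp://MATROX_CARD_ADDRESS/STREAM_NAME
```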
Did you attach the server logs?
yes
Did you attach a network dump?
yes
Hello @jsbohnert, I took a look at the dump you sent, and I can say that the difference between a stream generated with x264 and the stream generated with the Matrox card is that the latter splits slices belonging to the same frame (access unit) into different access units.
In RTP packets there's a flag called Marker that signals the end of an access unit. When this server receives the Marker flag from a source, it takes all NALUs received until this moment and interprets them as a single access unit. If there's a WebRTC reader, the access unit is re-encoded into RTP packets (but with a lower MTU) and sent out.
This behavior is visible in the packet dump, which contains multiple packets with the Marker flag set and the same timestamp (and therefore belonging to the same access unit).
Therefore, the stream sent out by Matrox is bugged. However, we could heal this stream by changing the grouping mechanism inside the server: we could group NALUs together when the timestamp changes, but this introduces latency, since we have to wait for the next access unit before using the previous one.
Maybe the server can be changed in order to apply this healing algorithm selectively. It's not the first time that I've seen streams with split slices, so this should be of public interest.
Thank you for your response and analysis - your notes will be helpful in my support discussions about the encoder card.
The proposed solution would be valuable, and we would welcome the chance to test it in the future if it becomes available. Our environment is controlled and limited to specific hardware, so an opt-in behavior in the server or stream config would be worthwhile in this case.