RTC2RTMP frame freezing or stuttering

Open se7enXF opened this issue 7 months ago • 3 comments

In Unity3D, I am using WHIP to push video and audio to SRS (Simple Real-time Streaming Server), and then playing the video stream on the page http://localhost:8080/players/whep.html works fine. However, when I use ffplay or VLC to pull the RTMP stream from rtmp://localhost/live/livestream, the video playback is choppy, although the audio is normal.

The Unity code is based on the example from https://github.com/bluenviron/mediamtx:

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using Unity.WebRTC;
using System.Linq;


/// <summary>
/// Communicating with SRS and pushing streams using the WHIP protocol.
/// </summary>
public class WebRtcStreaming : MonoBehaviour
{
    private string url = "http://localhost:1985/rtc/v1/whip/?app=live&stream=livestream";
    private RTCPeerConnection pc;
    private readonly List<MediaStreamTrack> mediaStreamTracks = new();


    void Start()
    {
        StartCoroutine(WebRTC.Update());
    }
    
    public IEnumerator StartPushing()
    {
        pc = new RTCPeerConnection
        {
            OnConnectionStateChange = status =>
            {
                LogTrace.Log($"RTC connect status: {status}");
                if (status == RTCPeerConnectionState.Failed)
                {
                    StartCoroutine(StopPushing());
                }
            },
        };


        // add video
        var videoStream = Camera.main.CaptureStream(Screen.width, Screen.height);
        var track = videoStream.GetTracks().First();
        mediaStreamTracks.Add(track);
        
        // add audio
        var inputAudioSource = GetComponent<AudioListener>();
        var audioStream = new AudioStreamTrack(inputAudioSource);
        mediaStreamTracks.Add(audioStream);


        foreach (var tk in mediaStreamTracks)
        {
            var sender = pc.AddTrack(tk);
            LogTrace.Log($"RTC add {tk.GetType()}");


            if (tk.Kind == TrackKind.Video)
            {
                // Get `RTCRtpSendParameters`
                var parameters = sender.GetParameters();


                // Cap the frame rate of every encoding (and halve the resolution in the Editor).
                foreach (var encoding in parameters.encodings)
                {
                    encoding.maxFramerate = 30;
# if UNITY_EDITOR
                    encoding.scaleResolutionDownBy = 2;
# endif
                }


                // Set updated parameters.
                sender.SetParameters(parameters);
            }
        }


        StartCoroutine(CreateOffer());
        yield return null;
    }


    private IEnumerator CreateOffer()
    {
        var op = pc.CreateOffer();
        yield return op;
        if (op.IsError)
        {
            LogTrace.LogError("CreateOffer() failed");
            yield break;
        }


        yield return SetLocalDescription(op.Desc);
    }


    private IEnumerator SetLocalDescription(RTCSessionDescription offer)
    {
        var op = pc.SetLocalDescription(ref offer);
        yield return op;
        if (op.IsError)
        {
            LogTrace.LogError("SetLocalDescription() failed");
            yield break;
        }


        yield return PostOffer(offer);
    }


    private IEnumerator PostOffer(RTCSessionDescription offer)
    {
        var content = new System.Net.Http.StringContent(offer.sdp);
        content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("application/sdp");
        var client = new System.Net.Http.HttpClient();


        var task = System.Threading.Tasks.Task.Run(async () => {
            var res = await client.PostAsync(new System.UriBuilder(url).Uri, content);
            res.EnsureSuccessStatusCode();
            return await res.Content.ReadAsStringAsync();
        });
        yield return new WaitUntil(() => task.IsCompleted);
        if (task.Exception != null)
        {
            Debug.LogError(task.Exception);
            yield break;
        }


        yield return SetRemoteDescription(task.Result);
    }


    private IEnumerator SetRemoteDescription(string answer)
    {
        RTCSessionDescription desc = new()
        {
            type = RTCSdpType.Answer,
            sdp = answer
        };
        var op = pc.SetRemoteDescription(ref desc);
        yield return op;
        if (op.IsError)
        {
            LogTrace.LogError("SetRemoteDescription() failed");
            yield break;
        }


        yield break;
    }


    public IEnumerator StopPushing() 
    {
        foreach (var tk in mediaStreamTracks)
        {
            tk?.Stop();
            tk?.Dispose();
        }
        mediaStreamTracks.Clear();


        pc?.Close();
        pc?.Dispose();
        yield return null;
        LogTrace.Log("RTC Stopped");
    }
}

The source data read by ffplay is as follows:

Input #0, flv, from 'rtmp://localhost/live/livestream':
  Metadata:
    |RtmpSampleAccess: true
  Duration: N/A, start: 0.000000, bitrate: N/A
  Stream #0:0: Data: none
  Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp
  Stream #0:2: Video: h264 (Constrained Baseline), yuv420p(progressive), 960x540 [SAR 1:1 DAR 16:9], 30.30 fps, 30 tbr, 1k tbn

I suspect there might be a Data stream that is not sending data through WHIP, causing a blockage during the conversion to RTMP. This is because in previous tests with other scenarios, when I added audio and video tracks, if the video track had no data, the audio heard through the RTMP stream was odd. In the aforementioned WHIP code, I only added audio and video streams, and I'm unsure about the origin of the data stream.

Part of the SRS log is as follows:

[2025-05-23 10:28:48.226][INFO][1268][46714s76] RTC: Server conns=1, rpkts=(199,rtp:196,stun:1,rtcp:3), spkts=(14,rtp:0,stun:1,rtcp:27), rtcp=(pli:1,twcc:9,rr:1), snk=(88,a:44,v:44,h:0), fid=(id:0,fid:199,ffid:0,addr:1,faddr:199)
[2025-05-23 10:28:48.366][WARN][1268][g5i48813][11] empty nalu
[2025-05-23 10:28:49.939][INFO][1268][0q7mfs3z] RTC: to rtmp bridge request key frame, ssrc=705817926, publisher cid=g5i48813
[2025-05-23 10:28:49.970][INFO][1268][g5i48813] 45B video sh,  codec(7, profile=Baseline, level=5.1, 960x540, 0kbps, 0.0fps, 0.0s)
[2025-05-23 10:28:49.971][INFO][1268][g5i48813] set ts=1576724422, header=26786, lost=26787
[2025-05-23 10:28:53.145][INFO][1268][g5i48813] -> HLS time=1468350789ms, sno=119, ts=livestream-118.ts, dur=9333ms, dva=0p
[2025-05-23 10:28:53.213][INFO][1268][03j17jwt] -> PLA time=1357705065, msgs=128, okbps=0,500,504, ikbps=0,0,0, mw=350/8
[2025-05-23 10:28:53.228][INFO][1268][46714s76] Hybrid cpu=0.00%,38MB, cid=1,3, timer=57,10,44, clock=0,23,12,5,2,0,0,0,0, objs=(pkt:342,raw:56,fua:285,msg:569,oth:1,buf:196)
[2025-05-23 10:28:53.228][INFO][1268][46714s76] RTC: Server conns=1, rpkts=(199,rtp:196,stun:1,rtcp:3), spkts=(14,rtp:0,stun:1,rtcp:27), rtcp=(pli:1,twcc:9,rr:1), snk=(88,a:44,v:44,h:0), fid=(id:0,fid:199,ffid:0,addr:1,faddr:199)
[2025-05-23 10:28:53.364][WARN][1268][g5i48813][11] empty nalu
[2025-05-23 10:28:55.776][WARN][1268][g5i48813][11] clear gop cache for guess pure audio overflow
[2025-05-23 10:28:56.082][INFO][1268][0q7mfs3z] RTC: to rtmp bridge request key frame, ssrc=705817926, publisher cid=g5i48813
[2025-05-23 10:28:56.082][INFO][1268][0q7mfs3z] RTC: Need PLI ssrc=705817926, play=[g5i48813], publish=[g5i48813], count=240/240
[2025-05-23 10:28:56.084][INFO][1268][g5i48813] RTC: Request PLI ssrc=705817926, play=[g5i48813], count=240/240, bytes=12B
[2025-05-23 10:28:56.098][INFO][1268][g5i48813] 45B video sh,  codec(7, profile=Baseline, level=5.1, 960x540, 0kbps, 0.0fps, 0.0s)
[2025-05-23 10:28:56.098][INFO][1268][g5i48813] set ts=1577276392, header=27659, lost=27660
[2025-05-23 10:28:56.364][INFO][1268][03o39x2o] <- RTC RECV #9, udp 1999, pps 164/199, schedule 1999
[2025-05-23 10:28:58.232][INFO][1268][46714s76] Hybrid cpu=3.19%,38MB, cid=1,3, timer=57,10,44, clock=0,23,12,5,2,0,0,0,0, objs=(pkt:342,raw:56,fua:285,msg:569,oth:1,buf:196)

Does anyone have advice on this, or know of a solution?

se7enXF avatar May 23 '25 02:05 se7enXF

I just found that a newer version fixes the problem; it is the same issue as https://github.com/ossrs/srs/pull/4160

se7enXF avatar May 23 '25 03:05 se7enXF

I apologize for revisiting this issue, as it still persists in my project. Before using the update from https://github.com/ossrs/srs/pull/4160, there were multiple instances of freezing within a second; after applying the update, the freezing occurs once every few seconds to tens of seconds.

se7enXF avatar May 23 '25 06:05 se7enXF

With version https://github.com/ossrs/srs/releases/tag/v6.0-a2, the ffplay log is different:

Input #0, flv, from 'rtmp://localhost/live/livestream':
  Metadata:
    |RtmpSampleAccess: true
  Duration: N/A, start: 0.019000, bitrate: N/A
  Stream #0:0: Audio: aac (LC), 48000 Hz, stereo, fltp
  Stream #0:1: Video: h264 (Constrained Baseline), yuv420p(progressive), 960x540 [SAR 1:1 DAR 16:9], 30.30 fps, 30 tbr, 1k tbn
[h264 @ 000001983d2a3b80] Frame num change from 144 to 145
[h264 @ 000001983d2a3b80] decode_slice_header error
[h264 @ 000001983d2a3b80] Frame num change from 144 to 146
[h264 @ 000001983d2a3b80] decode_slice_header error
[h264 @ 000001984295a940] Frame num change from 180 to 181
[h264 @ 000001984295a940] decode_slice_header error
[h264 @ 000001984295a940] Frame num change from 180 to 182
[h264 @ 000001984295a940] decode_slice_header error

se7enXF avatar May 23 '25 07:05 se7enXF

Thank you for reporting this issue. After reviewing your logs and code, I can see this is a complex RTC-to-RTMP bridging problem that involves multiple factors:

Observations from Your Logs:

  1. [WARN] empty nalu - SRS is receiving RTP packets with empty or incomplete H.264 NAL units from your Unity WebRTC stream
  2. [WARN] clear gop cache for guess pure audio overflow - The GOP cache is being cleared because too many audio packets arrive without valid video frames
  3. Frequent PLI requests - SRS is constantly requesting keyframes (count=240/240), indicating video frame assembly issues
  4. WHEP playback operates correctly - This indicates that the problem is isolated to the process of converting from RTC to RTMP.
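
To quantify how often these warnings recur, the log lines can be tallied by message. The following is a minimal sketch, assuming the line prefix format seen in the log excerpt in this issue (`[timestamp][LEVEL][pid][cid]`, optionally followed by one more bracketed number). `SrsLogTally` and its marker list are hypothetical names for illustration and are not part of SRS.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class SrsLogTally
{
    // Prefix format assumed from the log excerpt in this issue.
    private static readonly Regex Line = new Regex(
        @"^\[(?<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\[[A-Z]+\]\[\d+\]\[\w+\](?:\[\d+\])?\s*(?<msg>.*)$");

    // Message fragments copied from the same excerpt.
    public static readonly string[] Markers =
    {
        "empty nalu",
        "clear gop cache",
        "to rtmp bridge request key frame",
    };

    // Returns, for each marker, the timestamps of the lines that contain it.
    public static Dictionary<string, List<DateTime>> Tally(IEnumerable<string> lines)
    {
        var hits = new Dictionary<string, List<DateTime>>();
        foreach (var marker in Markers) hits[marker] = new List<DateTime>();

        foreach (var line in lines)
        {
            var match = Line.Match(line);
            if (!match.Success) continue; // skip anything that is not a log line

            var ts = DateTime.ParseExact(match.Groups["ts"].Value,
                "yyyy-MM-dd HH:mm:ss.fff", CultureInfo.InvariantCulture);
            var msg = match.Groups["msg"].Value;
            foreach (var marker in Markers)
                if (msg.Contains(marker)) hits[marker].Add(ts);
        }
        return hits;
    }

    // Seconds between consecutive occurrences of one marker.
    public static List<double> GapsSeconds(List<DateTime> times)
    {
        var gaps = new List<double>();
        for (int i = 1; i < times.Count; i++)
            gaps.Add((times[i] - times[i - 1]).TotalSeconds);
        return gaps;
    }
}
```

Calling `SrsLogTally.Tally(System.IO.File.ReadLines(path))` on a longer log, then `GapsSeconds` on each list, shows whether the warnings recur at a regular interval. That can then be compared against how often the RTMP playback freezes.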

About the "Data: none" Stream:

The Stream #0:0: Data: none you see in ffplay output is NOT the cause of your problem. This is the standard RTMP metadata stream (AMF0 data messages like |RtmpSampleAccess) and is completely normal in RTMP/FLV streams.
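
One way to check what the stream carries is to count FLV tags by type. In the FLV container, tag type 8 is audio, 9 is video, and 18 is script data (the AMF metadata). The sketch below assumes the RTMP stream has already been saved to an FLV file by some recorder; `FlvTagCounter` is a hypothetical helper name, and the parser only reads tag headers.

```csharp
using System;
using System.Collections.Generic;

public static class FlvTagCounter
{
    // Counts FLV tags by type: 8 = audio, 9 = video, 18 = script data.
    public static Dictionary<int, int> Count(byte[] flv)
    {
        if (flv == null || flv.Length < 9 || flv[0] != 'F' || flv[1] != 'L' || flv[2] != 'V')
            throw new ArgumentException("not an FLV file");

        // Bytes 5..8 hold the header size (big-endian), normally 9.
        int headerSize = (flv[5] << 24) | (flv[6] << 16) | (flv[7] << 8) | flv[8];
        int pos = headerSize + 4; // skip the header and PreviousTagSize0

        var counts = new Dictionary<int, int>();
        while (pos + 11 <= flv.Length)
        {
            int type = flv[pos] & 0x1F;
            int dataSize = (flv[pos + 1] << 16) | (flv[pos + 2] << 8) | flv[pos + 3];
            counts[type] = counts.TryGetValue(type, out var n) ? n + 1 : 1;
            pos += 11 + dataSize + 4; // tag header + body + PreviousTagSize
        }
        return counts;
    }
}
```

A file with only a handful of type 18 tags next to many type 8 and type 9 tags would be consistent with the data stream being metadata only.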

The Real Issue:

This is a very complex problem that cannot be diagnosed or fixed remotely through code review or log analysis alone. The stuttering could be caused by:

  • Unity WebRTC encoder configuration issues
  • Video capture timing problems in Unity
  • Network packet loss or reordering
  • Encoder keyframe generation issues
  • RTP packet fragmentation/assembly problems
  • Timing synchronization between audio and video
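
Packet loss and reordering can be estimated from the 16-bit RTP sequence numbers of the video SSRC, for example as exported from a packet capture. The following is a rough sketch of that bookkeeping, with the 65535 to 0 wraparound handled; `RtpSequenceCheck` is a hypothetical helper written for this discussion and is not an SRS or Unity API.

```csharp
using System;
using System.Collections.Generic;

public static class RtpSequenceCheck
{
    public struct Report
    {
        public int Lost;       // gaps never filled by a later packet
        public int Reordered;  // packets that arrived late and filled a gap
        public int Duplicates; // packets seen again, or too old to classify
    }

    // Signed distance in 16-bit sequence space; positive means 'end' is newer.
    public static int Distance(ushort start, ushort end) =>
        unchecked((short)(ushort)(end - start));

    public static Report Analyze(IEnumerable<ushort> sequenceNumbers)
    {
        var report = new Report();
        var missing = new HashSet<ushort>();
        bool first = true;
        ushort highest = 0;

        foreach (var seq in sequenceNumbers)
        {
            if (first) { highest = seq; first = false; continue; }

            int d = Distance(highest, seq);
            if (d > 0)
            {
                // Everything skipped between 'highest' and 'seq' is missing for now.
                for (int i = 1; i < d; i++)
                    missing.Add(unchecked((ushort)(highest + i)));
                highest = seq;
            }
            else if (missing.Remove(seq))
            {
                report.Reordered++; // a late packet filled an earlier gap
            }
            else
            {
                report.Duplicates++;
            }
        }

        report.Lost = missing.Count;
        return report;
    }
}
```

For the sequence `65534, 65535, 1, 0, 3` this reports one reordered packet (0 arrived after 1) and one lost packet (2 never arrived).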

What You Need to Do:

Because code and logs alone do not pinpoint the root cause, you will need to investigate further on your side. Here are the steps I recommend:

Capture and analyze the actual RTP packets:

  • Use Wireshark to capture the WebRTC traffic between Unity and SRS
  • Check if Unity is sending complete, valid H.264 NAL units
  • Look for packet loss, reordering, or timing issues
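
When inspecting the captured packets, each H.264 RTP payload can be classified by its first byte, following the RTP payload format for H.264 (RFC 6184): types 1 to 23 are single NAL units, 24 is an aggregation packet (STAP-A), and 28 is a fragment (FU-A). The sketch below assumes the payload has already been decrypted and the RTP header removed, since WebRTC media is SRTP-encrypted on the wire. `H264RtpPayload` is a hypothetical helper, and whether a zero-length unit is exactly what triggers the `empty nalu` warning in SRS is an assumption, not something confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;

public static class H264RtpPayload
{
    // Describes each NAL unit carried by one RTP payload (RTP header already removed).
    public static List<string> Describe(byte[] payload)
    {
        var result = new List<string>();
        if (payload == null || payload.Length == 0)
        {
            result.Add("empty payload");
            return result;
        }

        int type = payload[0] & 0x1F; // low 5 bits of the NAL header
        if (type >= 1 && type <= 23)
        {
            result.Add($"single NAL type={type} size={payload.Length}");
        }
        else if (type == 24) // STAP-A: repeated [2-byte size][NAL unit]
        {
            int pos = 1;
            while (pos + 2 <= payload.Length)
            {
                int size = (payload[pos] << 8) | payload[pos + 1];
                pos += 2;
                if (size == 0) { result.Add("STAP-A: empty NAL unit"); continue; }
                if (pos + size > payload.Length) { result.Add("STAP-A: truncated NAL unit"); break; }
                result.Add($"STAP-A: NAL type={payload[pos] & 0x1F} size={size}");
                pos += size;
            }
        }
        else if (type == 28) // FU-A: one fragment of a larger NAL unit
        {
            if (payload.Length < 2) { result.Add("FU-A: missing FU header"); return result; }
            bool start = (payload[1] & 0x80) != 0;
            bool end = (payload[1] & 0x40) != 0;
            int inner = payload[1] & 0x1F;
            string note = payload.Length == 2 ? " (no fragment data)" : "";
            result.Add($"FU-A: NAL type={inner} start={start} end={end}{note}");
        }
        else
        {
            result.Add($"unhandled payload type={type}");
        }
        return result;
    }
}
```

Running `Describe` over the video payloads in capture order makes it easier to spot zero-length units, FU-A sequences that start but never end, and keyframes (NAL type 5) that arrive without SPS/PPS (types 7 and 8).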

If you discover the root cause through your investigation, please share your findings here so others can benefit!

winlinvip avatar Oct 25 '25 23:10 winlinvip

This issue is unclear or lacks sufficient information.

winlinvip avatar Oct 31 '25 22:10 winlinvip