RTC2RTMP frame freezing or stuttering
In Unity3D, I am using WHIP to push video and audio to SRS (Simple Realtime Server), and playing the stream via WHEP on the page http://localhost:8080/players/whep.html works fine. However, when I use ffplay or VLC to pull the RTMP stream from rtmp://localhost/live/livestream, the video playback is choppy, although the audio is normal.
The Unity code is based on the example from https://github.com/bluenviron/mediamtx:
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using Unity.WebRTC;
using System.Linq;

/// <summary>
/// Communicating with SRS and pushing streams using the WHIP protocol.
/// </summary>
public class WebRtcStreaming : MonoBehaviour
{
    private string url = "http://localhost:1985/rtc/v1/whip/?app=live&stream=livestream";
    private RTCPeerConnection pc;
    private readonly List<MediaStreamTrack> mediaStreamTracks = new();

    void Start()
    {
        StartCoroutine(WebRTC.Update());
    }

    public IEnumerator StartPushing()
    {
        pc = new RTCPeerConnection
        {
            OnConnectionStateChange = status =>
            {
                LogTrace.Log($"RTC connect status: {status}");
                if (status == RTCPeerConnectionState.Failed)
                {
                    StartCoroutine(StopPushing());
                }
            },
        };
        // add video
        var videoStream = Camera.main.CaptureStream(Screen.width, Screen.height);
        var track = videoStream.GetTracks().First();
        mediaStreamTracks.Add(track);
        // add audio
        var inputAudioSource = GetComponent<AudioListener>();
        var audioStream = new AudioStreamTrack(inputAudioSource);
        mediaStreamTracks.Add(audioStream);

        foreach (var tk in mediaStreamTracks)
        {
            var sender = pc.AddTrack(tk);
            LogTrace.Log($"RTC add {tk.GetType()}");
            if (tk.Kind == TrackKind.Video)
            {
                // Get `RTCRtpSendParameters`
                var parameters = sender.GetParameters();
                // Changing bitrate of all encoders.
                foreach (var encoding in parameters.encodings)
                {
                    encoding.maxFramerate = 30;
#if UNITY_EDITOR
                    encoding.scaleResolutionDownBy = 2;
#endif
                }
                // Set updated parameters.
                sender.SetParameters(parameters);
            }
        }
        StartCoroutine(CreateOffer());
        yield return null;
    }

    private IEnumerator CreateOffer()
    {
        var op = pc.CreateOffer();
        yield return op;
        if (op.IsError)
        {
            LogTrace.LogError("CreateOffer() failed");
            yield break;
        }
        yield return SetLocalDescription(op.Desc);
    }

    private IEnumerator SetLocalDescription(RTCSessionDescription offer)
    {
        var op = pc.SetLocalDescription(ref offer);
        yield return op;
        if (op.IsError)
        {
            LogTrace.LogError("SetLocalDescription() failed");
            yield break;
        }
        yield return PostOffer(offer);
    }

    private IEnumerator PostOffer(RTCSessionDescription offer)
    {
        var content = new System.Net.Http.StringContent(offer.sdp);
        content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("application/sdp");
        var client = new System.Net.Http.HttpClient();
        var task = System.Threading.Tasks.Task.Run(async () =>
        {
            var res = await client.PostAsync(new System.UriBuilder(url).Uri, content);
            res.EnsureSuccessStatusCode();
            return await res.Content.ReadAsStringAsync();
        });
        yield return new WaitUntil(() => task.IsCompleted);
        if (task.Exception != null)
        {
            Debug.LogError(task.Exception);
            yield break;
        }
        yield return SetRemoteDescription(task.Result);
    }

    private IEnumerator SetRemoteDescription(string answer)
    {
        RTCSessionDescription desc = new()
        {
            type = RTCSdpType.Answer,
            sdp = answer
        };
        var op = pc.SetRemoteDescription(ref desc);
        yield return op;
        if (op.IsError)
        {
            LogTrace.LogError("SetRemoteDescription() failed");
            yield break;
        }
        yield break;
    }

    public IEnumerator StopPushing()
    {
        foreach (var tk in mediaStreamTracks)
        {
            tk?.Stop();
            tk?.Dispose();
        }
        mediaStreamTracks.Clear();
        pc?.Close();
        pc?.Dispose();
        yield return null;
        LogTrace.Log("RTC Stopped");
    }
}
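For context, StartPushing is started from another script on the same GameObject; the call site is not shown above, but a minimal (illustrative) version would look like this:

// Illustrative call site (not part of the original snippet): kick off the WHIP publish.
var streaming = GetComponent<WebRtcStreaming>();
StartCoroutine(streaming.StartPushing());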
The stream information reported by ffplay is as follows:
Input #0, flv, from 'rtmp://localhost/live/livestream': 0B f=0/0
Metadata:
|RtmpSampleAccess: true
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0: Data: none
Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp
Stream #0:2: Video: h264 (Constrained Baseline), yuv420p(progressive), 960x540 [SAR 1:1 DAR 16:9], 30.30 fps, 30 tbr, 1k tbn
I suspect there may be a Data stream that sends no data over WHIP, blocking the conversion to RTMP. In earlier tests with other scenarios, when I added both audio and video tracks and the video track carried no data, the audio heard over the RTMP stream sounded odd. In the WHIP code above I only add audio and video tracks, so I am unsure where this Data stream comes from.
Part of the SRS log is as follows:
[2025-05-23 10:28:48.226][INFO][1268][46714s76] RTC: Server conns=1, rpkts=(199,rtp:196,stun:1,rtcp:3), spkts=(14,rtp:0,stun:1,rtcp:27), rtcp=(pli:1,twcc:9,rr:1), snk=(88,a:44,v:44,h:0), fid=(id:0,fid:199,ffid:0,addr:1,faddr:199)
[2025-05-23 10:28:48.366][WARN][1268][g5i48813][11] empty nalu
[2025-05-23 10:28:49.939][INFO][1268][0q7mfs3z] RTC: to rtmp bridge request key frame, ssrc=705817926, publisher cid=g5i48813
[2025-05-23 10:28:49.970][INFO][1268][g5i48813] 45B video sh, codec(7, profile=Baseline, level=5.1, 960x540, 0kbps, 0.0fps, 0.0s)
[2025-05-23 10:28:49.971][INFO][1268][g5i48813] set ts=1576724422, header=26786, lost=26787
[2025-05-23 10:28:53.145][INFO][1268][g5i48813] -> HLS time=1468350789ms, sno=119, ts=livestream-118.ts, dur=9333ms, dva=0p
[2025-05-23 10:28:53.213][INFO][1268][03j17jwt] -> PLA time=1357705065, msgs=128, okbps=0,500,504, ikbps=0,0,0, mw=350/8
[2025-05-23 10:28:53.228][INFO][1268][46714s76] Hybrid cpu=0.00%,38MB, cid=1,3, timer=57,10,44, clock=0,23,12,5,2,0,0,0,0, objs=(pkt:342,raw:56,fua:285,msg:569,oth:1,buf:196)
[2025-05-23 10:28:53.228][INFO][1268][46714s76] RTC: Server conns=1, rpkts=(199,rtp:196,stun:1,rtcp:3), spkts=(14,rtp:0,stun:1,rtcp:27), rtcp=(pli:1,twcc:9,rr:1), snk=(88,a:44,v:44,h:0), fid=(id:0,fid:199,ffid:0,addr:1,faddr:199)
[2025-05-23 10:28:53.364][WARN][1268][g5i48813][11] empty nalu
[2025-05-23 10:28:55.776][WARN][1268][g5i48813][11] clear gop cache for guess pure audio overflow
[2025-05-23 10:28:56.082][INFO][1268][0q7mfs3z] RTC: to rtmp bridge request key frame, ssrc=705817926, publisher cid=g5i48813
[2025-05-23 10:28:56.082][INFO][1268][0q7mfs3z] RTC: Need PLI ssrc=705817926, play=[g5i48813], publish=[g5i48813], count=240/240
[2025-05-23 10:28:56.084][INFO][1268][g5i48813] RTC: Request PLI ssrc=705817926, play=[g5i48813], count=240/240, bytes=12B
[2025-05-23 10:28:56.098][INFO][1268][g5i48813] 45B video sh, codec(7, profile=Baseline, level=5.1, 960x540, 0kbps, 0.0fps, 0.0s)
[2025-05-23 10:28:56.098][INFO][1268][g5i48813] set ts=1577276392, header=27659, lost=27660
[2025-05-23 10:28:56.364][INFO][1268][03o39x2o] <- RTC RECV #9, udp 1999, pps 164/199, schedule 1999
[2025-05-23 10:28:58.232][INFO][1268][46714s76] Hybrid cpu=3.19%,38MB, cid=1,3, timer=57,10,44, clock=0,23,12,5,2,0,0,0,0, objs=(pkt:342,raw:56,fua:285,msg:569,oth:1,buf:196)
I'm seeking advice from everyone: is there a solution to this issue?
I just found that a new version fixes the problem; it is the same issue as https://github.com/ossrs/srs/pull/4160
I apologize for revisiting this issue; it still persists in my project. Before applying the update from https://github.com/ossrs/srs/pull/4160, the video froze multiple times within a second; after applying it, the freezing occurs once every few seconds to tens of seconds.
With version https://github.com/ossrs/srs/releases/tag/v6.0-a2, the ffplay log is different:
Input #0, flv, from 'rtmp://localhost/live/livestream': 0B f=0/0
Metadata:
|RtmpSampleAccess: true
Duration: N/A, start: 0.019000, bitrate: N/A
Stream #0:0: Audio: aac (LC), 48000 Hz, stereo, fltp
Stream #0:1: Video: h264 (Constrained Baseline), yuv420p(progressive), 960x540 [SAR 1:1 DAR 16:9], 30.30 fps, 30 tbr, 1k tbn
[h264 @ 000001983d2a3b80] Frame num change from 144 to 145B f=0/0
[h264 @ 000001983d2a3b80] decode_slice_header error
[h264 @ 000001983d2a3b80] Frame num change from 144 to 146
[h264 @ 000001983d2a3b80] decode_slice_header error
[h264 @ 000001984295a940] Frame num change from 180 to 181B f=0/0
[h264 @ 000001984295a940] decode_slice_header error
[h264 @ 000001984295a940] Frame num change from 180 to 182
[h264 @ 000001984295a940] decode_slice_header error
Thank you for reporting this issue. After reviewing your logs and code, I can see this is a complex RTC-to-RTMP bridging problem that involves multiple factors:
Observations from Your Logs:
- [WARN] empty nalu - SRS is receiving RTP packets with empty or incomplete H.264 NAL units from your Unity WebRTC stream
- [WARN] clear gop cache for guess pure audio overflow - the GOP cache is being cleared because too many audio packets arrive without valid video frames
- Frequent PLI requests - SRS is constantly requesting keyframes (count=240/240), indicating video frame assembly issues
- WHEP playback works correctly - this indicates the problem is isolated to the RTC-to-RTMP conversion
About the "Data: none" Stream:
The Stream #0:0: Data: none you see in ffplay output is NOT the cause of your problem. This is the standard RTMP metadata stream (AMF0 data messages like |RtmpSampleAccess) and is completely normal in RTMP/FLV streams.
The Real Issue:
This is a very complex problem that cannot be diagnosed or fixed remotely through code review or log analysis alone. The stuttering could be caused by:
- Unity WebRTC encoder configuration issues
- Video capture timing problems in Unity
- Network packet loss or reordering
- Encoder keyframe generation issues
- RTP packet fragmentation/assembly problems
- Timing synchronization between audio and video
What You Need to Do:
Since this issue is too complex to locate the root cause by code or logs alone, you will need to investigate and fix it yourself. Here are the steps I recommend:
Capture and analyze the actual RTP packets (see also the Unity-side stats sketch after this list):
- Use Wireshark to capture the WebRTC traffic between Unity and SRS
- Check if Unity is sending complete, valid H.264 NAL units
- Look for packet loss, reordering, or timing issues
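In addition to packet capture, it can help to confirm on the Unity side that the encoder is producing frames and packets at a steady rate. Below is a minimal, illustrative sketch (not part of your project) that periodically logs outbound-rtp statistics via RTCPeerConnection.GetStats(); it follows the pattern of the Unity WebRTC stats sample, and the exact RTCStatsType value and the keys inside stat.Dict (such as framesEncoded or packetsSent) may differ between package versions:

using System.Collections;
using UnityEngine;
using Unity.WebRTC;

/// <summary>
/// Illustrative diagnostic only: periodically logs outbound-rtp stats so you can
/// check whether the Unity encoder produces frames and packets at a steady rate.
/// </summary>
public class OutboundStatsProbe : MonoBehaviour
{
    // Assign the publishing RTCPeerConnection (e.g. from WebRtcStreaming) from code.
    public RTCPeerConnection pc;

    private IEnumerator Start()
    {
        while (true)
        {
            yield return new WaitForSeconds(2f);
            if (pc == null)
                continue;

            var op = pc.GetStats();
            yield return op;
            if (op.IsError)
                continue;

            foreach (var stat in op.Value.Stats.Values)
            {
                // Only look at outbound RTP streams (the tracks being sent).
                if (stat.Type != RTCStatsType.OutboundRtp)
                    continue;

                // Dump every reported key/value; watch whether counters such as
                // framesEncoded advance smoothly or in bursts that match the stutter.
                foreach (var kv in stat.Dict)
                    Debug.Log($"[outbound-rtp] {kv.Key}={kv.Value}");
            }
        }
    }
}

If these counters advance smoothly while the RTMP output still stutters, that points at the RTC-to-RTMP bridge rather than the Unity encoder.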
If you discover the root cause through your investigation, please share your findings here so others can benefit!
This issue is not clear, or lacks information.