tsMuxer
[Bug] Incorrect TS buffers management
For the record: tsMuxer calculates the ATS (Arrival Time Stamp) of each M2TS packet by simply dividing the time between two consecutive frames/DTS by the number of packets. This is incorrect (i.e. not compliant with T-REC H.222.0 = ISO/IEC 13818-1) and creates constant underflow / overflow of the various TS buffers.
How does this affect #101 that was just merged?
@justdan96 welcome back, happy new year! #101 has only prevented the TS packets (each 192 bytes) from arriving faster than the transfer rate allows. It does not change the simplistic way tsMuxer calculates the time gap between packets, by dividing the time between two video frames (1/fps) by the number of packets. The way it should be done is quite well explained by drmpeg e.g. here and here.
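To make the problem concrete, here is a minimal sketch of the even-spacing approach described above. It is not tsMuxer's actual code, and the helper names are made up for illustration. It assumes 192-byte M2TS packets and arrival times counted in 27 MHz ticks, and it computes the transfer rate that a given spacing implies.

```cpp
#include <cassert>
#include <cstdint>

// 27 MHz system clock used for PCR and M2TS arrival time stamps.
constexpr int64_t kClockHz = 27000000;
// Assumption for this sketch: rates are counted over whole 192-byte M2TS packets.
constexpr int64_t kM2tsPacketBytes = 192;

// Hypothetical helper: spacing (in 27 MHz ticks) obtained by dividing one
// frame interval evenly among the packets that carry that frame.
int64_t evenSpacingTicks(int64_t frameIntervalTicks, int64_t packetCount) {
    assert(packetCount > 0);
    return frameIntervalTicks / packetCount;
}

// Transfer rate in bits per second implied by a given packet spacing.
int64_t impliedBitrate(int64_t spacingTicks) {
    assert(spacingTicks > 0);
    return kM2tsPacketBytes * 8 * kClockHz / spacingTicks;
}
```

At 25 fps the frame interval is 1,080,000 ticks. A frame spread over 1000 packets implies 38.4 Mbps, while a frame spread over 100 packets implies 3.84 Mbps, so the arrival rate follows the frame size rather than any buffer model.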
I don't claim to be an expert when it comes to transport streams or the code that tsMuxer currently uses for packetisation, but as far as I remember from my days of dabbling with MPEG-TS, it is always best to try and keep the average bitrate of the whole stream as constant as possible. This means that doing this properly is not possible without scanning every elementary stream in order to identify places where packets need to be more condensed, while at the same time keeping the PCR gap smaller than 100ms.
The posts you've linked provide a really good explanation. The official TS specification (the previous version is freely available) also provides some information in section 2.4 and Annex Q, but I have to admit that the language is hard to get through for somebody inexperienced like me. I can't see any viable solution in terms of code right now, especially not knowing the current state of things.
Even if we choose to offer two packetisation strategies/algorithms : the current one and the new one (supposedly more compliant to the T-STD buffer stuff), this would require isolating the current code, which I expect to be splattered around lots of various places.
Are there any reference implementations we could crib off? Or do you think isolating the current implementation would be the first priority?
Unfortunately, the lack of a reference implementation is one of the gripes that people using the standard usually have with it. The documentation is very complex, but there is no reference, not even "certified standard-compliant" reference streams, to compare one's implementation against. There aren't a lot of transport stream muxers with available sources; most of these programs are proprietary due to their niche nature. The only ones that immediately come to mind are ffmpeg (link to github mirror, their own git frontend seems to be having some problems) and gstreamer, however I wouldn't count on either of them aiming for standard compliance. At the same time, I have never taken the time myself to analyse their sources when it comes to this particular aspect.
As for having two separate packetisation implementations, I think it's best if @jcdr428 comments, since he's the one of us who probably has the most knowledge of the impact that this issue has in real usage.
@xavery T-REC-H.222.0 is incredibly obscure, but the core principle is very basic: each stream must be moved to packets at its own constant bitrate. Packets between various streams can be interleaved as required.
For M2TS you can simply take video at the BD transfer rate of 48 Mbps (or 109 Mbps for UHD), AC3/LPCM/DTSCore at their CBR bitrate, and TrueHD/DTSMA at the same rate as CBR LPCM. For TS, to limit size (i.e. the number of added null packets), this would need a preliminary scan of non-CBR streams.
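The principle of pacing each stream at its own constant bitrate and interleaving the results can be sketched as follows. This is only an illustration under simplifying assumptions: `PacedStream` and both functions are hypothetical names, not tsMuxer types, every packet is treated as carrying 184 payload bytes, and PES headers, PSI, PCR and null packets are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int64_t kClockHz = 27000000;   // 27 MHz system clock
constexpr int64_t kTsPayloadBytes = 184; // 188-byte TS packet minus 4-byte header

// Hypothetical per-stream pacing state.
struct PacedStream {
    int64_t bitrate;      // bits per second assigned to this elementary stream
    int64_t nextDueTicks; // time at which the next packet of this stream is due
};

// Ticks needed to deliver one packet payload at the stream's own bitrate.
int64_t payloadIntervalTicks(int64_t bitrate) {
    assert(bitrate > 0);
    return kTsPayloadBytes * 8 * kClockHz / bitrate;
}

// Picks the stream whose next packet is due first (lowest index wins ties),
// advances that stream's schedule by one packet, and returns its index.
std::size_t scheduleNextPacket(std::vector<PacedStream>& streams) {
    assert(!streams.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < streams.size(); ++i) {
        if (streams[i].nextDueTicks < streams[best].nextDueTicks) best = i;
    }
    streams[best].nextDueTicks += payloadIntervalTicks(streams[best].bitrate);
    return best;
}
```

With video paced at 48 Mbps (828 ticks per packet) and a 320 kbps audio stream (124,200 ticks per packet), this schedule places 150 video packets between two consecutive audio packets instead of sending a whole audio frame back to back.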
Currently, tsMuxer just piles each stream's frames one after the other and gives them the same average bitrate between two PCR packets (around every 100ms), so e.g. a 320 kbps AC3 frame could end up with arrival rates as high as 109 Mbps, totally overflowing the corresponding buffer. Not a problem on PCs, but it might be for some standalone players.
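The overflow can be illustrated with a simplified leaky-bucket model of a single transport buffer. This is a sketch, not the full T-STD model: each packet is treated as entering the buffer instantaneously, and the function name is made up. The 512-byte buffer size and 2 Mbps drain rate used in the note below are the values I recall for audio transport buffers and should be checked against the specification before relying on them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int64_t kClockHz = 27000000;  // 27 MHz system clock
constexpr int64_t kTsPacketBytes = 188; // one TS packet

// Returns true if any packet would push the buffer above its capacity.
// arrivalTicks must be in non-decreasing order (27 MHz ticks).
bool transportBufferOverflows(const std::vector<int64_t>& arrivalTicks,
                              int64_t bufferBytes, int64_t leakBitsPerSecond) {
    assert(bufferBytes > 0 && leakBitsPerSecond > 0);
    double fullnessBits = 0.0;
    int64_t lastTicks = arrivalTicks.empty() ? 0 : arrivalTicks.front();
    for (int64_t t : arrivalTicks) {
        assert(t >= lastTicks);
        // Drain the buffer for the time elapsed since the previous packet.
        double drained = static_cast<double>(t - lastTicks) * leakBitsPerSecond / kClockHz;
        fullnessBits = std::max(0.0, fullnessBits - drained);
        // Simplification: the whole packet enters the buffer at once.
        fullnessBits += kTsPacketBytes * 8;
        if (fullnessBits > bufferBytes * 8) return true;
        lastTicks = t;
    }
    return false;
}
```

Seven audio packets spaced 372 ticks apart (roughly one 188-byte packet at 109 Mbps) overflow an assumed 512-byte buffer drained at 2 Mbps by the third packet, while the same packets spaced 124,200 ticks apart (184 payload bytes at 320 kbps) never hold more than one packet.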
Yes, correcting this would need a total rewrite of tsMuxer.cpp... I am progressing in C++, thanks to practising with tsMuxer, but this is currently out of my league :(
Edit: on the bright side, tsMuxer produces TS far better than ffmpeg.
We're using tsMuxer (I'm used to writing tsMuxeR - is that "deprecated" now?) to mux video for a UPnP AV/DLNA server, and have had "strange experiences" with the output for years.
The "clients", called renderers, are often "hardware devices" like TVs, Blu-ray players or game consoles. While many renderers happily accept tsMuxer's output, some flat out reject it (claiming "invalid"/"corrupt" stream or similar). This "feels" like a deviation from the standard that many implementations don't care about while some do, and this issue seems like a perfect candidate for an explanation of the phenomenon.
It should be mentioned that while FFmpeg's MPEG-TS output also has issues, especially with timing information if I remember correctly, its output is generally accepted as "valid" by renderers. So, despite imperfections in FFmpeg's implementation, they seem to have gotten something right with regards to "compatibility". I guess it could also be that their output is widespread, so that decoder implementations are likely to be tested with FFmpeg's output.
I just discovered that tsMuxer had been open-sourced :+1: right now, so we have been using the 2014 version exclusively. That means that this might be a red herring and that whatever issue caused some renderers to choke has already been addressed by other fixes. This issue just sounds like a "perfect" explanation.
What project are you using tsMuxer in? We may be able to sync up on error reporting to ensure it is the same issue.
I'm currently working on Digital Media Server, a fork of Universal Media Server, where I used to be a developer. Both are essentially continuations of PS3 Media Server.
There have been many issues over the years, the vast majority haven't been posted on GitHub but on forums, and most have nothing to do with tsMuxer - so I don't think it would be fruitful to try to chase that down.
That said, I will definitely report here if/when I have some current issues that seem like a tsMuxer output issue.
I'm familiar with UMS, will have to check out DMS! Maybe you could include the open source tsMuxer as an option in your software, it would be good to get some more user reports on what does and doesn't work.
I certainly will when I get the time, but I’m currently bogged down with other issues so I don't know when I'll get there.
Edit: Regarding DMS, there has been no public release yet, so testing it is a bit premature. I've been working on a big refactoring for about a year now. Since so many things are being rewritten it doesn't even compile most of the time, so I haven't bothered to push this to GitHub yet. Once this is finished (which could still take some months I'm afraid), making a first release is high on the agenda.
@Nadahar as the release of 2.7 is approaching, it would be good to have a quick report of the issues you've met with tsMuxer, so that we can try to fix them before the release of the new version.
> So, despite imperfections in FFmpeg's implementation, they seem to have gotten something right with regards to "compatibility"
Maybe, as ffmpeg does, write the menu to TS and M2TS (as mediainfo shows):
Menu
ID : 4096 (0x1000)
Menu ID : 1 (0x1)
Duration : 20 s 160 ms
List : 256 (0x100) (AVC) / 257 (0x101) (AAC)
Service name : Service01
Service provider : FFmpeg
Service type : digital television
@jcdr428 Sorry, I missed the notification for some reason. As stated, we haven't been using the open-source version yet, so I don't even know if the previous issues still exist. In addition, I've been through probably thousands of user cases over the last 5 years, so most of it is just a blur in my mind. Very few of these issues have been related to tsMuxer, so I simply can't come up with something concrete here and now.
All I remember is that it seems to have been a recurring theme that the output created by tsMuxer worked fine on most renderers (TVs, game consoles, Blu-ray players etc.), while it didn't work on others. I don't remember which devices and under which circumstances, sorry.
If/When I have something concrete in the future, I will make sure to let you know.
Did you see https://patchwork.ffmpeg.org/project/ffmpeg/patch/Hld3RVCojMUWQ2IXngnngo5wsLPOg7i1rVgrJRcCVnH8B-9j1Drk5rMxDWaHommrh7g8YLdcNZf7lLb8syKHXLCUITZ6zwg_6Dt-4iRP0aE=@protonmail.com/?
@xavery @jcdr428 @justdan96 there is also https://github.com/kierank/libmpegts
Also see the five commits (unmerged, from a few days ago by the look of it) at https://github.com/cus/ffmpeg/commits/mpegts2 (Marton Balint's personal repo)
especially https://github.com/cus/ffmpeg/commit/5fbf34b3409094e13aaf9d0f8f6f0314da9fc1a4 (emulate CBR in m2ts mode)?
I'd just like to add that while scanning VBR streams first is a possibility when muxing content not for "immediate consumption", it would be impossible to combine with how DMS and UMS are using it. There's simply no time to do a scan before starting to mux. UPnP AV/DLNA implements seeking in such a way that renderers will send the server a request for playback from a new time position where "time zero" is the start of the media. Since there's no way to tell tsMuxer, FFmpeg or any other such tool that I'm familiar with to "update" their source position on-the-fly, we must terminate the current transcoding/muxing operation and start a new one. Some renderers will send these requests very frequently, to simulate a kind of "fast forward" where you see single frames as you seek. This whole operation is already slow enough that it's "slightly painful" to use REW and FFWD, and scanning the sources would break this completely.
From what little I've studied ISO/IEC 13818-1/H.222.0, it seems to me like the idea is that you should use the information from the elementary streams where that applies, and that the standard defines constant values (buffer size, maximum bitrate) for elementary stream types where this information isn't available. It is also specified how to calculate overhead, so the idea that you determine the buffer sizes and maximum bit rates based on available information seems to be fundamental to how the standard is written.
This will of course mean that if the elementary streams don't give correct information, you may end up with a broken transport stream, but there's really no other way unless you have the time to scan the elementary streams first. I'd say that the ideal solution would be to support both approaches: one that could be used in real time, as is intended, and one that offered the possibility to make a "valid" transport stream from "invalid" elementary streams through scanning.
The thing is though, that I'm not sure if those "invalid" elementary streams will play well with those decoders that will have issues with a transport stream that doesn't follow the STD buffer rules either. Most of the video codecs have similar buffering schemes (max buffer size and bitrate), and all those non-compliant streams out there typically cause problems for many hardware decoders.
It should be added that x264 didn't do "the world" any favors in this regard when they decided not to respect the maximum buffer size and bitrate dictated by the profile and level combination. It's thus very easy to make non-compliant streams using x264, because whoever does the encoding must manually specify values that keep it within spec. This has probably led to a general distrust in using the information given by H.264/AVC streams. x265 seems to have taken a better path though: as far as I can tell, it does enforce the limits if a profile and level are specified when encoding, so we can only hope that the same will be true for the most widely used encoders of other video codecs in the future, like AV1, VVC, EVC etc.