Feature request: fault-tolerant decoding
Getting an Error::UnexpectedBox when decoding a minf box that contains an hdlr box.
According to the specs, ISO/IEC 14496-12:2022, 8.4.3.1:
"The Handler Reference Box (hdlr) specifies the media handler for the media information in this media information box (minf). It shall be present and contains the handler type (e.g., 'vide' for video, 'soun' for audio)."
I think we are missing this box for minf?
I have the document, but I don't see those words in that section.
My bad! I checked the document, and you're right: it's another LLM hallucination.
Earlier today, I ran the library on a large set of MP4 files and noticed a few had an hdlr box inside their minf. So, I asked an LLM to check the spec and list the source. Thanks for pointing it out—I'll definitely be more careful next time.
It’s probably some muxers not sticking to the standard.
This brings up a question: should we validate this in the parser? When using the library, Error::UnexpectedBox is only helpful for handling non-standard atom structures (when we need to). If we’re using the library as a metadata retriever (or in a media player, etc.), the current setup doesn’t let us skip these “illegal” atoms easily; we have to manually parse them with the Any type.
Maybe we could introduce a loose parsing mode that skips these illegal atoms with a warning, instead of requiring manual parsing? What's your thought on this?
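To make the proposal concrete, here is a minimal, self-contained sketch of what a strict/loose switch could look like. Everything in it (Mode, ParseError, scan_children, make_box, and the list of allowed children) is hypothetical and written only for illustration; it is not the library's API. It only walks 32-bit box headers and does not handle the size == 0 or size == 1 (64-bit) cases.

```rust
/// Hypothetical parsing mode; not part of the library today.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Mode {
    Strict,
    Lenient,
}

/// Hypothetical error type, loosely modelled on the UnexpectedBox error in this thread.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    UnexpectedBox([u8; 4]),
    Truncated,
}

/// Walks the child boxes in `buf` (32-bit size + four-character type).
/// Returns the types of accepted children. In lenient mode, children that
/// are not in `allowed` are skipped and recorded in `warnings`; in strict
/// mode the first such child aborts the scan with an error.
pub fn scan_children(
    mut buf: &[u8],
    allowed: &[[u8; 4]],
    mode: Mode,
    warnings: &mut Vec<[u8; 4]>,
) -> Result<Vec<[u8; 4]>, ParseError> {
    let mut accepted = Vec::new();
    while !buf.is_empty() {
        if buf.len() < 8 {
            return Err(ParseError::Truncated);
        }
        let size = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
        let kind = [buf[4], buf[5], buf[6], buf[7]];
        // Sketch only: size 0 (to end of file) and size 1 (64-bit) are rejected here.
        if size < 8 || size > buf.len() {
            return Err(ParseError::Truncated);
        }
        if allowed.contains(&kind) {
            accepted.push(kind);
        } else {
            match mode {
                Mode::Strict => return Err(ParseError::UnexpectedBox(kind)),
                Mode::Lenient => warnings.push(kind),
            }
        }
        buf = &buf[size..];
    }
    Ok(accepted)
}

/// Test helper: serialises one box with a 32-bit size header.
pub fn make_box(kind: &[u8; 4], payload: &[u8]) -> Vec<u8> {
    let mut out = ((8 + payload.len()) as u32).to_be_bytes().to_vec();
    out.extend_from_slice(kind);
    out.extend_from_slice(payload);
    out
}
```

With a buffer holding vmhd, hdlr, and stbl children and an allowed list without hdlr, strict mode would return UnexpectedBox for hdlr, while lenient mode would return vmhd and stbl and leave hdlr in the warnings list for the caller to log.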
I think that, in general, non-conformance is an Error. If you make junk, don't expect me to read it (that is, the "tolerant in what you accept" was a valid thing before people started using file formats as an attack vector).
In the case of hdlr in minf, or anything else that is common (i.e. de-facto standard, if not actually in the document), it's a bit more of a judgement call. If there is data that absolutely needs to be available to support parsing, then it's a pretty clear basis for inclusion. So can you tell which muxer made the data, or what the content is? Does it play in VLC or ffmpeg? Can you share the overall structure if not the bits?
I could see use cases for parsing known-but-not-properly-placed boxes in general. Not sure whether that should be a mode or a feature setting. @kixelated thoughts on this?
I would be -1 on allowing mp4-atom to write those though. The current implementation doesn't require sensible content, but it does provide useful structural guard rails, and taking away those guard rails would be undesirable.
> I would be -1 on allowing mp4-atom to write those though. The current implementation doesn't require sensible content, but it does provide useful structural guard rails, and taking away those guard rails would be undesirable.
Totally agree. We should make it, if not impossible, difficult to write nonstandard structure.
> In the case of hdlr in minf, or anything else that is common (i.e. de-facto standard, if not actually in the document), it's a bit more of a judgement call. If there is data that absolutely needs to be available to support parsing, then it's a pretty clear basis for inclusion. So can you tell which muxer made the data, or what the content is? Does it play in VLC or ffmpeg? Can you share the overall structure if not the bits?
I'll take a look later and try to see if I can identify some common patterns, like which muxing app created it or what rules they are using.
The files can all play on mpv. Going to test them in VLC and FFmpeg (ffplay/ffprobe) next, then preferably share their moov.
It's quite easy to create non-standard MP4 files and this library shouldn't be the standards police. Yeah +1 to decoding or skipping non-standard atoms, but don't make it easy to encode them.
I'm okay with a pub unknown: Vec<Any> field in most boxes instead of returning an UnexpectedBox. Logging a warning and skipping them is an option too, but obviously not great for extensibility.
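A rough sketch of the unknown-field idea, under stated assumptions: RawBox is a hypothetical stand-in for the Any type mentioned above, and Container is a deliberately tiny made-up box with a single typed child. Neither reflects the library's real definitions. Decoding keeps unrecognised children; encoding writes only the typed ones, which matches the "decode them, but don't make them easy to encode" position in this thread.

```rust
/// Hypothetical stand-in for an untyped box: type code plus raw payload.
#[derive(Clone, Debug, PartialEq)]
pub struct RawBox {
    pub kind: [u8; 4],
    pub body: Vec<u8>,
}

/// Hypothetical, heavily simplified container with one typed child.
#[derive(Debug, Default, PartialEq)]
pub struct Container {
    /// Payload of the one child this sketch understands.
    pub stbl: Option<Vec<u8>>,
    /// Children that are not expected here, kept instead of raising an error.
    pub unknown: Vec<RawBox>,
}

impl Container {
    /// Decodes children with 32-bit size headers; returns None on malformed input.
    pub fn decode(mut buf: &[u8]) -> Option<Self> {
        let mut out = Container::default();
        while buf.len() >= 8 {
            let size = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
            if size < 8 || size > buf.len() {
                return None;
            }
            let kind = [buf[4], buf[5], buf[6], buf[7]];
            let body = buf[8..size].to_vec();
            if &kind == b"stbl" {
                out.stbl = Some(body);
            } else {
                out.unknown.push(RawBox { kind, body });
            }
            buf = &buf[size..];
        }
        // Leftover bytes shorter than a header mean the input was truncated.
        if buf.is_empty() { Some(out) } else { None }
    }

    /// Encodes only the typed child; `unknown` is deliberately not written,
    /// so non-standard structure cannot be round-tripped by accident.
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::new();
        if let Some(body) = &self.stbl {
            out.extend_from_slice(&((8 + body.len()) as u32).to_be_bytes());
            out.extend_from_slice(b"stbl");
            out.extend_from_slice(body);
        }
        out
    }
}
```

Whether encode should drop the unknown children silently, as it does here, or refuse to encode while any are present is a design choice this sketch does not settle.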
Really my main goal for this library is to provide as much strong typing as possible with MP4. The libraries that treat boxes like arbitrary key/value pairs are the most compatible, but also the most difficult to use. We should provide strong types for good behavior but not choke on bad behavior either.