feat: Collects unexpected boxes during decoding
closes #63
This change introduces the ability to collect and retain any unexpected boxes encountered during the decoding process.
Previously, the decoder would give an Error::unexpectedBox when encounters an unexpected box thus failing the decode process. Now, these boxes are stored in a unexpected Vec within their parent structures (Moov, Trak, Mdia, Minf, Stbl, etc.).
This allows for more robust handling of malformed or extended MP4 files, as the application can now inspect and process unexpected boxes as needed.
Dones
- adds an
unexpectedfield to any parent boxes, aggregating unexpected boxes during decoding- also adds for any SampleEntry boxes
- fixes test code break brought by this new field
- documents the fault-tolerant feature at both README and crate-level doc
Summary by CodeRabbit
-
New Features
- Feature-gated fault-tolerant parsing: unknown child boxes can be collected during decode (decode may succeed) and are not re-encoded.
-
Documentation
- Added “Fault-Tolerant Parsing” section explaining behavior, configuration example, and notes.
-
Public API
- Many container types gain a feature-gated unexpected field; several types no longer auto-derive Eq (equality semantics change when feature enabled).
-
Tests
- Expanded tests for fault-tolerant decoding, multiple unexpected boxes, and encode/decode expectations.
Walkthrough
Adds a Cargo feature fault-tolerant that, when enabled, collects unrecognized child boxes into a feature-gated unexpected: Vec<Any> field on many container and codec structs instead of returning UnexpectedBox errors. Decode loops in atom/box decoders push unknown atoms (with a warning) into that collection under the feature flag; encoding does not write unexpected entries. Numerous structs and tests were updated to include the field and several Eq derives were removed. Documentation and feature-gated tests for the behavior were added.
Pre-merge checks and finishing touches
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "feat: Collects unexpected boxes during decoding" accurately and concisely describes the primary change in this changeset. The title clearly communicates the main objective—implementing a feature that collects unexpected boxes during decoding rather than immediately failing—which aligns directly with the core changes visible across the codebase (adding an unexpected: Vec<Any> field, updating decode paths, and gating the behavior behind a feature flag). The title is specific and meaningful without being overly verbose or using vague terminology. |
| Linked Issues Check | ✅ Passed | The changes meet the core objective from linked issue #63, which reported that decoding a minf box containing an hdlr box resulted in Error::UnexpectedBox and requested support for this box per ISO/IEC 14496-12:2022. The PR implements a general fault-tolerant parsing system that collects unexpected boxes (including hdlr) into an unexpected: Vec<Any> field when the fault-tolerant feature is enabled, preventing decoding failures. While the solution takes a broader approach than specifically adding hdlr support, it achieves the stated objective by allowing any unexpected box—including hdlr in minf—to be tolerated during decoding when the feature is activated. When the feature is disabled, the original strict error behavior is preserved. |
| Out of Scope Changes Check | ✅ Passed | The code changes are directly related to implementing the fault-tolerant decoding feature described in the PR objectives. All structural modifications—adding unexpected: Vec<Any> fields, updating decode paths to collect unknown atoms, modifying tests, and adding feature documentation—are integral to this feature. The removal of Eq from derive lists across multiple structs, while affecting the public API, appears to be a necessary technical consequence of adding unexpected fields containing Any values that do not implement Eq. No changes appear to be unrelated to the fault-tolerant feature implementation or issue #63 resolution. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
✨ Finishing touches
- [ ] 📝 Generate docstrings
🧪 Generate unit tests (beta)
- [ ] Create PR with unit tests
- [ ] Post copyable unit tests in a comment
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
The bot review comments include a part about skipping serialisation, which I'm not confident about, but might be worth considering more seriously.
I am not sure about the skipping serialization part either. The serde feature here is about outputting a json, not the actual file. We sure can skip this as the bot suggested. But then do we skip all other empty Vec for consistency?
If we are going to do this, it probably should work the same for meta (which is a pseudo-container).
The MetaBox (14496-12 8.11.1.2) is kind of special IMO, it currently haven't been well-implemented yet[^1]. Maybe we should leave it to another dedicated pr?
It possibly should be the same way we handle extra stuff below SampleEntry. That is, given this part from ISO/IEC 14496-12 8.5.2.1
I don't see that paragraph from my specs(14496-12/2012), can you double confirm is that the case before I go ahead and put the logic for these boxes?
[^1]: Unexpected boxes may only be used when we have all the expected box field in it. Currently we don't, so doing it along with this PR may be a bit messy?
If we are going to do this, it probably should work the same for meta (which is a pseudo-container).
The MetaBox (14496-12 8.11.1.2) is kind of special IMO, it currently haven't been well-implemented yet1. Maybe we should leave it to another dedicated pr?
I'm OK with that.
It possibly should be the same way we handle extra stuff below SampleEntry. That is, given this part from ISO/IEC 14496-12 8.5.2.1
I don't see that paragraph from my specs(14496-12/2012), can you double confirm is that the case before I go ahead and put the logic for these boxes?
Here is a snip from ISO/IEC 14496-12:2022:
It is also in some derived specs, for example the last paragraph in this section of the AV1 bindings: https://aomediacodec.github.io/av1-isobmff/#av1sampleentry-semantics