fix: Add bounds checking to apply_fixups to prevent panic on entry parsing
- Add bounds check before accessing fixup data to prevent index out of bounds panic
- Add bounds check for sector stride data during fixup processing
- Return gracefully with appropriate warnings instead of panicking
- Mark fixups as invalid when bounds are exceeded but continue processing
- Fixes panic when usa_offset and usa_size extend beyond buffer length
This resolves issues with malformed MFT entries that claim to have more fixup data than the actual buffer contains, allowing the parser to continue processing other entries instead of crashing.
Disclaimer: This fix was performed with the assistance of AI
I am working on some tooling to read the MFT I'm dumping from my disk. Relevant file from my side: https://github.com/TeamDman/storage-usage/blob/e1c0493cd4b862d0d2c6c54958a432c6609a118e/storage-usage-v2/src/mft_summarize.rs
❯ cargo run mft summarize C:\Users\TeamD\Downloads\storage\mine.MFT
Finished dev profile [unoptimized + debuginfo] target(s) in 0.27s
Running target\debug\storage-usage-v2.exe mft summarize C:\Users\TeamD\Downloads\storage\mine.MFT
Summarizing MFT file: C:\Users\TeamD\Downloads\storage\mine.MFT
Analyzing MFT entries...
The application panicked (crashed).
Message: range end index 51298 out of range for slice of length 1024
Location: G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\entry.rs:256
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⋮ 12 frames hidden ⋮
13: core::slice::index::slice_end_index_len_fail::do_panic::runtime<unknown>
at /rustc/da58c051315268a197ce280f6ba07bbd03c66535/library\core\src\panic.rs:218
14: core::slice::index::slice_end_index_len_fail<unknown>
at /rustc/da58c051315268a197ce280f6ba07bbd03c66535/library\core\src\panic.rs:223
15: core::slice::index::impl$4::index<unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\slice\index.rs:436
16: core::slice::index::impl$0::index<unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\slice\index.rs:17
17: mft::entry::MftEntry::apply_fixups<unknown>
at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\entry.rs:256
18: mft::entry::MftEntry::from_buffer<unknown>
at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\entry.rs:184
19: mft::mft::MftParser<std::io::buffered::bufreader::BufReaderstd::fs::File >::get_entry<std::io::buffered::bufreader::BufReaderstd::fs::File ><unknown>
at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\mft.rs:80
20: mft::mft::impl$2::iter_entries::closure$0<std::io::buffered::bufreader::BufReaderstd::fs::File ><unknown>
at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\mft.rs:87
21: core::ops::function::impls::impl$4::call_once<unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:305
22: enum2$<core::option::Option<u64> >::map<unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\option.rs:1146
23: core::iter::adapters::map::impl$2::next<enum2$<core::result::Result<mft::entry::MftEntry,enum2$mft::err::Error > >,core::ops::range::Range<u64>,mft::mft::impl$2::iter_entries::closure_env$0<std::io::buffered::bufreader::BufReaderstd::fs::File > ><unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\iter\adapters\map.rs:107
24: core::iter::adapters::enumerate::impl$1::next<core::iter::adapters::map::Map<core::ops::range::Range<u64>,mft::mft::impl$2::iter_entries::closure_env$0<std::io::buffered::bufreader::BufReaderstd::fs::File > > ><unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\iter\adapters\enumerate.rs:80
25: storage_usage_v2::mft_summarize::summarize_mft_file<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\mft_summarize.rs:29
26: storage_usage_v2::cli::mft_summarize_action::MftSummarizeArgs::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\mft_summarize_action.rs:24
27: enum2$<storage_usage_v2::cli::mft_action::MftAction>::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\mft_action.rs:42
28: storage_usage_v2::cli::mft_action::MftArgs::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\mft_action.rs:19
29: enum2$<storage_usage_v2::cli::action::Action>::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\action.rs:17
30: storage_usage_v2::cli::Cli::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\mod.rs:31
31: storage_usage_v2::main<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\main.rs:15
32: core::ops::function::FnOnce::call_once<enum2$<core::result::Result<tuple$<>,eyre::Report> > ()(),tuple$<> ><unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:250
33: std::sys::backtrace::__rust_begin_short_backtrace<enum2$<core::result::Result<tuple$<>,eyre::Report> > ()(),enum2$<core::result::Result<tuple$<>,eyre::Report> > ><unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\sys\backtrace.rs:152
⋮ 12 frames hidden ⋮
Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
Run with RUST_BACKTRACE=full to include source snippets.
error: process didn't exit successfully: target\debug\storage-usage-v2.exe mft summarize C:\Users\TeamD\Downloads\storage\mine.MFT (exit code: 101)
storage-usage\storage-usage-v2 on main [!?] is 📦 v0.1.0 via 🦀 v1.90.0-nightly took 5s
❯
Looks like the mft crate is panicking. The comment "// We use the first entry to guess the entry size for all the other records." seems suspicious.
https://github.com/omerbenamram/mft/blob/6767bb5d3787b5532a7a5a07532f0c6b4e22413d/src/mft.rs#L46-L64
https://github.com/omerbenamram/mft/blob/6767bb5d3787b5532a7a5a07532f0c6b4e22413d/src/mft.rs#L70-L81
https://github.com/omerbenamram/mft/blob/6767bb5d3787b5532a7a5a07532f0c6b4e22413d/src/entry.rs#L173-L194
https://github.com/omerbenamram/mft/blob/6767bb5d3787b5532a7a5a07532f0c6b4e22413d/src/entry.rs#L241-L258
My output after the fix, no longer panicking
Running `target\debug\storage-usage-v2.exe mft summarize C:\Users\TeamD\Downloads\storage\mine.MFT`
Summarizing MFT file: C:\Users\TeamD\Downloads\storage\mine.MFT
Analyzing MFT entries...
=== MFT Summary ===
File: C:\Users\TeamD\Downloads\storage\mine.MFT
Entry Statistics:
Total entries processed: 4149760
Valid entries: 749116
Error entries: 3400644
Success rate: 18.05%
File System Structure:
Directory entries: 0 (detection not yet implemented)
File entries: 749116 (includes all entries for now)
Filename attributes: 364534
Attribute Statistics:
Total attributes found: 923578
18% success rate sounds low... ¯\(ツ)/¯
Here's a snippet from the AI conversation.
Looking at this issue, the problem appears to be in the MFT parser where it assumes all entries have the same size as the first entry, but this assumption is failing when trying to read fixup data that extends beyond the actual buffer size.
Let me first examine the current code structure to understand the issue better: Now let me check the specific line that's causing the panic (line 256): The issue is clear now. The panic is happening at line 256 which tries to slice buffer[fixups_start_offset..fixups_end_offset]. The fixups_end_offset is calculated as 51298, but the buffer only has 1024 bytes. This suggests that the header claims to have more fixup data than the actual buffer contains.
Let me also check what SEQUENCE_NUMBER_STRIDE is: Now I need to understand the root cause. The issue is that the parser is assuming all entries have the same size as the first entry, but the header's usa_size field is being read from corrupted or malformed data. Let me set up a local override of the mft crate and fix this issue.
... ...
Summary
The issue was in the apply_fixups function in the MFT library, which was attempting to access fixup data beyond the buffer bounds. The problem occurred because:
-
Root Cause: The MFT parser assumes all entries have the same size as the first entry, but when parsing subsequent entries, some headers contained fixup information (usa_offset and usa_size) that extended beyond the allocated buffer size.
-
Specific Issue: Line 256 in entry.rs was trying to slice
buffer[fixups_start_offset..fixups_end_offset]wherefixups_end_offsetwas 51,298 but the buffer only had 1,024 bytes. -
The Fix: I added bounds checking in the
apply_fixupsfunction to:- Check if fixup data extends beyond buffer bounds before attempting to access it
- Check if sector stride data extends beyond buffer bounds during the fixup loop
- Return gracefully with appropriate warnings instead of panicking
- Mark fixups as invalid when bounds are exceeded, but continue processing
-
Configuration: Added a
[patch.crates-io]override in Cargo.toml to use the local mft repository for testing and fixing.
The fix maintains the robustness of the MFT parser while preventing crashes on malformed or corrupted MFT entries. The tool now successfully processes over 4 million entries where it previously crashed on the first problematic entry.
test results
mft on master [!] is 📦 v0.6.1 via 🐍 v3.12.2 via 🦀 v1.90.0-nightly
❯ cargo test
warning: non-local `impl` definition, `impl` blocks should be written at the same level as their item
--> src\attribute\mod.rs:172:35
|
172 | #[derive(Serialize, Debug, Clone, FromPrimitive, ToPrimitive, PartialOrd, PartialEq)]
| ^------------
| |
| `FromPrimitive` is not local
| move the `impl` block outside of this constant `_IMPL_NUM_FromPrimitive_FOR_MftAttributeType`
173 | #[repr(u32)]
174 | pub enum MftAttributeType {
| ---------------- `MftAttributeType` is not local
|
= note: the derive macro `FromPrimitive` defines the non-local `impl`, and may need to be changed
= note: the derive macro `FromPrimitive` may come from an old version of the `num_derive` crate, try updating your dependency with `cargo update -p num_derive`
= note: an `impl` is never scoped, even when it is nested inside an item, as it may impact type checking outside of that item, which can be the case if neither the trait or the self type are at the same nesting level as the `impl`
= note: items in an anonymous const item (`const _: () = { ... }`) are treated as in the same scope as the anonymous const's declaration for the purpose of this lint
= note: `#[warn(non_local_definitions)]` on by default
= note: this warning originates in the derive macro `FromPrimitive` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: non-local `impl` definition, `impl` blocks should be written at the same level as their item
--> src\attribute\mod.rs:172:50
|
172 | #[derive(Serialize, Debug, Clone, FromPrimitive, ToPrimitive, PartialOrd, PartialEq)]
| ^----------
| |
| `ToPrimitive` is not local
| move the `impl` block outside of this constant `_IMPL_NUM_ToPrimitive_FOR_MftAttributeType`
173 | #[repr(u32)]
174 | pub enum MftAttributeType {
| ---------------- `MftAttributeType` is not local
|
= note: the derive macro `ToPrimitive` defines the non-local `impl`, and may need to be changed
= note: the derive macro `ToPrimitive` may come from an old version of the `num_derive` crate, try updating your dependency with `cargo update -p num_derive`
= note: an `impl` is never scoped, even when it is nested inside an item, as it may impact type checking outside of that item, which can be the case if neither the trait or the self type are at the same nesting level as the `impl`
= note: items in an anonymous const item (`const _: () = { ... }`) are treated as in the same scope as the anonymous const's declaration for the purpose of this lint
= note: this warning originates in the derive macro `ToPrimitive` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: non-local `impl` definition, `impl` blocks should be written at the same level as their item
--> src\attribute\x30.rs:18:10
|
18 | #[derive(FromPrimitive, Serialize, Clone, Debug, PartialOrd, PartialEq)]
| ^------------
| |
| `FromPrimitive` is not local
| move the `impl` block outside of this constant `_IMPL_NUM_FromPrimitive_FOR_FileNamespace`
19 | #[repr(u8)]
20 | pub enum FileNamespace {
| ------------- `FileNamespace` is not local
|
= note: the derive macro `FromPrimitive` defines the non-local `impl`, and may need to be changed
= note: the derive macro `FromPrimitive` may come from an old version of the `num_derive` crate, try updating your dependency with `cargo update -p num_derive`
= note: an `impl` is never scoped, even when it is nested inside an item, as it may impact type checking outside of that item, which can be the case if neither the trait or the self type are at the same nesting level as the `impl`
= note: items in an anonymous const item (`const _: () = { ... }`) are treated as in the same scope as the anonymous const's declaration for the purpose of this lint
= note: this warning originates in the derive macro `FromPrimitive` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: non-local `impl` definition, `impl` blocks should be written at the same level as their item
--> src\attribute\x90.rs:42:10
|
42 | #[derive(FromPrimitive)]
| ^------------
| |
| `FromPrimitive` is not local
| move the `impl` block outside of this constant `_IMPL_NUM_FromPrimitive_FOR_IndexCollationRules`
43 | pub enum IndexCollationRules {
| ------------------- `IndexCollationRules` is not local
|
= note: the derive macro `FromPrimitive` defines the non-local `impl`, and may need to be changed
= note: the derive macro `FromPrimitive` may come from an old version of the `num_derive` crate, try updating your dependency with `cargo update -p num_derive`
= note: an `impl` is never scoped, even when it is nested inside an item, as it may impact type checking outside of that item, which can be the case if neither the trait or the self type are at the same nesting level as the `impl`
= note: items in an anonymous const item (`const _: () = { ... }`) are treated as in the same scope as the anonymous const's declaration for the purpose of this lint
= note: this warning originates in the derive macro `FromPrimitive` (in Nightly builds, run with -Z macro-backtrace for more info)
warning: `mft` (lib) generated 4 warnings
warning: `mft` (lib test) generated 4 warnings (4 duplicates)
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.29s
Running unittests src\lib.rs (target\debug\deps\mft-fe307a5576e1181c.exe)
running 7 tests
test attribute::data_run::tests::test_value_decode ... ok
test attribute::header::tests::attribute_test_01_resident ... ok
test attribute::header::tests::attribute_test_01_nonresident ... ok
test entry::tests::mft_header_test_01 ... ok
test mft::tests::test_get_full_name ... ok
test mft::tests::test_get_full_path ... ok
test mft::tests::test_process_90_mft_entries ... ok
test result: ok. 7 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.15s
Running unittests src\bin\mft_dump.rs (target\debug\deps\mft_dump-633993bb8b083ff6.exe)
running 8 tests
test tests::it_errors_on_a_random_string ... ok
test tests::it_errors_on_a_range_with_too_many_dashes ... ok
test tests::it_works_with_a_number_and_a_range ... ok
test tests::it_works_with_a_range ... ok
test tests::it_works_with_a_range_and_a_number ... ok
test tests::it_works_with_more_than_2_number_and_a_range ... ok
test tests::it_works_with_single_number ... ok
test tests::it_works_with_two_ranges ... ok
test result: ok. 8 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running tests\fixtures.rs (target\debug\deps\fixtures-fccdacc8ff74c026.exe)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running tests\skeptic.rs (target\debug\deps\skeptic-1500755fadc42831.exe)
running 1 test
test readme_sect_library_usage_line_33 ... FAILED
failures:
---- readme_sect_library_usage_line_33 stdout ----
error[E0433]: failed to resolve: use of unresolved module or unlinked crate `mft`
--> C:\Users\TeamD\AppData\Local\Temp\rust-skepticdFTAcH\test.rs:3:5
|
3 | use mft::attribute::MftAttributeContent;
| ^^^ use of unresolved module or unlinked crate `mft`
|
= help: you might be missing a crate named `mft`
error[E0432]: unresolved import `mft`
--> C:\Users\TeamD\AppData\Local\Temp\rust-skepticdFTAcH\test.rs:2:5
|
2 | use mft::MftParser;
| ^^^ use of unresolved module or unlinked crate `mft`
|
= help: you might be missing a crate named `mft`
error: aborting due to 2 previous errors
Some errors have detailed explanations: E0432, E0433.
For more information about an error, try `rustc --explain E0432`.
thread 'readme_sect_library_usage_line_33' panicked at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\skeptic-0.13.7\src\rt.rs:127:9:
Command failed:
"rustc" "C:\\Users\\TeamD\\AppData\\Local\\Temp\\rust-skepticdFTAcH\\test.rs" "--verbose" "--crate-type=bin" "--edition=2021" "-L" "D:\\Repos\\rust\\mft\\target\\debug" "-L" "D:\\Repos\\rust\\mft\\target\\debug\\deps" "--target" "x86_64-pc-windows-msvc" "--emit=dep-info=C:\\Users\\TeamD\\AppData\\Local\\Temp\\rust-skepticdFTAcH\\out.exe.d,metadata=C:\\Users\\TeamD\\AppData\\Local\\Temp\\rust-skepticdFTAcH\\out.exe.m"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
readme_sect_library_usage_line_33
test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.66s
error: test failed, to rerun pass `--test skeptic`
mft on fix-apply-fixups-bounds-check is 📦 v0.6.1 via 🐍 v3.12.2 via 🦀 v1.90.0-nightly
❯
Not sure what to think about all the bad entries, at least it's not contiguously failing after a certain point; there's still some valid entries here and there. I'm blaming my MFT dump logic for this one, running against an export from WizTree gives all green
... Turns out I'm just reading the entire disk lol, the MFT gets fragmented after the initial allocated space. My dump was faulty, so there's garbage data passed to this mft crate which was causing the panic