mft icon indicating copy to clipboard operation
mft copied to clipboard

fix: Add bounds checking to apply_fixups to prevent panic on entry parsing

Open TeamDman opened this issue 5 months ago • 1 comments

  • Add bounds check before accessing fixup data to prevent index out of bounds panic
  • Add bounds check for sector stride data during fixup processing
  • Return gracefully with appropriate warnings instead of panicking
  • Mark fixups as invalid when bounds are exceeded but continue processing
  • Fixes panic when usa_offset and usa_size extend beyond buffer length

This resolves issues with malformed MFT entries that claim to have more fixup data than the actual buffer contains, allowing the parser to continue processing other entries instead of crashing.

Disclaimer: This fix was performed with the assistance of AI


I am working on some tooling to read the MFT I'm dumping from my disk. Relevant file from my side: https://github.com/TeamDman/storage-usage/blob/e1c0493cd4b862d0d2c6c54958a432c6609a118e/storage-usage-v2/src/mft_summarize.rs

❯ cargo run mft summarize C:\Users\TeamD\Downloads\storage\mine.MFT
Finished dev profile [unoptimized + debuginfo] target(s) in 0.27s
Running target\debug\storage-usage-v2.exe mft summarize C:\Users\TeamD\Downloads\storage\mine.MFT
Summarizing MFT file: C:\Users\TeamD\Downloads\storage\mine.MFT

Analyzing MFT entries...
The application panicked (crashed).
Message: range end index 51298 out of range for slice of length 1024
Location: G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\entry.rs:256

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⋮ 12 frames hidden ⋮
13: core::slice::index::slice_end_index_len_fail::do_panic::runtime<unknown>
at /rustc/da58c051315268a197ce280f6ba07bbd03c66535/library\core\src\panic.rs:218
14: core::slice::index::slice_end_index_len_fail<unknown>
at /rustc/da58c051315268a197ce280f6ba07bbd03c66535/library\core\src\panic.rs:223
15: core::slice::index::impl$4::index<unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\slice\index.rs:436
16: core::slice::index::impl$0::index<unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\slice\index.rs:17
17: mft::entry::MftEntry::apply_fixups<unknown>
at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\entry.rs:256
18: mft::entry::MftEntry::from_buffer<unknown>
at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\entry.rs:184
19: mft::mft::MftParser<std::io::buffered::bufreader::BufReaderstd::fs::File >::get_entry<std::io::buffered::bufreader::BufReaderstd::fs::File ><unknown>
at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\mft.rs:80
20: mft::mft::impl$2::iter_entries::closure$0<std::io::buffered::bufreader::BufReaderstd::fs::File ><unknown>
at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\mft-0.6.1\src\mft.rs:87
21: core::ops::function::impls::impl$4::call_once<unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:305
22: enum2$<core::option::Option<u64> >::map<unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\option.rs:1146
23: core::iter::adapters::map::impl$2::next<enum2$<core::result::Result<mft::entry::MftEntry,enum2$mft::err::Error > >,core::ops::range::Range<u64>,mft::mft::impl$2::iter_entries::closure_env$0<std::io::buffered::bufreader::BufReaderstd::fs::File > ><unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\iter\adapters\map.rs:107
24: core::iter::adapters::enumerate::impl$1::next<core::iter::adapters::map::Map<core::ops::range::Range<u64>,mft::mft::impl$2::iter_entries::closure_env$0<std::io::buffered::bufreader::BufReaderstd::fs::File > > ><unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\iter\adapters\enumerate.rs:80
25: storage_usage_v2::mft_summarize::summarize_mft_file<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\mft_summarize.rs:29
26: storage_usage_v2::cli::mft_summarize_action::MftSummarizeArgs::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\mft_summarize_action.rs:24
27: enum2$<storage_usage_v2::cli::mft_action::MftAction>::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\mft_action.rs:42
28: storage_usage_v2::cli::mft_action::MftArgs::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\mft_action.rs:19
29: enum2$<storage_usage_v2::cli::action::Action>::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\action.rs:17
30: storage_usage_v2::cli::Cli::run<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\cli\mod.rs:31
31: storage_usage_v2::main<unknown>
at D:\Repos\rust\storage-usage\storage-usage-v2\src\main.rs:15
32: core::ops::function::FnOnce::call_once<enum2$<core::result::Result<tuple$<>,eyre::Report> > ()(),tuple$<> ><unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:250
33: std::sys::backtrace::__rust_begin_short_backtrace<enum2$<core::result::Result<tuple$<>,eyre::Report> > ()(),enum2$<core::result::Result<tuple$<>,eyre::Report> > ><unknown>
at G:\Programming\Caches\RUSTUP_HOME\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\sys\backtrace.rs:152
⋮ 12 frames hidden ⋮

Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
Run with RUST_BACKTRACE=full to include source snippets.
error: process didn't exit successfully: target\debug\storage-usage-v2.exe mft summarize C:\Users\TeamD\Downloads\storage\mine.MFT (exit code: 101)
storage-usage\storage-usage-v2 on  main [!?] is 📦 v0.1.0 via 🦀 v1.90.0-nightly took 5s
❯

Looks like the mft crate is panicking. The comment "// We use the first entry to guess the entry size for all the other records." seems suspicious.

https://github.com/omerbenamram/mft/blob/6767bb5d3787b5532a7a5a07532f0c6b4e22413d/src/mft.rs#L46-L64

https://github.com/omerbenamram/mft/blob/6767bb5d3787b5532a7a5a07532f0c6b4e22413d/src/mft.rs#L70-L81

https://github.com/omerbenamram/mft/blob/6767bb5d3787b5532a7a5a07532f0c6b4e22413d/src/entry.rs#L173-L194

https://github.com/omerbenamram/mft/blob/6767bb5d3787b5532a7a5a07532f0c6b4e22413d/src/entry.rs#L241-L258

My output after the fix, no longer panicking

     Running `target\debug\storage-usage-v2.exe mft summarize C:\Users\TeamD\Downloads\storage\mine.MFT`
Summarizing MFT file: C:\Users\TeamD\Downloads\storage\mine.MFT

Analyzing MFT entries...

=== MFT Summary ===
File: C:\Users\TeamD\Downloads\storage\mine.MFT

Entry Statistics:
  Total entries processed: 4149760
  Valid entries: 749116
  Error entries: 3400644
  Success rate: 18.05%

File System Structure:
  Directory entries: 0 (detection not yet implemented)
  File entries: 749116 (includes all entries for now)
  Filename attributes: 364534

Attribute Statistics:
  Total attributes found: 923578

18% success rate sounds low... ¯\(ツ)

Here's a snippet from the AI conversation.


Looking at this issue, the problem appears to be in the MFT parser where it assumes all entries have the same size as the first entry, but this assumption is failing when trying to read fixup data that extends beyond the actual buffer size.

Let me first examine the current code structure to understand the issue better: Now let me check the specific line that's causing the panic (line 256): The issue is clear now. The panic is happening at line 256 which tries to slice buffer[fixups_start_offset..fixups_end_offset]. The fixups_end_offset is calculated as 51298, but the buffer only has 1024 bytes. This suggests that the header claims to have more fixup data than the actual buffer contains.

Let me also check what SEQUENCE_NUMBER_STRIDE is: Now I need to understand the root cause. The issue is that the parser is assuming all entries have the same size as the first entry, but the header's usa_size field is being read from corrupted or malformed data. Let me set up a local override of the mft crate and fix this issue.

... ...

Summary

The issue was in the apply_fixups function in the MFT library, which was attempting to access fixup data beyond the buffer bounds. The problem occurred because:

  1. Root Cause: The MFT parser assumes all entries have the same size as the first entry, but when parsing subsequent entries, some headers contained fixup information (usa_offset and usa_size) that extended beyond the allocated buffer size.

  2. Specific Issue: Line 256 in entry.rs was trying to slice buffer[fixups_start_offset..fixups_end_offset] where fixups_end_offset was 51,298 but the buffer only had 1,024 bytes.

  3. The Fix: I added bounds checking in the apply_fixups function to:

    • Check if fixup data extends beyond buffer bounds before attempting to access it
    • Check if sector stride data extends beyond buffer bounds during the fixup loop
    • Return gracefully with appropriate warnings instead of panicking
    • Mark fixups as invalid when bounds are exceeded, but continue processing
  4. Configuration: Added a [patch.crates-io] override in Cargo.toml to use the local mft repository for testing and fixing.

The fix maintains the robustness of the MFT parser while preventing crashes on malformed or corrupted MFT entries. The tool now successfully processes over 4 million entries where it previously crashed on the first problematic entry.

test results
mft on  master [!] is 📦 v0.6.1 via 🐍 v3.12.2 via 🦀 v1.90.0-nightly
❯ cargo test
warning: non-local `impl` definition, `impl` blocks should be written at the same level as their item
   --> src\attribute\mod.rs:172:35
    |
172 | #[derive(Serialize, Debug, Clone, FromPrimitive, ToPrimitive, PartialOrd, PartialEq)]
    |                                   ^------------
    |                                   |
    |                                   `FromPrimitive` is not local
    |                                   move the `impl` block outside of this constant `_IMPL_NUM_FromPrimitive_FOR_MftAttributeType`
173 | #[repr(u32)]
174 | pub enum MftAttributeType {
    |          ---------------- `MftAttributeType` is not local
    |
    = note: the derive macro `FromPrimitive` defines the non-local `impl`, and may need to be changed
    = note: the derive macro `FromPrimitive` may come from an old version of the `num_derive` crate, try updating your dependency with `cargo update -p num_derive`
    = note: an `impl` is never scoped, even when it is nested inside an item, as it may impact type checking outside of that item, which can be the case if neither the trait or the self type are at the same nesting level as the `impl`
    = note: items in an anonymous const item (`const _: () = { ... }`) are treated as in the same scope as the anonymous const's declaration for the purpose of this lint
    = note: `#[warn(non_local_definitions)]` on by default
    = note: this warning originates in the derive macro `FromPrimitive` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: non-local `impl` definition, `impl` blocks should be written at the same level as their item
   --> src\attribute\mod.rs:172:50
    |
172 | #[derive(Serialize, Debug, Clone, FromPrimitive, ToPrimitive, PartialOrd, PartialEq)]
    |                                                  ^----------
    |                                                  |
    |                                                  `ToPrimitive` is not local
    |                                                  move the `impl` block outside of this constant `_IMPL_NUM_ToPrimitive_FOR_MftAttributeType`
173 | #[repr(u32)]
174 | pub enum MftAttributeType {
    |          ---------------- `MftAttributeType` is not local
    |
    = note: the derive macro `ToPrimitive` defines the non-local `impl`, and may need to be changed
    = note: the derive macro `ToPrimitive` may come from an old version of the `num_derive` crate, try updating your dependency with `cargo update -p num_derive`
    = note: an `impl` is never scoped, even when it is nested inside an item, as it may impact type checking outside of that item, which can be the case if neither the trait or the self type are at the same nesting level as the `impl`
    = note: items in an anonymous const item (`const _: () = { ... }`) are treated as in the same scope as the anonymous const's declaration for the purpose of this lint
    = note: this warning originates in the derive macro `ToPrimitive` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: non-local `impl` definition, `impl` blocks should be written at the same level as their item
  --> src\attribute\x30.rs:18:10
   |
18 | #[derive(FromPrimitive, Serialize, Clone, Debug, PartialOrd, PartialEq)]
   |          ^------------
   |          |
   |          `FromPrimitive` is not local
   |          move the `impl` block outside of this constant `_IMPL_NUM_FromPrimitive_FOR_FileNamespace`
19 | #[repr(u8)]
20 | pub enum FileNamespace {
   |          ------------- `FileNamespace` is not local
   |
   = note: the derive macro `FromPrimitive` defines the non-local `impl`, and may need to be changed
   = note: the derive macro `FromPrimitive` may come from an old version of the `num_derive` crate, try updating your dependency with `cargo update -p num_derive`
   = note: an `impl` is never scoped, even when it is nested inside an item, as it may impact type checking outside of that item, which can be the case if neither the trait or the self type are at the same nesting level as the `impl`
   = note: items in an anonymous const item (`const _: () = { ... }`) are treated as in the same scope as the anonymous const's declaration for the purpose of this lint
   = note: this warning originates in the derive macro `FromPrimitive` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: non-local `impl` definition, `impl` blocks should be written at the same level as their item
  --> src\attribute\x90.rs:42:10
   |
42 | #[derive(FromPrimitive)]
   |          ^------------
   |          |
   |          `FromPrimitive` is not local
   |          move the `impl` block outside of this constant `_IMPL_NUM_FromPrimitive_FOR_IndexCollationRules`
43 | pub enum IndexCollationRules {
   |          ------------------- `IndexCollationRules` is not local
   |
   = note: the derive macro `FromPrimitive` defines the non-local `impl`, and may need to be changed
   = note: the derive macro `FromPrimitive` may come from an old version of the `num_derive` crate, try updating your dependency with `cargo update -p num_derive`
   = note: an `impl` is never scoped, even when it is nested inside an item, as it may impact type checking outside of that item, which can be the case if neither the trait or the self type are at the same nesting level as the `impl`
   = note: items in an anonymous const item (`const _: () = { ... }`) are treated as in the same scope as the anonymous const's declaration for the purpose of this lint
   = note: this warning originates in the derive macro `FromPrimitive` (in Nightly builds, run with -Z macro-backtrace for more info)

warning: `mft` (lib) generated 4 warnings
warning: `mft` (lib test) generated 4 warnings (4 duplicates)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.29s
     Running unittests src\lib.rs (target\debug\deps\mft-fe307a5576e1181c.exe)

running 7 tests
test attribute::data_run::tests::test_value_decode ... ok
test attribute::header::tests::attribute_test_01_resident ... ok
test attribute::header::tests::attribute_test_01_nonresident ... ok
test entry::tests::mft_header_test_01 ... ok
test mft::tests::test_get_full_name ... ok
test mft::tests::test_get_full_path ... ok
test mft::tests::test_process_90_mft_entries ... ok

test result: ok. 7 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.15s

     Running unittests src\bin\mft_dump.rs (target\debug\deps\mft_dump-633993bb8b083ff6.exe)

running 8 tests
test tests::it_errors_on_a_random_string ... ok
test tests::it_errors_on_a_range_with_too_many_dashes ... ok
test tests::it_works_with_a_number_and_a_range ... ok
test tests::it_works_with_a_range ... ok
test tests::it_works_with_a_range_and_a_number ... ok
test tests::it_works_with_more_than_2_number_and_a_range ... ok
test tests::it_works_with_single_number ... ok
test tests::it_works_with_two_ranges ... ok

test result: ok. 8 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests\fixtures.rs (target\debug\deps\fixtures-fccdacc8ff74c026.exe)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests\skeptic.rs (target\debug\deps\skeptic-1500755fadc42831.exe)

running 1 test
test readme_sect_library_usage_line_33 ... FAILED

failures:

---- readme_sect_library_usage_line_33 stdout ----
error[E0433]: failed to resolve: use of unresolved module or unlinked crate `mft`
 --> C:\Users\TeamD\AppData\Local\Temp\rust-skepticdFTAcH\test.rs:3:5
  |
3 | use mft::attribute::MftAttributeContent;
  |     ^^^ use of unresolved module or unlinked crate `mft`
  |
  = help: you might be missing a crate named `mft`

error[E0432]: unresolved import `mft`
 --> C:\Users\TeamD\AppData\Local\Temp\rust-skepticdFTAcH\test.rs:2:5
  |
2 | use mft::MftParser;
  |     ^^^ use of unresolved module or unlinked crate `mft`
  |
  = help: you might be missing a crate named `mft`

error: aborting due to 2 previous errors

Some errors have detailed explanations: E0432, E0433.
For more information about an error, try `rustc --explain E0432`.

thread 'readme_sect_library_usage_line_33' panicked at G:\Programming\Caches\CARGO_HOME\registry\src\index.crates.io-1949cf8c6b5b557f\skeptic-0.13.7\src\rt.rs:127:9:
Command failed:
"rustc" "C:\\Users\\TeamD\\AppData\\Local\\Temp\\rust-skepticdFTAcH\\test.rs" "--verbose" "--crate-type=bin" "--edition=2021" "-L" "D:\\Repos\\rust\\mft\\target\\debug" "-L" "D:\\Repos\\rust\\mft\\target\\debug\\deps" "--target" "x86_64-pc-windows-msvc" "--emit=dep-info=C:\\Users\\TeamD\\AppData\\Local\\Temp\\rust-skepticdFTAcH\\out.exe.d,metadata=C:\\Users\\TeamD\\AppData\\Local\\Temp\\rust-skepticdFTAcH\\out.exe.m"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    readme_sect_library_usage_line_33

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.66s

error: test failed, to rerun pass `--test skeptic`
mft on  fix-apply-fixups-bounds-check is 📦 v0.6.1 via 🐍 v3.12.2 via 🦀 v1.90.0-nightly 
❯

TeamDman avatar Jul 05 '25 23:07 TeamDman

image

image

Not sure what to think about all the bad entries, at least it's not contiguously failing after a certain point; there's still some valid entries here and there. I'm blaming my MFT dump logic for this one, running against an export from WizTree gives all green

image

image

... Turns out I'm just reading the entire disk lol, the MFT gets fragmented after the initial allocated space. My dump was faulty, so there's garbage data passed to this mft crate which was causing the panic

TeamDman avatar Jul 06 '25 04:07 TeamDman