rust-minidump
Add the ability to sanitize minidumps
Often, when we see a crash in a video driver, the vendor will ask for a minidump so they can diagnose the problem. However, we can't release the minidumps that we collect to the vendor without permission from the users, which is slow and inconvenient to gather. It would be really nice if we had a tool that could take a minidump and write 0 to every address on the stack that was not used for stackwalking. We could then presumably pass the minidump on to vendors, and they could still load it in WinDbg etc.
This crate doesn't currently have any facilities for writing or modifying minidumps. However, if you read a minidump into memory and iterate over the contained memory regions, they expose their contents as byte slices borrowed from the minidump bytes, so it shouldn't be awful to modify them and overwrite the bytes in place in the dump.
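As a rough illustration of the in-place overwrite, here is a minimal sketch using only the standard library. It assumes you already know where a region's bytes live in the file, for example from the Rva and DataSize fields of its MINIDUMP_LOCATION_DESCRIPTOR. `zero_range` is a hypothetical helper, not something this crate provides.

```rust
/// Zero `data_size` bytes of the raw dump, starting at file offset `rva`.
/// Hypothetical helper: `rva` and `data_size` are assumed to come from a
/// MINIDUMP_LOCATION_DESCRIPTOR that the caller has already parsed.
pub fn zero_range(dump: &mut [u8], rva: u32, data_size: u32) -> Result<(), String> {
    let start = rva as usize;
    let end = start
        .checked_add(data_size as usize)
        .ok_or_else(|| "range overflows usize".to_string())?;
    // `get_mut` returns None instead of panicking when the range is out of bounds.
    let region = dump
        .get_mut(start..end)
        .ok_or_else(|| format!("range {start}..{end} is outside the dump"))?;
    region.fill(0);
    Ok(())
}
```

Returning an error for an out-of-bounds descriptor, rather than panicking, seems like the safer choice for a tool that may be fed truncated or corrupt dumps.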
Note that the MiniDumpWriteDump API's MINIDUMP_TYPE parameter has a few values intended for exactly this use case, filtering out nonessential data while writing the dump:
- MiniDumpFilterMemory: Stack and backing store memory written to the minidump file should be filtered to remove all but the pointer values necessary to reconstruct a stack trace.
- MiniDumpFilterModulePaths: Filter module paths for information such as user names or important directories. This option may prevent the system from locating the image file and should be used only in special situations.
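MINIDUMP_TYPE values are bit flags that are combined with a bitwise OR. A small sketch of the two flags above follows; the numeric values are the ones from the Windows SDK enumeration, while the Rust constant names and the `redacted_dump_type` helper are made up for illustration. Actually calling MiniDumpWriteDump is not shown.

```rust
// Numeric values from the Windows MINIDUMP_TYPE enumeration.
// The constant names below are illustrative, not from any crate.
pub const MINI_DUMP_NORMAL: u32 = 0x0000_0000;
pub const MINI_DUMP_FILTER_MEMORY: u32 = 0x0000_0008;
pub const MINI_DUMP_FILTER_MODULE_PATHS: u32 = 0x0000_0080;

/// Hypothetical helper: the flag combination for a dump with both
/// stack-memory filtering and module-path filtering enabled.
pub fn redacted_dump_type() -> u32 {
    MINI_DUMP_NORMAL | MINI_DUMP_FILTER_MEMORY | MINI_DUMP_FILTER_MODULE_PATHS
}
```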
With this advice, I've created a program that does this: https://github.com/jrmuizel/minidump-filter
Nice! That said, it doesn't quite look like you're meeting your goal of preserving bytes "used for stackwalking". You're deleting anything that's:
- used for non-trivial CFI (saved stack sizes and arbitrary pushed registers, though those are sketchier to keep because they can contain fragments of user data)
- a frame pointer
If you just want to save the frame pointers, you can also remember which ranges were stack frames and preserve pointers to those addresses too.
If you want to get really clever then you can:
- run minidump-processor to get the stack unwind
- iterate over each stack frame's memory (from the base of the frame) and save the first copy of each value that matches a register-in-the-caller, on the assumption that those are callee-saved (use a HashSet and remove a value from the set whenever you encounter it, to avoid any cute shenanigans with a special value getting repeated throughout the stack frame).
This is probably overkill and preserving all the pointers into the stack is probably 99% adequate.
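For what it's worth, the callee-save idea above can be sketched roughly like this. The sketch assumes 64-bit little-endian words, a frame slice in ascending address order, and a downward-growing stack (so the frame base is at the high-address end and scanning starts there). The caller's register values are assumed to come from the unwinder output, which is not shown; `scrub_frame_keeping_callee_saves` is a hypothetical name.

```rust
use std::collections::HashSet;

/// Zero every 8-byte word of `frame` except the first occurrence (scanning
/// from the high-address end) of each value in `caller_regs`.
/// Hypothetical helper; `caller_regs` would come from a prior stack unwind.
pub fn scrub_frame_keeping_callee_saves(frame: &mut [u8], caller_regs: &[u64]) {
    let mut wanted: HashSet<u64> = caller_regs.iter().copied().collect();

    // `rchunks_exact_mut` leaves any partial word at the start of the slice;
    // it cannot hold a full register value, so zero it.
    let remainder = frame.len() % 8;
    frame[..remainder].fill(0);

    // Walk whole words from the end of the slice toward the start.
    for word in frame.rchunks_exact_mut(8) {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(word);
        let value = u64::from_le_bytes(buf);
        // `remove` returns true only the first time a wanted value is seen,
        // so repeated copies of the same value are scrubbed.
        if !wanted.remove(&value) {
            word.fill(0);
        }
    }
}
```

Removing each value from the set on first sight is what keeps a repeated "special" value from surviving in more than one slot.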
I added preservation of self references to memory regions and that should fix frame pointers being used during unwinding. This fix is sufficient to get Visual Studio to unwind the same way in the filtered minidump as it does in the original.
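A minimal sketch of this kind of filter, not the actual minidump-filter code: every 8-byte word that does not look like a pointer into one of the dump's own memory regions is zeroed. It assumes 64-bit little-endian words, and that the address ranges of the regions (or of individual stack frames) have already been collected; `scrub_non_pointers` is a hypothetical name.

```rust
use std::ops::Range;

/// Zero every 8-byte word of `stack` whose value does not fall inside one of
/// `keep_ranges`. Returns the number of words that were preserved.
/// Hypothetical helper; `keep_ranges` would be the address ranges of the
/// memory regions (or stack frames) recorded in the dump.
pub fn scrub_non_pointers(stack: &mut [u8], keep_ranges: &[Range<u64>]) -> usize {
    // Any trailing partial word cannot hold a pointer, so zero it.
    let whole = stack.len() - stack.len() % 8;
    stack[whole..].fill(0);

    let mut kept = 0;
    for word in stack.chunks_exact_mut(8) {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(word);
        let value = u64::from_le_bytes(buf);
        if keep_ranges.iter().any(|r| r.contains(&value)) {
            kept += 1; // looks like a self-reference, e.g. a saved frame pointer
        } else {
            word.fill(0);
        }
    }
    kept
}
```

A linear scan over the ranges is fine for a handful of regions; with many regions, a sorted list and binary search would be the obvious next step.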
I doubt this is a concern for your use case, but just so you know it's a thing, Apple arm64 needs some extra handling due to pointer authentication (ptr auth):
https://github.com/rust-minidump/rust-minidump/blob/ebd5dd0f50a6d2f86286181e268a73d44ede7f58/minidump-processor/src/stackwalker/arm64_old.rs#L249-L261
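To illustrate why this matters for scrubbing: with pointer authentication, a saved pointer may carry signature bits in its upper bits, so it would fail a plain "does this point into the stack?" range check unless those bits are masked off first. The sketch below is a simplified stand-in for the linked code, not a copy of it. The `max_valid_addr` parameter and the 47-bit value in the usage note are assumptions for illustration.

```rust
/// Mask off everything above the highest bit used by `max_valid_addr`.
/// Hypothetical helper: the caller chooses `max_valid_addr`, e.g. the end of
/// the highest mapped module or an assumed address-space limit.
pub fn strip_ptr_auth(ptr: u64, max_valid_addr: u64) -> u64 {
    if max_valid_addr == 0 {
        return 0; // no valid address bits at all; avoids shifting by 64
    }
    // All-ones mask covering exactly the bits that `max_valid_addr` can use.
    let mask = u64::MAX >> max_valid_addr.leading_zeros();
    ptr & mask
}
```

For example, with an assumed limit of `(1 << 47) - 1`, `strip_ptr_auth(0xABCD_0001_0000_1000, (1 << 47) - 1)` yields `0x0000_0001_0000_1000`, which can then be range-checked like any other pointer.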
FWIW, Sentry (inside Relay) also scrubs minidumps. We did not go through the effort of ensuring we can still stackwalk afterwards; instead we let customers scrub/mask whatever they want in given regions (we don't really want to start stackwalking at this stage of the processing pipeline), so this is much more rudimentary than what's described here.
@jrmuizel Nice! If you wanted to compare your implementation against Microsoft's, WinDbg's .dump command will let you select those options. You could debug a process, generate a standard minidump with .dump /m not-redacted.dmp, then generate a dump from the same process state with those redactions using .dump /mrR redacted.dmp.
@flub interesting, thanks for the pointer!
I'm not actually sure what heuristics MiniDumpFilterMemory uses, so some comparison testing would be interesting.
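One way to start such a comparison, as a rough sketch: take the same stack region from the unredacted dump and from a redacted one, and count how many 8-byte words survived, were zeroed, or were changed to something else. Extracting the region bytes from each dump is not shown, and `diff_words` and `WordDiff` are hypothetical names.

```rust
/// Per-word summary of how a redacted region differs from the original.
#[derive(Debug, PartialEq, Eq)]
pub struct WordDiff {
    pub unchanged: usize,
    pub zeroed: usize,
    pub altered: usize,
}

/// Compare two copies of the same memory region word by word.
/// Returns None if the regions differ in length. Trailing bytes that do not
/// form a whole 8-byte word are ignored.
pub fn diff_words(original: &[u8], redacted: &[u8]) -> Option<WordDiff> {
    if original.len() != redacted.len() {
        return None;
    }
    let mut diff = WordDiff { unchanged: 0, zeroed: 0, altered: 0 };
    for (a, b) in original.chunks_exact(8).zip(redacted.chunks_exact(8)) {
        if a == b {
            diff.unchanged += 1;
        } else if b.iter().all(|&byte| byte == 0) {
            diff.zeroed += 1;
        } else {
            diff.altered += 1;
        }
    }
    Some(diff)
}
```

Running this on the output of two different filters against the same original would show where their heuristics disagree about which words to keep.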