Rust language analysis
Is your feature request related to a problem? Please describe.
As far as I know, no reverse engineering framework really supports programs written in Rust, which is becoming a fairly popular language. I don't consider this a priority, since Rust is still far less common than C, but it would be nice to have.
Describe the solution you'd like
Right now, the Rust standard library and its types are not recognized, which renders the decompiler nearly useless: it only really handles conditional expressions and loops well. Also, most Rust programs handle crashes by unwinding, which stores the real name of every function somewhere in the binary, while the names Ghidra recognizes look seemingly random. For someone familiar with the Rust compiler's inner workings, this should be relatively easy to do and, by itself, extremely helpful.
How does this relate to Ghidra?
The request is to add some support for Rust analysis in Ghidra, mostly demangling function names, since Ghidra currently detects them as namespaces instead.
The Rust toolchain mangles function names and appends a hash to each symbol. This explains the process in detail: Here
However, it should be possible to recover some information from the mangled names through reverse engineering. Could the references then be highlighted per function, as a heuristic result?
I've written a Ghidra Python script for demangling Rust symbols: https://gist.github.com/str4d/e541f4c28e2bca80d222434ac1a204f4
Currently it only supports the "legacy" mangling format, but that's still the default format as RFC 2603 hasn't yet been stabilised: https://github.com/rust-lang/rust/issues/60705
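For readers unfamiliar with the legacy format, here is a minimal standalone sketch of the idea in plain Python 3. It is not the script from the gist above: `demangle_legacy` and its helpers are hypothetical names, the escape table is only a subset, and suffixes after the closing `E` (such as `.llvm.` markers) are not handled. It assumes the symbol has the form `_ZN`, a run of length-prefixed segments, an optional trailing hash segment (`h` plus 16 hex digits), and a closing `E`.

```python
import re

# Subset of the escape sequences used by the legacy scheme; not exhaustive.
_ESCAPES = {
    "$SP$": "@", "$BP$": "*", "$RF$": "&", "$LT$": "<",
    "$GT$": ">", "$LP$": "(", "$RP$": ")", "$C$": ",",
}
_HASH_RE = re.compile(r"h[0-9a-f]{16}")
_UNICODE_RE = re.compile(r"\$u([0-9a-f]{1,6})\$")


def _unescape(segment):
    """Undo the escapes applied to a single path segment."""
    # A segment that starts with an escape carries an extra leading underscore.
    if segment.startswith("_$"):
        segment = segment[1:]
    for escaped, plain in _ESCAPES.items():
        segment = segment.replace(escaped, plain)
    # $uXX$ encodes a code point in hex, e.g. $u20$ is a space.
    segment = _UNICODE_RE.sub(lambda m: chr(int(m.group(1), 16)), segment)
    return segment.replace("..", "::")


def demangle_legacy(symbol):
    """Return the demangled path, or None if the symbol does not parse."""
    if not symbol.startswith("_ZN") or not symbol.endswith("E"):
        return None
    body = symbol[3:-1]
    segments = []
    pos = 0
    while pos < len(body):
        match = re.match(r"\d+", body[pos:])
        if match is None:
            return None
        length = int(match.group())
        pos += match.end()
        segment = body[pos:pos + length]
        if len(segment) != length:
            return None  # declared length runs past the end of the symbol
        segments.append(segment)
        pos += length
    # Drop the trailing hash segment so only the readable path remains.
    if segments and _HASH_RE.fullmatch(segments[-1]):
        segments.pop()
    return "::".join(_unescape(s) for s in segments)
```

In a Ghidra script, the result would then be used to rename the function; that step is left out here because it depends on the Ghidra API.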
@str4d great work!
It would also be nice to handle Rust strings somehow. Rust doesn't use NUL-terminated strings like C (in fact, it allows arbitrary NUL bytes inside a string). Rust strings are represented as a pointer plus a length. The current Rust calling convention passes the two as separate arguments according to the native C calling convention, so a heuristic could check whether one argument is a pointer to something that looks like a string and the next argument looks like a length. As a sanity check, it could then verify that the resulting string is valid UTF-8, since non-UTF-8 strings are UB in Rust.
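The pointer-plus-length check described above could be sketched roughly as follows. This is plain Python 3 rather than Ghidra API code: `read_rust_str` is a hypothetical helper, `memory` is a `bytes` object standing in for a contiguous memory image whose first byte lives at address `base`, and the `max_len` cutoff is an arbitrary assumption to keep large integers from being mistaken for lengths.

```python
def read_rust_str(memory, base, ptr, length, max_len=4096):
    """Return the decoded text if (ptr, length) plausibly describes a Rust
    string slice inside `memory`, otherwise None."""
    # Empty or implausibly long candidates carry no useful signal.
    if length <= 0 or length > max_len:
        return None
    start = ptr - base
    end = start + length
    # The whole slice must lie inside the memory image.
    if start < 0 or end > len(memory):
        return None
    try:
        # Interior NUL bytes are allowed; only invalid UTF-8 is rejected.
        return memory[start:end].decode("utf-8")
    except UnicodeDecodeError:
        return None
```

A caller would try this on each pair of adjacent arguments and keep only the pairs that return text. Short ASCII runs will produce false positives, so the result is best treated as a hint rather than a definite string.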
Update: my script now supports most of the v0 mangling format. The main things missing are const generics (which are blocked on a pending change to the RFC) and decompression of backrefs (which I'll add next).
It appears that demangling Rust symbols is supported in Ghidra 11.0, for both the legacy and v0 formats: https://github.com/NationalSecurityAgency/ghidra/tree/fa8eff4d33f6323bcd8e26615d5a362046bb2756/Ghidra/Features/Base/src/main/java/ghidra/app/plugin/core/analysis/rust/demangler
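Since two mangling formats coexist, a script may first need to tell them apart. The sketch below is an illustration in plain Python 3, not Ghidra's implementation: `mangling_scheme` is a hypothetical name. It assumes v0 symbols start with `_R` followed by an uppercase letter, and that legacy symbols start with `_ZN` and end with a `17h` hash segment followed by `E`. The hash check matters because the `_ZN` prefix is shared with Itanium C++ mangling.

```python
import re

# One optional extra leading underscore is tolerated, as seen on some platforms.
_V0_RE = re.compile(r"^_?_R[A-Z]")
_LEGACY_RE = re.compile(r"^_?_ZN\d.*17h[0-9a-f]{16}E$")


def mangling_scheme(symbol):
    """Return "v0", "legacy", or None if the symbol does not look like Rust."""
    if _V0_RE.match(symbol):
        return "v0"
    if _LEGACY_RE.match(symbol):
        return "legacy"
    return None
```

Symbols that carry extra suffixes after the closing `E`, or legacy symbols emitted without a hash, would be missed by this check.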