
Demangle RTTI class names

Open stevemk14ebr opened this issue 6 years ago • 6 comments

RTTI type descriptor names that start with .?AV or .?AU (class/struct) are not demangled. A workaround is to strip the RTTI prefix and rewrite the name as a C++ constructor symbol, which existing demanglers accept.

Example: .?AVCNetMidLayer@@ -> ??0CNetMidLayer@@QAE@XZ

From: https://reverseengineering.stackexchange.com/questions/20516/how-can-i-demangle-the-name-in-an-rtti-type-descriptor and https://github.com/REhints/HexRaysCodeXplorer/blob/5be89aa1d32eeaefb099b838ee5622200eb8a2e9/src/HexRaysCodeXplorer/ObjectExplorer.h#L80
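
The rewrite in the example can be sketched in a few lines of Python. This is only an illustration of the workaround, not code from demumble or from the linked sources: the function name `rtti_name_to_ctor_symbol` is hypothetical, and the sketch assumes that only the `.?AV` and `.?AU` prefixes are handled. It swaps the RTTI prefix for `??0` and appends `QAE@XZ`, which is exactly the transformation shown in the example.

```python
def rtti_name_to_ctor_symbol(rtti_name):
    """Rewrite an RTTI type descriptor name into a fake constructor symbol.

    Hypothetical helper for illustration: only the .?AV and .?AU prefixes
    from the report are handled; anything else returns None.
    """
    for prefix in (".?AV", ".?AU"):
        if rtti_name.startswith(prefix):
            # "??0" + name + "QAE@XZ" is the shape used in the example above.
            return "??0" + rtti_name[len(prefix):] + "QAE@XZ"
    return None


print(rtti_name_to_ctor_symbol(".?AVCNetMidLayer@@"))
```

One drawback of this trick is that the class/struct distinction carried by the `V`/`U` letter is lost in the rewritten symbol.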

stevemk14ebr avatar Sep 16 '19 14:09 stevemk14ebr

Thanks for the report.

I think the .?AV... bit is written by mangleCXXRTTIName() (currently at http://llvm-cs.pcc.me.uk/tools/clang/lib/AST/MicrosoftMangle.cpp#3129), which mangles the contents of the RTTI descriptor. In other words, it's not really a symbol, but data stored in the symbol whose name is computed by mangleCXXRTTI(), a bit further down in that file.

undname also can't demangle this (but there's no reason not to do better, of course):

C:\src\demumble>undname .?AVCNetMidLayer@@
Microsoft (R) C++ Name Undecorator
Copyright (C) Microsoft Corporation. All rights reserved.

Undecoration of :- ".?AVCNetMidLayer@@"
is :- ".?AVCNetMidLayer@@"

For Itanium symbols, we're able to demangle both the name of the struct and its contents:

C:\src\llvm-mono>..\demumble\demumble.exe _ZTI1H
typeinfo for H

C:\src\llvm-mono>..\demumble\demumble.exe _ZTSFviE
typeinfo name for void (int)

This suggests we should support this for the Microsoft ABI too.

nico avatar Sep 17 '19 14:09 nico

One problem is that this makes it a bit hard to detect a mangled string. At the moment, we can look for "?" as a prefix on Win and for "_Z" on Itanium.

With this, the prefix on Win can be ".?..." for a tag type (typeid(MyClass)), ".$$B..." for an array type (typeid(MyClass[4])), or ".N" (and other built-in type codes) for a built-in type (typeid(double)). We basically have to look for a mangled type after every period in the input.
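
To make the detection problem concrete, here is a rough Python sketch of a prefix-based classifier. It is not demumble's logic: the function name `classify`, the category labels, and the table of built-in codes are assumptions made for illustration. Only `N` (double) comes from this thread; the other codes are a small subset of the Microsoft scheme added for the example.

```python
# Illustrative subset of built-in type codes. Only "N" is mentioned in the
# thread; the other entries are an assumption added for this sketch.
BUILTIN_CODES = {"D": "char", "H": "int", "M": "float", "N": "double", "X": "void"}


def classify(s):
    """Guess which kind of mangled name s might be, looking only at its prefix."""
    if s.startswith("_Z"):
        return "itanium"
    if s.startswith("?"):
        return "ms-symbol"
    if s.startswith(".?"):
        return "ms-rtti-tag"
    if s.startswith(".$$B"):
        return "ms-rtti-array"
    if len(s) >= 2 and s[0] == "." and s[1] in BUILTIN_CODES:
        return "ms-rtti-builtin"
    return None


for text in (".?AVCNetMidLayer@@", ".N", "_ZTI1H", ".Net"):
    print(text, "->", classify(text))
```

The last input shows the weakness: an ordinary token such as `.Net` passes the prefix check as well, so a prefix test alone produces many false candidates once a period can start a mangled name.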

nico avatar Sep 17 '19 15:09 nico

With ad8745b2219 (not production quality) applied locally:

$ buildmac/demumble ".?AVCNetMidLayer@@"
typeinfo name for class CNetMidLayer

nico avatar Sep 20 '19 03:09 nico

Upstream bit: https://reviews.llvm.org/D67851

nico avatar Sep 20 '19 18:09 nico

Thank you for working on this

stevemk14ebr avatar Sep 20 '19 21:09 stevemk14ebr

Trunk now demangles RTTI descriptor names when you pass them directly:

$ ./demumble .?AVCNetMidLayer@@
class CNetMidLayer `RTTI Type Descriptor Name'

Adding it in streaming mode (echo .?AVCNetMidLayer@@ | ./demumble) is a bit tricky to do since . is such a common character. If I do what's in ad8745b221916, then echo ._Z1fv | ./demumble goes from .f() to ._Z1fv: the . now triggers an MS demangling attempt (the characters in "_Z1fv" are all valid MS mangling chars), and on demangling failure demumble currently prints the whole candidate string and advances past it.

I could make it so that on demangling failure, we consume just one char instead or something.

Not supporting this in streaming mode at all isn't super unreasonable either imho, since that's what we do for Itanium type manglings ("Pi").

But eventually I'll probably want to do the smarter backtracking -- it should fire rarely enough that it shouldn't affect perf much.
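
The "consume just one char on failure" idea can be illustrated with a toy scanner in Python. This is a sketch under stated assumptions, not demumble's implementation: the `KNOWN` table stands in for a real demangler and contains only examples from this thread, and `is_mangle_char`, `scan`, and the `consume_one_on_failure` flag are hypothetical names.

```python
# Toy stand-in for a real demangler: a lookup of manglings from this thread.
KNOWN = {
    "_Z1fv": "f()",
    ".?AVCNetMidLayer@@": "class CNetMidLayer `RTTI Type Descriptor Name'",
}


def is_mangle_char(c):
    # Assumed character class for this sketch, not demumble's exact set.
    return c.isalnum() or c in "_?@$"


def scan(text, consume_one_on_failure):
    out = []
    i = 0
    while i < len(text):
        # A ".", a "?", or "_Z" starts a candidate mangled name.
        if text[i] in ".?" or text.startswith("_Z", i):
            j = i + 1
            while j < len(text) and is_mangle_char(text[j]):
                j += 1
            candidate = text[i:j]
            demangled = KNOWN.get(candidate)
            if demangled is not None:
                out.append(demangled)
                i = j
                continue
            if not consume_one_on_failure:
                # Behavior described above: print the whole candidate, advance.
                out.append(candidate)
                i = j
                continue
        # Either not a candidate start, or a failed candidate with
        # backtracking enabled: emit a single char and retry at the next one.
        out.append(text[i])
        i += 1
    return "".join(out)


print(scan("._Z1fv", consume_one_on_failure=False))
print(scan("._Z1fv", consume_one_on_failure=True))
```

With the flag off, the failed `._Z1fv` candidate swallows the Itanium name; with it on, only the `.` is consumed and `_Z1fv` gets its own attempt. The cost is that every failed candidate is rescanned from its second character, which is why it matters how rarely the fallback fires.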

nico avatar Sep 23 '19 13:09 nico