Duplicated data types with GNU Demangler

Open kevinhartman opened this issue 3 years ago • 6 comments

Hi @astrelsky!

Thanks a lot for your work on this project. It's very helpful.

I am somewhat new to Ghidra and RE in general, and I seem to have gotten myself into an odd situation: I have duplicate data types for each class, one recognized by this analyzer and one created by what I'm guessing is the GNU Demangler analyzer.

Perhaps I should have disabled that analysis, and used this one only?

Any tips for deduplicating? It seems that some of the disassembled functions use the Demangler types, while others use those from RTTI.

Thanks!

kevinhartman, Jul 23 '22 05:07

Unfortunately this is a Ghidra problem. I'm actually logging every single time Ghidra returns the wrong class from the method it uses to get the type for a thiscall function. You can open an issue in the Ghidra repo and hope they finally do something about it.

I recommend leaving the demangler analyzer on. I'll see if I can hack up a script to help deal with this but it will end up being very slow because of how the datatype manager handles type replacement.

astrelsky, Jul 23 '22 10:07

Thanks! A script would be helpful.

I'm happy to file a bug on Ghidra as well, but it's not clear to me what's going wrong and where. I assumed the demangler analyzer created its own types (in the Demangler folder in the data type manager) and used those for the parameters of the functions it similarly created.

kevinhartman, Jul 24 '22 16:07

Suppose you have two classes namespaceA::Object and namespaceB::Object. If there is no demangler folder, VariableUtilities.findOrCreateClassStruct will return /namespaceA/Object for both of them in their __thiscall functions because that is the first Object it will find.
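
To illustrate the lookup problem described above, here is a minimal toy model in Python. It is not Ghidra code and does not reproduce the internals of VariableUtilities.findOrCreateClassStruct; the helper names `find_by_name_only` and `find_by_namespace` and the list `TYPE_PATHS` are hypothetical, and the search order is an assumption chosen to match the example.

```python
# Toy model only: a plain list stands in for the data type manager.
# Nothing here is Ghidra API; all names are hypothetical.
from typing import List, Optional

# Paths of the class structures, in the order a search is assumed to visit them.
TYPE_PATHS = ["/namespaceA/Object", "/namespaceB/Object"]


def find_by_name_only(name: str, paths: List[str]) -> Optional[str]:
    """Return the first path whose last component equals `name`."""
    for path in paths:
        if path.rsplit("/", 1)[-1] == name:
            return path
    return None


def find_by_namespace(qualified: str, paths: List[str]) -> Optional[str]:
    """Return the path matching the full namespace, e.g. 'namespaceB::Object'."""
    wanted = "/" + qualified.replace("::", "/")
    return wanted if wanted in paths else None


for cls in ("namespaceA::Object", "namespaceB::Object"):
    simple_name = cls.split("::")[-1]
    # A name-only search cannot tell the two classes apart.
    print(cls, "name-only:", find_by_name_only(simple_name, TYPE_PATHS))
    print(cls, "namespace-aware:", find_by_namespace(cls, TYPE_PATHS))
```

In this model the name-only search resolves both classes to /namespaceA/Object, while a search that takes the namespace into account keeps them separate.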

astrelsky, Jul 24 '22 19:07

Hmm. Would DataTypeManager::replaceDataType work properly as a means to deduplicate? Wondering if it is affected by this bug.

https://ghidra.re/ghidra_docs/api/ghidra/program/model/data/DataTypeManager.html#replaceDataType(ghidra.program.model.data.DataType,ghidra.program.model.data.DataType,boolean)

kevinhartman, Jul 25 '22 13:07

It would work in your case, where the datatype being replaced is trash: the empty placeholder created by the demangler. This is what I would use in a script to fix your situation. However, there is currently nothing that can be done about the case with two different valid class structures.
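
A rough sketch of how the safe replacements could be selected, again as a toy model in Python rather than a real Ghidra script. It assumes that a placeholder lives under a /Demangler category and has no fields, and that a replacement is only safe when exactly one non-empty structure with the same name exists elsewhere. `TypeInfo` and `plan_replacements` are hypothetical names; in an actual script each resulting pair would be handed to DataTypeManager.replaceDataType(DataType, DataType, boolean) from the linked documentation.

```python
# Toy model only: TypeInfo stands in for a Ghidra DataType.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TypeInfo:
    category: str    # e.g. "/Demangler/namespaceA" (assumed layout)
    name: str        # e.g. "Object"
    num_fields: int  # 0 models an empty placeholder (assumption)


def is_placeholder(t: TypeInfo) -> bool:
    """Empty structure under the assumed /Demangler category."""
    return t.category.startswith("/Demangler") and t.num_fields == 0


def plan_replacements(types: List[TypeInfo]) -> List[Tuple[TypeInfo, TypeInfo]]:
    """Pair each placeholder with its unique same-named, non-empty counterpart."""
    plan = []
    for old in types:
        if not is_placeholder(old):
            continue
        candidates = [
            t for t in types
            if t.name == old.name
            and t.num_fields > 0
            and not t.category.startswith("/Demangler")
        ]
        # Ambiguous names (two valid structures) are skipped, not guessed.
        if len(candidates) == 1:
            plan.append((old, candidates[0]))
    return plan


types = [
    TypeInfo("/Demangler", "Widget", 0),
    TypeInfo("/ui", "Widget", 3),
    TypeInfo("/Demangler", "Object", 0),
    TypeInfo("/namespaceA", "Object", 2),
    TypeInfo("/namespaceB", "Object", 5),
]

for old, new in plan_replacements(types):
    print(f"replace {old.category}/{old.name} with {new.category}/{new.name}")
```

Here only Widget is paired; Object is left alone because two valid structures share the name, which is the case that cannot currently be resolved.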

astrelsky, Jul 25 '22 15:07

Got something working (for the placeholder case). It only works if the replacement type has the same name as the type being replaced. That limitation exists only because it simplifies my workflow: I can use a simple dropdown when prompting for the replacement type.

Posting for others. https://gist.github.com/kevinhartman/a4dce3beadfe08dc7fd3fca61fb40567

kevinhartman, Jul 26 '22 06:07