ghidra
ghidra copied to clipboard
(Proof of concept) Width 4 boolean
It is not uncommon for a program to clear the high bits of registers for booleans. For example, the following are (separate instructions) from a MIPS (32 bit, big endian) binary which precede a return, illustrating different ways the function returns true and false
Callers can access the full register and treat it as a boolean value, which the decompiler does not know about.
This leads to decompilation like:
This PR introduces a bool4 (Bool4DataType extends BooleanDataType) which tries to behave as much like a width4 boolean as possible.
After: (changing return type to bool4)
It doesn't decompile better in the base function and might not completely capture the meaning of a "width 4 boolean" , especially where the decompiler is concerned (for example, the if (bVar3 != false) {...} line); This PR is intended to be a proof of concept rather than used as is
I dabbled with modifying the decompiler (Varnode::isBooleanValue, ActionFuncLink::funcLinkOutput and FuncCallSpecs::commitNewOutputs) but with little success. For example, I removed the following check in ActionFuncLink::funcLinkOutput
ProtoParameter *outparam = fc->getOutput();
Datatype *outtype = outparam->getType();
if (outtype->getMetatype() != TYPE_VOID) {
int4 sz = outparam->getSize();
if (sz == 1 && outtype->getMetatype() == TYPE_BOOL && data.isTypeRecoveryOn())
data.opMarkCalculatedBool(callop);
ProtoParameter *outparam = fc->getOutput();
Datatype *outtype = outparam->getType();
if (outtype->getMetatype() != TYPE_VOID) {
int4 sz = outparam->getSize();
if (outtype->getMetatype() == TYPE_BOOL && data.isTypeRecoveryOn())
data.opMarkCalculatedBool(callop);
but obtained results like:
bool4 bVar3;
...
bVar3 = bar(...);
...
if ((BADTYPE) bVar3)) {...}
Try adding <aggressivetrim signext="true"/> to the mips32be.cspec (as seen in mips64_32_n32.cspec) and then adding extension="sign"/> to the v0 output pentry. I think this will work assuming a bool is still treated as a number. This problem exists in almost every compiler spec (specifically bools) I've seen. It's either that compilers are inconsistent about it or it's the old stupid #define TRUE 1 #define FALSE 0
This should not be done by adding a larger datatype. It is not uncommon for smaller types (e.g., short) to move in/out of larger registers. We will look into the decompiler's handling of this situation, but we cannot accept this PR. The concern should be entered as an Issue and not a PR. The datatype size is a function of the ABI/Compiler and not the register size.
OK, noted. I will give the aggressivetrim suggestion a go too. I have some ideas for a different datatype, specifically, an int4 that displays like a string (useful for functions like the Windows ExAllocatePoolWithTag API) but i'll leave that to a different PR, if that sounds viable
We do have instance settings that can be applied which will influence rendering within the listing - although decompiler will not respect. There are also a selective convert actions within the listing and decompiler that than can be used to render a numeric constant in different formats (although there appears to be a disrepency in how the char sequence case is handled).
I would try to avoid adding a datatype ince you are likely to run into issues with the decompiler. Trying to have it behave as an integer and render as a string. In the end all primitive types are conveyed to the decompiler as a core metatype.