ghidra
ghidra copied to clipboard
Type propagation through INDIRECT ops from call clobbers leads to wrong type inference
The decompiler's type propagation algorithm flows through INDIRECT pcode ops. For INDIRECT ops that represent call-clobbered registers, this can lead to incorrect types being inferred, even after the correct type was previously inferred for a varnode.
I originally ran into this analyzing a very large function in a FreeBSD kernel module, but managed to reduce it.
C source
typedef struct {
int foo;
} right_t;
typedef struct {
int bar;
int baz;
} wrong_t;
typedef struct {
right_t *right;
wrong_t *wrong;
} both_t;
int do_right(right_t *right);
void do_wrong(wrong_t *wrong);
int func(both_t *both) {
right_t *right = both->right; // RBX
wrong_t *wrong = both->wrong; // RBP
// MOV RDI, RBX
// CALL do_right
int ret = do_right(right);
// RCX INDIRECT RCX, <above CALL>
// RSI INDIRECT RSI, <above CALL>
// RDI INDIRECT RDI, <above CALL>
// ...
// Conditionally use the clobbered RDI, to prevent the INDIRECT appearing dead.
// In the unreduced test case, Ghidra thought RDI was live due to a partial overwrite + spill.
asm (
"test %1, %1\n\t"
"jnz %=f\n\t"
"mov %2, %0\n"
"%=:"
: "=D" (wrong)
: "r" (ret), "r" (wrong)
);
// Makes Ghidra infer wrong_t * for RDI
do_wrong(wrong);
return right->foo;
}
Disassembly
**************************************************************
* FUNCTION *
**************************************************************
int __stdcall func(both_t * both)
int EAX:4 <RETURN>
both_t * RDI:8 both
func XREF[4]: Entry Point(*), main:001011ac(c),
00102030, 001020c8(*)
0010113d 55 PUSH RBP
0010113e 53 PUSH RBX
0010113f 48 83 ec 08 SUB RSP,0x8
00101143 48 8b 1f MOV RBX,qword ptr [both->right]
00101146 48 8b 6f 08 MOV RBP,qword ptr [RDI + both->wrong]
0010114a 48 89 df MOV both,RBX
0010114d e8 e7 ff CALL do_right int do_right(right_t * right)
ff ff
00101152 85 c0 TEST EAX,EAX
00101154 75 03 JNZ LAB_00101159
00101156 48 89 ef MOV both,RBP
LAB_00101159 XREF[1]: 00101154(j)
00101159 e8 de ff CALL do_wrong void do_wrong(wrong_t * wrong)
ff ff
0010115e 8b 03 MOV EAX,dword ptr [RBX]
00101160 48 83 c4 08 ADD RSP,0x8
00101164 5b POP RBX
00101165 5d POP RBP
00101166 c3 RET
Decompilation
int func(both_t *both)
{
wrong_t *right;
wrong_t *pwVar1;
int iVar2;
wrong_t *wrong;
right = (wrong_t *)both->right;
pwVar1 = both->wrong;
wrong = right;
iVar2 = do_right((right_t *)right);
if (iVar2 == 0) {
wrong = pwVar1;
}
do_wrong(wrong);
return right->bar;
}
It ends up with wrong_t *right;
, even though we have DWARF info saying that the type is right_t *
. Turning on TYPEPROP_DEBUG
shows what's happening:
Debug log
Decompiling func
Type propagation pass - 0
#0x0:4 : int init
#0x0 : undefined8 init
#0x8 : long init
#0x55e52cfa2830 : undefined8 init
#0x55e52cfa2830 : undefined8 init
#0x55e52cfa2830 : undefined8 init
u0x00003100(0x00101146:11) : long init
u0x0000bf00:4(0x0010115e:27) : undefined4 init
u0x0000c000(0x00101143:f) : right_t * init
u0x0000c000(0x00101146:12) : undefined8 init
EAX(0x0010114d:17) : int init
EAX(0x00101166:47) : int init
RDI(i) : both_t * init
RDI(0x0010114d:4e) : undefined8 init
RDI(0x00101159:50) : wrong_t * init
ZF(0x00101152:1c) : bool init
RDI(0x0010114d:4e) : right_t * from RDI(0x0010114d:4e) = u0x0000c000(0x00101143:f) [] i0x0010114d:17(free) slot=0
u0x0000bf00:4(0x0010115e:27) : right_t from u0x0000bf00:4(0x0010115e:27) = *(ram,u0x0000c000(0x00101143:f)) slot=1
u0x00003100(0x00101146:11) : undefined8 * from u0x0000c000(0x00101146:12) = *(ram,u0x00003100(0x00101146:11)) slot=-1
u0x00003100(0x00101146:11) : wrong_t * *+8[both_t] from u0x00003100(0x00101146:11) = RDI(i) + #0x8 slot=0
u0x0000c000(0x00101146:12) : wrong_t * from u0x0000c000(0x00101146:12) = *(ram,u0x00003100(0x00101146:11)) slot=1
RDI(0x0010114d:4e) : wrong_t * from RDI(0x00101159:50) = RDI(0x0010114d:4e) ? u0x0000c000(0x00101146:12) slot=-1
u0x0000c000(0x00101143:f) : wrong_t * from RDI(0x0010114d:4e) = u0x0000c000(0x00101143:f) [] i0x0010114d:17(free) slot=-1
...
u0x0000c000(0x00101143:f)
is right_t *right;
. You can see it initially gets assigned the right type. Then later on RDI
gets typed as wrong_t *
, and that gets fed back through the INDIRECT.
The following patch seems to fix it. I'm not sure if it's the best way though. Let me know if I should PR it or if it should be solved another way.
diff --git a/Ghidra/Features/Decompiler/src/decompile/cpp/heritage.cc b/Ghidra/Features/Decompiler/src/decompile/cpp/heritage.cc
index c0347663e..d3fe8206a 100644
--- a/Ghidra/Features/Decompiler/src/decompile/cpp/heritage.cc
+++ b/Ghidra/Features/Decompiler/src/decompile/cpp/heritage.cc
@@ -1313,6 +1313,7 @@ void Heritage::guardCalls(uint4 fl,const Address &addr,int4 size,vector<Varnode
// We do not guard the call if the effect is "unaffected" or "reload"
if ((effecttype == EffectRecord::unknown_effect)||(effecttype == EffectRecord::return_address)) {
indop = fd->newIndirectOp(fc->getOp(),addr,size,0);
+ indop->setStopTypePropagation();
indop->getIn(0)->setActiveHeritage();
indop->getOut()->setActiveHeritage();
write.push_back(indop->getOut());
@@ -1323,6 +1324,7 @@ void Heritage::guardCalls(uint4 fl,const Address &addr,int4 size,vector<Varnode
}
else if (effecttype == EffectRecord::killedbycall) {
indop = fd->newIndirectCreation(fc->getOp(),addr,size,possibleoutput);
+ indop->setStopTypePropagation();
indop->getOut()->setActiveHeritage();
write.push_back(indop->getOut());
}
Here's the XML debug log of the reproducer: typebug.zip
Not sure I'm following this example. The value written to RDI via 10114a can be reread as a parameter for the call to do_wrong at 101159 if the branch at 101154 is taken. Or to say it another way, the decompiler sees a variable that is written as a pointer to right_t but is read as a pointer to wrong_t. From the decompiler's point of view, either label for the variable is valid.
The INDIRECT p-code op frequently propagates important data-type information, so it shouldn't be blocked in general.
Not sure I'm following this example. The value written to RDI via 10114a can be reread as a parameter for the call to do_wrong at 101159 if the branch at 101154 is taken.
Not really. The value written to RDI via 10114a is clobbered by the CALL at 10114d. If the branch at 101154 is taken then it passes garbage as a parameter to do_wrong(), but that garbage has nothing to do with RDI from 10114a.
The original, unreduced test case doesn't actually pass call-clobbered garbage to another function, but the data flow was complicated enough that Ghidra couldn't tell the garbage value was dead. If you're curious, it was something like
MOV DIL, 4
PUSH RDI
...
MOVZBL ECX, byte ptr [RSP]
The INDIRECT p-code op frequently propagates important data-type information, so it shouldn't be blocked in general.
Yeah I agree, but I think it's probably safe to do it only for call clobbers, which is what my proposed patch does (I think).