ghidra
ghidra copied to clipboard
x86 66h operand size override interfering with decompilers output
> Fixed by [a04f7fb](https://github.com/NationalSecurityAgency/ghidra/commit/a04f7fbb03e4e38c3f57228885a8387f8198e85c)
Just looked at the new decompilation provided by this fix, and although IT FIXES the first example in this ticket, it doesn't help with the second. In this decompilation, the last 3 lines
iVar1 = (int)g_pMemStk_9684.m_pvData + g_pMemStk_9684.m_nAvailable;
g_pMemStk_9684.m_nAvailable = g_pMemStk_9684.m_nAvailable + nReqSize;
return (LPVOID)CONCAT22(g_pMemStk_9684.m_pvData._2_2_,iVar1);
become
iVar2 = (int)g_pMemStk_9684.m_pvData + g_pMemStk_9684.m_nUsed;
uVar1 = (ulong)g_pMemStk_9684._4_4_ & 0xffff;
g_pMemStk_9684._4_4_ = (LPSTR)(uVar1 | (ulong)(g_pMemStk_9684.m_nUsed + nReqSize) << 0x10);
return (BYTE *)CONCAT22((int)uVar1,iVar2);
which is incorrect!
Here is the zipped xml for the function MemStk::Alloc(...)
.
Originally posted by @Wall-AF in https://github.com/NationalSecurityAgency/ghidra/issues/4533#issuecomment-1356246939
I believe this to be a combination of issues: i) relating to Ghidra's current inability to correctly fully understand far
pointers (ie being able to combine two addresses/registers/(regiser, address combo) and then treat that as a pointer); and ii) the understanding to ignore the high word after loading a 32-bit register followed by using only the 16-bit part (maybe!!).
I believe 19a63531c3e6d0062fe788b0e4151d33a4a88c18 fixes the problem you are seeing.
@caheckman Just double-checked and the result now is:
iVar1 = (int)g_pMemStk_9684.m_pvData + g_pMemStk_9684.m_nUsed;
g_pMemStk_9684.m_nUsed = g_pMemStk_9684.m_nUsed + nReqSize;
return (BYTE *)CONCAT22(g_pMemStk_9684.m_pvData._2_2_,iVar1);
better, but the sized pointer is still not fixed.
How do we gt Ghidra to produce output like:
g_pMemStk_9684.m_nUsed = g_pMemStk_9684.m_nUsed + nReqSize;
return &g_pMemStk_9684.m_pvData[g_pMemStk_9684.m_nUsed];
I'm getting strange results with this too: Assembly:
DgnPicTable_t*32 __cdecl16far FUN_10f0_169a(void)
assume DS = 0x1210
DgnPicTable_t DX:2,AX:2 <RETURN>
FUN_10f0_169a XREF[1]: FUN_1160_0341:1160:0342(c)
10f0:169a 66 8b 16 0 MOV EDX,dword ptr DS:[g_pPicTable_96ca+2]
cc 96
10f0:169f a1 ca 96 0 MOV AX,DS:[g_pPicTable_96ca] = 0000:0000
10f0:16a2 cb 0 RETF
Decompiles to:
/* WARNING: Globals starting with '_' overlap smaller symbols at the same address */
DgnPicTable_t * __cdecl16far FUN_10f0_169a(void)
{
return (DgnPicTable_t *)CONCAT22((int)ram0x121096cc,g_pPicTable_96ca._0_2_);
}
The code bytes are:
66 8b 16 cc 96 a1 ca 96 cb
If I replace the first byte (66
) with NOP
(90
) I get this:
DgnPicTable_t * __cdecl16far FUN_10f0_169a(void)
{
return (DgnPicTable_t *)CONCAT22(g_pPicTable_96ca._2_2_,g_pPicTable_96ca._0_2_);
}
better, but still issues with far
pointer addresses (a different problem!).