Incorrect PCode for x86 pop Instructions with Stack Pointer Operands

Open sjcappella opened this issue 2 years ago • 0 comments

Describe the bug

The PCode generated for the x86 pop instruction when the stack pointer, rsp, is used as an operand (both as a register operand and as the base register for addressing a destination operand in memory) does not reflect the behavior of actual hardware.

pop rsp - The Intel manual states:

The POP ESP instruction increments the stack pointer (ESP) before data at the old top of stack is written into the destination.

Presumably, with rsp being the destination, the value at the top of the stack before the increment should end up in rsp. The PCode for this instruction instead first loads the value at the address held in rsp into rsp, then increments rsp, so the increment is applied to the popped value. Consider the following instructions:

    push 0x11111111
    pop rsp

Ghidra will disassemble and lift these instructions to the following PCode:

        00401000 68 11 11 11 11      PUSH       0x11111111
                                                           $U41580:8 = COPY 0x11111111:8
                                                           RSP = INT_SUB RSP, 8:8
                                                           STORE ram(RSP), $U41580:8

        00401005 5c                         POP        RSP=>local_8
                                                           RSP = LOAD ram(RSP)
                                                           RSP = INT_ADD RSP, 8:8

When this PCode is emulated, rsp ends up as 0x11111119 after the pop rsp (the popped value plus 8), instead of 0x11111111 as on real hardware.
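To make the difference concrete, here is a minimal Python sketch of the two orderings. It is only a toy model, not Ghidra's or any emulator's actual code: the `state` and `mem` dictionaries, the starting stack address, and all function names are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1

def push(state, mem, value):
    # RSP = INT_SUB RSP, 8 ; STORE ram(RSP), value
    state["RSP"] = (state["RSP"] - 8) & MASK64
    mem[state["RSP"]] = value & MASK64

def pop_rsp_as_lifted(state, mem):
    # Order in the PCode above: load into RSP, then add 8 to RSP.
    state["RSP"] = mem[state["RSP"]]
    state["RSP"] = (state["RSP"] + 8) & MASK64

def pop_rsp_per_manual(state, mem):
    # Order described by the Intel manual: read the old top of stack,
    # increment RSP, then write the value into the destination (RSP).
    value = mem[state["RSP"]]
    state["RSP"] = (state["RSP"] + 8) & MASK64
    state["RSP"] = value

def run(pop_impl):
    state = {"RSP": 0x7FFF0000}  # arbitrary starting stack pointer
    mem = {}                     # sparse memory: address -> 64-bit value
    push(state, mem, 0x11111111)
    pop_impl(state, mem)
    return state["RSP"]

lifted_rsp = run(pop_rsp_as_lifted)
manual_rsp = run(pop_rsp_per_manual)
print(hex(lifted_rsp), hex(manual_rsp))
```

Under these assumptions the first ordering leaves rsp at the popped value plus 8, while the second leaves it at the popped value itself.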

pop [rsp] - The Intel manual states:

If the ESP register is used as a base register for addressing a destination operand in memory, the POP instruction computes the effective address of the operand after it increments the ESP register.

This should essentially copy the value at the old top of the stack to the address held in the new (incremented) stack pointer. Consider the following instructions:

    push 0x11111111
    pop [rsp]

Ghidra will disassemble and lift these instructions to the following PCode:

        00401000 68 11 11 11 11      PUSH       0x11111111
                                                           $U41580:8 = COPY 0x11111111:8
                                                           RSP = INT_SUB RSP, 8:8
                                                           STORE ram(RSP), $U41580:8
        00401005 8f 04 24                POP        qword ptr [RSP]
                                                           $Uc000:8 = LOAD ram(RSP)
                                                           STORE ram(RSP), $Uc000:8
                                                           RSP = INT_ADD RSP, 8:8

If we emulate these PCode instructions, the value is loaded from and stored back to the old top of the stack before rsp is incremented, so the new rsp does not point to a copy of the old stack value (as it does on real hardware).
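The same kind of toy model shows the memory-operand case. Again this is a hedged sketch rather than real emulator code: the sparse `mem` dictionary, the sentinel value placed above the pushed slot, and the function names are all hypothetical.

```python
MASK64 = (1 << 64) - 1
SENTINEL = 0xDEADBEEF  # made-up value sitting above the pushed slot

def push(state, mem, value):
    state["RSP"] = (state["RSP"] - 8) & MASK64
    mem[state["RSP"]] = value & MASK64

def pop_mem_rsp_as_lifted(state, mem):
    # Order in the PCode above: load, store to ram(RSP), then add 8.
    value = mem[state["RSP"]]
    mem[state["RSP"]] = value
    state["RSP"] = (state["RSP"] + 8) & MASK64

def pop_mem_rsp_per_manual(state, mem):
    # Order described by the Intel manual: the effective address of the
    # destination is computed after RSP has been incremented.
    value = mem[state["RSP"]]
    state["RSP"] = (state["RSP"] + 8) & MASK64
    mem[state["RSP"]] = value

def run(pop_impl):
    state = {"RSP": 0x7FFF0000}
    mem = {0x7FFF0000: SENTINEL}
    push(state, mem, 0x11111111)
    pop_impl(state, mem)
    # Return the final stack pointer and the value it points to.
    return state["RSP"], mem[state["RSP"]]

lifted_result = run(pop_mem_rsp_as_lifted)
manual_result = run(pop_mem_rsp_per_manual)
print([hex(v) for v in lifted_result], [hex(v) for v in manual_result])
```

Both orderings finish with the same rsp in this model; they differ only in what is stored at the address rsp points to afterwards.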

Expected behavior

Using sp/esp/rsp as the register operand of pop, or as the base register of a memory destination operand, should generate PCode that reflects the real behavior of the instructions.

Environment (please complete the following information):

  • OS: Ubuntu 20.04
  • Java Version: 11.0.7
  • Ghidra Version: 10.1.4
  • Ghidra Origin: Official release from GitHub

I am happy to discuss the issue more. I found this issue while testing my custom PCode emulator. I single step every instruction in my emulator side by side with Unicorn engine and compare every register post step. I found this discrepancy while emulating QEMU's i386 test suite.
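The per-step comparison can be sketched roughly as follows. This is not my actual harness and it does not call Unicorn; the register snapshots are hypothetical dictionaries standing in for whatever each emulator reports after a single step, and `diff_registers` is a name made up for this example.

```python
def diff_registers(reference, candidate):
    """Return {name: (reference_value, candidate_value)} for each mismatch."""
    return {
        name: (reference[name], candidate.get(name))
        for name in reference
        if reference[name] != candidate.get(name)
    }

# Hypothetical snapshots taken after stepping over `pop rsp`.
reference_regs = {"RSP": 0x11111111, "RAX": 0x0}
pcode_regs = {"RSP": 0x11111119, "RAX": 0x0}

mismatches = diff_registers(reference_regs, pcode_regs)
for name, (want, got) in mismatches.items():
    print(f"{name}: reference={want:#x} pcode={got:#x}")
```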

sjcappella, May 25 '22 01:05