panda icon indicating copy to clipboard operation
panda copied to clipboard

x86: cmpxchg instruction does not trigger memory callbacks

Open pcworld opened this issue 3 years ago • 6 comments

cmpxchg instructions on x86 (at least cmpxchg16b on x86_64) do not trigger mem_before_write callbacks.

This can be reproduced as follows:

mkdir iso
curl -o iso/cmpxchg.cpp https://gist.githubusercontent.com/pcworld/7d54d59c57bdedf4415c5f5bc62d8662/raw/936e46d33e6262a3d24e7cf3e33923dde2437ac5/cmpxchg.cpp

Then run the following PyPanda script (needs #1118 applied for working virtual memory hook):

#!/usr/bin/env python3
from pandare import Panda
panda = Panda('x86_64', generic='x86_64')

@panda.queue_blocking
def run_commands():
    panda.revert_sync('root')
    panda.copy_to_guest('./iso')
    print(panda.run_serial_cmd('g++ ./iso/cmpxchg.cpp'))
    test_var_address = 0x7fffffffeab0  # this works with the current generic='x86_64' image, may need to be adjusted in the future (the C++ program outputs the relevant address)
    @panda.hook_virt_mem_write(start_address=0x7ffffffde000, end_address=0x7ffffffff000)  # stack
    def write(cpu_state, memory_access_desc):
        if test_var_address in range(memory_access_desc.addr, memory_access_desc.addr + memory_access_desc.size):
            print(f'write from {cpu_state.panda_guest_pc:#x} to {memory_access_desc.addr:#x} value {bytes(memory_access_desc.buf[0:memory_access_desc.size])!r}')
    print('Results: ', panda.run_serial_cmd('./a.out'))
    panda.end_analysis()


panda.enable_precise_pc()
panda.enable_memcb()
panda.run()

Relevant part of the output is:

write from 0x7ffff7dedc50 to 0x7fffffffeab0 value b'\x00\x10\x00\x00\x00\x00\x00\x00'
write from 0x7ffff7def660 to 0x7fffffffeab0 value b'Auth\x00\x00\x00\x00'
write from 0x7ffff7df0f10 to 0x7fffffffeab0 value b'@@UUUU\x00\x00'
write from 0x7ffff7dd7660 to 0x7fffffffeab0 value b'@@UUUU\x00\x00'
write from 0x7ffff7de00be to 0x7fffffffeab0 value b'\xd8\xeb\xff\xff\xff\x7f\x00\x00'
write from 0x7ffff7de00be to 0x7fffffffeab0 value b'\xd8\xeb\xff\xff\xff\x7f\x00\x00'
write from 0x5555555546a0 to 0x7fffffffeab0 value b'\x01\x00\x00\x00\x00\x00\x00\x00'
write from 0x55555555461e to 0x7fffffffeab0 value b'\x01\x00\x00\x00\x00\x00\x00\x00'
write from 0x5555555546c6 to 0x7fffffffeab0 value b'\xad\xfb\xca\xde\x00\x00\x00\x00'
Results:  test has address: 0x7fffffffeab0
test has value: decafbad
test has value: 55555555

As you can see, the test variable from the C++ program is updated to 0x55555555, however there is not any write callback invoked for it. The last write callback is invoked for filling the variable with 0xdecafbad which is from before cmpxchg16b is invoked.

I'm not too familiar with QEMU internals. I've seen that in target/i386/mem_helper.c there are helper functions for cmpxchg instructions (maybe PANDA misses hooking these?), though I don't know if that's relevant.

pcworld avatar Oct 27 '21 14:10 pcworld

Copying in discussion from Slack, I think the places we are missing instrumentation:

atomic_template.h line 63 (for cmpxchg{b,w,l,q})

ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
                              ABI_TYPE cmpv, ABI_TYPE newv EXTRA_ARGS)

And target/i386/mem_helper.c lines 54 and 134 (for cmpxchg{8b,16b})

void helper_cmpxchg8b(CPUX86State *env, target_ulong a0)
void helper_cmpxchg16b(CPUX86State *env, target_ulong a0)

moyix avatar Oct 27 '21 17:10 moyix

#1118 has been merged, so that pypanda script should demonstrate the problematic behavior you're trying to demonstrate on head of the dev branch now, for anyone else who is running it

jamcleod avatar Oct 27 '21 18:10 jamcleod

I did some spelunking through the source and various marcos to figure out what's going on here. I believe the relevant code locations are:

target/i386/mem_helper.c: multiple helper function calls into cpu_stq_data_ra/cpu_ldq_data_ra and other load/store functions.

include/exec/cpu_ldst.h: file which sets various parameters and then includes the file to define these load/store ops (see comment https://github.com/panda-re/panda/blob/dev/include/exec/cpu_ldst.h#L24-L41)

include/exec/cpu_ldst_template.h: file which defines the load/store ops depending on the parameters set by cpu_ldst.h.

The functions in cpu_ldst_template.h which define things like cpu_ldq_data_ra seem to be generating TCG code. Maybe the correct design for our mem callbacks would be to trigger them here or to inject a TCG op to trigger them at runtime? But that might cause redundant callbacks if the places that currently do trigger the callbacks then call these functions. I'm also confused if these functions are called when lifting a block of code to TCG or if they're called during execution when the relevant instructions are being executed.

AndrewFasano avatar Oct 29 '21 18:10 AndrewFasano

This issue has gone stale! If you believe it is still a problem, please comment on this issue or it will be closed in 30 days

github-actions[bot] avatar Dec 29 '21 15:12 github-actions[bot]

yes

pcworld avatar Dec 29 '21 18:12 pcworld

I have a draft PR that covers this. I will note that, as seen in the PR, this is a bug that covers more than the cmpxchg function.

lacraig2 avatar Jan 11 '22 18:01 lacraig2