panda icon indicating copy to clipboard operation
panda copied to clipboard

cpuid hypercall not processed

Open vogr opened this issue 2 years ago • 1 comments

Hello,

I am trying to use the hypercall feature of panda (both for custom callbacks and for hypercall callbacks in taint2), but I can't seem to trigger the callback using the cpuid instruction. I tested both x86 and x86_64 targets. I'm using the python interface in the Docker image.

For instance with this python script:

#!/usr/bin/env python3

from pandare import Panda

panda = Panda(generic="i386")
recording_name = "test_recording"

@panda.queue_blocking
def run_cmd():
    panda.record_cmd(
            "./a.out",
            recording_name=recording_name,
            copy_directory="test",
            setup_command="cd /root/test",
            )

    panda.end_analysis()


panda.run()

panda.set_pandalog("pandalog.plog")

panda.load_plugin("taint2")


@panda.cb_guest_hypercall
def hypercall_handler(env):
    panda.arch.dump_regs(env)
    print("pc= 0x{:x}".format(panda.arch.get_pc(env)))
    return True

print("taint:", panda.taint_enabled())

panda.run_replay(recording_name)

print("taint:", panda.taint_enabled())

and this C program:

int main()
{
    asm volatile(
        "mov $6, %eax\n\t"
        "mov $7, %ebx\n\t"
        "mov $8, %ecx\n\t"
        "cpuid\n\t"
    );
}

compiled with gcc:

$ gcc -m32 -no-pie a.c

I get the following log:

root@1ae1ea3b9c44:/app/test# ./panda.py 
using generic i386
os_name=[linux-32-ubuntu:4.4.200-170-generic]
PANDA[core]:os_familyno=2 bits=32 os_details=ubuntu:4.4.200-170-generic
[PYPANDA] Panda args: [/panda/build/i386-softmmu/libpanda-i386.so -L /panda/build/pc-bios /root/.panda/ubuntu_1604_x86.qcow -display none -m 1024 -serial unix:/tmp/pypanda_sd7oljl9e,server,nowait -monitor unix:/tmp/pypanda_mogb3a0em,server,nowait]
Running setup command cd /root/test
Setup command results: -bash: cd: /root/test: No such file or directory
writing snapshot:       ./test_recording-rr-snp
opening nondet log for write:   ./test_recording-rr-nondet.log
Finished recording
pandalogging to [pandalog.plog]
PANDA[core]:loading required plugin taint2
PANDA[core]:initializing taint2
PANDA[taint2]:propagation via pointer dereference ENABLED
PANDA[taint2]:taint operations inlining DISABLED
PANDA[taint2]:llvm optimizations DISABLED
PANDA[taint2]:taint debugging DISABLED
PANDA[taint2]:detaint if control bits 0 DISABLED
PANDA[taint2]:maximum taint compute number (0=unlimited) 0
PANDA[taint2]:maximum taintset cardinality (0=unlimited) 0
PANDA[core]:loading required plugin callstack_instr
PANDA[core]:initializing callstack_instr
callstack_instr:  setting up threaded stack_type
PANDA[core]:loading required plugin osi
PANDA[core]:initializing osi
PANDA[core]:loading required plugin osi_linux
PANDA[core]:initializing osi_linux
PANDA[osi_linux]:W> failed to read task.start_time_offset
PANDA[osi_linux]:W> failed to read task.switch_task_hook_addr
PANDA[osi_linux]:W> kernelinfo bytes [20-23] not read
PANDA[core]:loading required plugin syscalls2
PANDA[core]:initializing syscalls2
PANDA[syscalls2]:using profile for linux x86 32-bit
PANDA[core]:loading required plugin hooks
PANDA[core]:initializing hooks
taint: 0
loading snapshot
... done.
opening nondet log for read :   ./test_recording-rr-nondet.log
EAX: 0x0        -> 0x00000000
ECX: 0xb76fdbc8 -> 0xb76fdbc8
EDX: 0xb7722000 -> 0xb7722000   -> 0x00023f40
EBX: 0xb7722000 -> 0xb7722000   -> 0x00023f40
ESP: 0xbfec19fc -> 0xbfec19fc   -> 0xb7722000   -> 0x00023f40
EBP: 0x8048034  -> 0x08048034
ESI: 0x9        -> 0x00000009
EDI: 0xb7722000 -> 0xb7722000   -> 0x00023f40
pc= 0xb77141a1
RR rr_assertion `current_item' failed at /panda/panda/src/rr/rr_log.c:978
Current log point:
{guest_instr_count=2491650}
Next log entry type: RR_INPUT_8
Current replay point:
{guest_instr_count=2491610}
Aborted (core dumped)

Current behavior:

  • the callback is called once, but this does not correspond to the CPUID call in my program. Indeed if I remove the CPUID call, I get the same panda output. (additionally, the values in the registers do not correspond to the values present there before the CPUID call)
  • the taint2 hypercall callback is apparently not called ; the value 6 in eax should trigger the action ENABLE_TAINT

Do you know if something is wrong in my setup, or if there is indeed a problem with panda?

Thank you in advance for your help

Valentin

vogr avatar Jun 04 '22 10:06 vogr

There's a chance the issue is that you're trying to take a recording and replay it in the same script - we've seen some weird issues about that in the past and I can't recall if we've fixed them. If you first take the recording, then run the analysis on the reply does anything change?

AndrewFasano avatar Aug 08 '22 13:08 AndrewFasano

Thank you for the reply @AndrewFasano. I tried splitting the script into two, but this did not change the behavior I observed.

However I think I found the issue with my script: the hypercall callback always returns True, even for non-hypercall uses of cpuid!

I changed my code to detect hypercalls by checking for a "magic value" (0x4321 in ebx) as recommended in the docs.

With main.c:

int main()
{
    asm volatile(
        "mov $0x6, %eax\n\t"
        "mov $0x4321, %ebx\n\t"
        "mov $0x9876, %ecx\n\t"
        "cpuid\n\t"
    );
}

and panda.py (note that now we check the value in ebx before deciding on the return value of the callback):

#!/usr/bin/env python3

from pandare import Panda

panda = Panda(generic="i386")
recording_name = "test_recording"

@panda.queue_blocking
def run_cmd():
    panda.record_cmd(
            "./main",
            recording_name=recording_name,
            copy_directory="app",
            setup_command="cd /root/app",
            )

    panda.end_analysis()


panda.run()


panda.set_pandalog("pandalog.plog")

panda.load_plugin("taint2")


@panda.cb_guest_hypercall
def hypercall_handler(env):
    ebx = panda.arch.get_reg(env, "EBX")
    if ebx != 0x4321:
        return False
    panda.arch.dump_regs(env)
    print("pc= 0x{:x}".format(panda.arch.get_pc(env)))
    return True

print("taint:", panda.taint_enabled())

panda.run_replay(recording_name)

print("taint:", panda.taint_enabled())

With this change everything works as expected! Note that taint gets enabled by the hypercall, and we can see the expected values in eax, ebx and ecx in the hypercall log.

root:/app/test# ./test.py 
using generic i386
os_name=[linux-32-ubuntu:4.4.200-170-generic]
PANDA[core]:os_familyno=2 bits=32 os_details=ubuntu:4.4.200-170-generic
[PYPANDA] Panda args: [/panda/build/i386-softmmu/libpanda-i386.so -L /panda/build/pc-bios /root/.panda/ubuntu_1604_x86.qcow -display none -m 1024 -serial unix:/tmp/pypanda_swnafkx4x,server,nowait -monitor unix:/tmp/pypanda_mncwl9dmh,server,nowait]
Running setup command cd /root/app
Setup command results: 
writing snapshot:	./test_recording-rr-snp
opening nondet log for write:	./test_recording-rr-nondet.log
Finished recording
pandalogging to [pandalog.plog]
PANDA[core]:loading required plugin taint2
PANDA[core]:initializing taint2
PANDA[taint2]:propagation via pointer dereference ENABLED
PANDA[taint2]:taint operations inlining DISABLED
PANDA[taint2]:llvm optimizations DISABLED
PANDA[taint2]:taint debugging DISABLED
PANDA[taint2]:detaint if control bits 0 DISABLED
PANDA[taint2]:maximum taint compute number (0=unlimited) 0
PANDA[taint2]:maximum taintset cardinality (0=unlimited) 0
PANDA[core]:loading required plugin callstack_instr
PANDA[core]:initializing callstack_instr
callstack_instr:  setting up threaded stack_type
PANDA[core]:loading required plugin osi
PANDA[core]:initializing osi
PANDA[core]:loading required plugin osi_linux
PANDA[core]:initializing osi_linux
PANDA[osi_linux]:W> failed to read task.start_time_offset
PANDA[osi_linux]:W> failed to read task.switch_task_hook_addr
PANDA[osi_linux]:W> kernelinfo bytes [20-23] not read
PANDA[core]:loading required plugin syscalls2
PANDA[core]:initializing syscalls2
PANDA[syscalls2]:using profile for linux x86 32-bit
PANDA[core]:loading required plugin hooks
PANDA[core]:initializing hooks
taint: 0
loading snapshot
... done.
opening nondet log for read :	./test_recording-rr-nondet.log
./test_recording-rr-nondet.log:     2708521 instrs total.
taint2: enabling taint processing
PANDA[taint2]:taint2_enable_taint
taint2: Allocating large fast_shad (25769803776 bytes).
taint2: Allocating small fast_shad (19200000 bytes) using malloc @ 0x7efe861b8010.
taint2: Allocating small fast_shad (384 bytes) using malloc @ 0x7f04cc4aa220.
taint2: Allocating small fast_shad (768 bytes) using malloc @ 0x7f04cc3800a0.
taint2: Allocating small fast_shad (986112 bytes) using malloc @ 0x7f04ccbe9e00.
PANDA[taint2]:LLVM optimizations DISABLED
taint2: Initializing taint ops
taint2: Done initializing taint transformation.
PANDA[taint2]:Done processing helper functions for taint.
Done verifying module. Running...
EAX: 0x6	-> 0x00000006	
ECX: 0x9876	-> 0x00009876	
EDX: 0xbf950144	-> 0xbf950144	-> 0x00000000	
EBX: 0x4321	-> 0x00004321	
ESP: 0xbf950118	-> 0xbf950118	-> 0x00000000	
EBP: 0xbf950118	-> 0xbf950118	-> 0x00000000	
ESI: 0xb77b7000	-> 0xb77b7000	-> 0x001b2db0	
EDI: 0xb77b7000	-> 0xb77b7000	-> 0x001b2db0	
pc= 0x8049196
Replay completed successfully
writing chunk 0 of pandalog, 26 / 19 = 1.37 compression, 1 entries
header: version=2  dir_pos=147 chunk_size=16777216
Exiting cpu_handle_execption loop
taint: False

I hope this helps anyone stumbling on the issue in the future!

vogr avatar Aug 15 '22 09:08 vogr