panda
Taint analysis examples do not work or are out of date
Description
It seems some of the Python examples are not working (anymore). In particular, I am looking into the implementation of taint.py. Running this script as-is does not work / segfaults.
Findings
Based on the errors I received in the stdout, I identified the following problems:
1. The string comparison on line 25 should be a bytes comparison, or it will never taint anything, as fd_to_fname returns bytes.
2. The call to panda.taint_label_ram requires a label argument.
3. taint2 should be loaded/enabled before panda.run is called.
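To illustrate point 1, here is a minimal standalone sketch that does not need PANDA. It assumes the filename arrives as bytes, as it does when it comes from panda.ffi.string on a char pointer. The helper name matches_path is hypothetical and not part of pypanda.

```python
def matches_path(fname, path="/etc/passwd"):
    """Compare a filename that may be bytes or str against a str path.

    Hypothetical helper for illustration only; not part of pypanda.
    """
    if isinstance(fname, bytes):
        # Decode so the comparison is str against str.
        fname = fname.decode("utf-8", errors="replace")
    return fname == path


raw = b"/etc/passwd"  # stand-in for the bytes value returned by fd_to_fname

print(raw == "/etc/passwd")   # False: bytes and str never compare equal in Python 3
print(raw == b"/etc/passwd")  # True: bytes against bytes
print(matches_path(raw))      # True: decoded before comparing
```

Comparing against a bytes literal, as in the modified script, is the smallest change. Decoding once is an alternative if the name is also used in formatted output.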
Points 1 and 2 are easy enough to address, but I have some trouble with point 3. When I add the following lines before the call to panda.run, PANDA crashes with a segfault upon calling panda.taint_enable:
panda.load_plugin("taint2")
panda.taint_enable()
Example output:
root@5a4e2fa4f486:/local# python taint.py
using generic x86_64
os_name=[linux-64-ubuntu:4.15.0-72-generic-noaslr-nokaslr]
PANDA[core]:os_familyno=2 bits=64 os_details=ubuntu:4.15.0-72-generic-noaslr-nokaslr
[PYPANDA] Panda args: [/usr/local/lib/python3.8/dist-packages/pandare/data/x86_64-softmmu/libpanda-x86_64.so -L /usr/local/share/panda /root/.panda/bionic-server-cloudimg-amd64-noaslr-nokaslr.qcow2 -display none -m 1024 -serial unix:/tmp/pypanda_s2hy2gyig,server,nowait -monitor unix:/tmp/pypanda_m90o4s3uq,server,nowait]
PANDA[osi_linux]:W> kernelinfo bytes [20-23] not read
PANDA[syscalls2]:using profile for linux x64 64-bit
PPP automatically loaded plugin syscalls2
PPP automatically loaded plugin taint2
PANDA[taint2]:propagation via pointer dereference ENABLED
PANDA[taint2]:taint operations inlining DISABLED
PANDA[taint2]:llvm optimizations DISABLED
PANDA[taint2]:taint debugging DISABLED
PANDA[taint2]:detaint if control bits 0 DISABLED
PANDA[taint2]:maximum taint compute number (0=unlimited) 0
PANDA[taint2]:maximum taintset cardinality (0=unlimited) 0
callstack_instr: setting up threaded stack_type
PANDA[taint2]:taint2_enable_taint
taint2: Allocating small fast_shad (0 bytes) using malloc @ 0x29ac260.
taint2: Allocating small fast_shad (19200000 bytes) using malloc @ 0x7f9eb847a010.
taint2: Allocating small fast_shad (384 bytes) using malloc @ 0x2b9d590.
taint2: Allocating small fast_shad (3072 bytes) using malloc @ 0x2cf7dd0.
taint2: Allocating small fast_shad (1030272 bytes) using malloc @ 0x7f9ec8347010.
PANDA[taint2]:LLVM optimizations DISABLED
taint2: Initializing taint ops
taint2: Done initializing taint transformation.
Segmentation fault (core dumped)
root@5a4e2fa4f486:/local#
The GDB stack trace (see bottom of issue) seems to indicate that the crash happens in the PandaTaintVisitor class, called by the taint2_enable_taint function.
Am I missing something?
Details
Full (modified) script:
from pandare import Panda
panda = Panda(generic='x86_64')

@panda.queue_blocking
def driver():
    panda.revert_sync('root')
    print(panda.run_serial_cmd("grep root /etc/passwd"))
    panda.end_analysis()

panda.require("osi")
panda.require("osi_linux")

def fd_to_fname(cpu, fd):
    proc = panda.plugins['osi'].get_current_process(cpu)
    procname = panda.ffi.string(proc.name) if proc != panda.ffi.NULL else "error"
    fname_ptr = panda.plugins['osi_linux'].osi_linux_fd_to_filename(cpu, proc, fd)
    fname = panda.ffi.string(fname_ptr) if fname_ptr != panda.ffi.NULL else "error"
    return fname

@panda.ppp("syscalls2", "on_sys_read_return")
def read(cpu, tb, fd, buf, cnt):
    fname = fd_to_fname(cpu, fd)
    print(f"read {fname}")
    if fname == b"/etc/passwd":  # <-- changed to bytes string
        for idx in range(cnt):
            panda.taint_label_ram(buf+idx, 1)  # <-- added taint label 1 (not sure about the expected type?)

@panda.ppp("taint2", "on_branch2")
def something(addr, size, from_helper, tainted):
    print("Tainted branch")

# Added plugin loading/enabling
panda.load_plugin("taint2")
panda.taint_enable()
panda.run()
GDB stack trace
0x00007fc244615bb8 in llvm::PandaTaintVisitor::insertStateOp(llvm::Instruction&) ()
from /usr/local/lib/panda/x86_64/panda_taint2.so
(gdb) bt
#0 0x00007fc244615bb8 in llvm::PandaTaintVisitor::insertStateOp(llvm::Instruction&) ()
from /usr/local/lib/panda/x86_64/panda_taint2.so
#1 0x00007fc244619d35 in llvm::PandaTaintFunctionPass::runOnFunction(llvm::Function&) ()
from /usr/local/lib/panda/x86_64/panda_taint2.so
#2 0x00007fc2446094c1 in taint2_enable_taint () from /usr/local/lib/panda/x86_64/panda_taint2.so
#3 0x00007fc259622ff5 in ?? () from /lib/x86_64-linux-gnu/libffi.so.7
#4 0x00007fc25962240a in ?? () from /lib/x86_64-linux-gnu/libffi.so.7
#5 0x00007fc2588810a7 in cdata_call (cd=<optimized out>, args=<optimized out>, kwds=<optimized out>)
at src/c/_cffi_backend.c:3201
#6 0x00000000005f7506 in _PyObject_MakeTpCall ()
#7 0x0000000000570b8e in _PyEval_EvalFrameDefault ()
#8 0x00000000005f6ce6 in _PyFunction_Vectorcall ()
#9 0x000000000056b619 in _PyEval_EvalFrameDefault ()
#10 0x00000000005697da in _PyEval_EvalCodeWithName ()
#11 0x000000000068e547 in PyEval_EvalCode ()
#12 0x000000000067dbf1 in ?? ()
#13 0x000000000067dc6f in ?? ()
#14 0x000000000067dd11 in ?? ()
#15 0x000000000067fe37 in PyRun_SimpleFileExFlags ()
#16 0x00000000006b7c82 in Py_RunMain ()
#17 0x00000000006b800d in Py_BytesMain ()
#18 0x00007fc259d69083 in __libc_start_main (main=0x4ef140 <main>, argc=2, argv=0x7fffa36cf438, init=<optimized out>,
fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffa36cf428) at ../csu/libc-start.c:308
#19 0x00000000005fb85e in _start ()
Based on taint_x86_64.py, I have come to realize that I should enable the taint analysis after the machine is set up, e.g., inside a @panda.cb_after_machine_init callback. Is this the typical approach to take, or is there another (better) way?
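For reference, this is roughly what I mean by deferring the call. It is a minimal sketch that assumes a Panda object exposing the cb_after_machine_init decorator and taint_enable, as used above. The wrapper name enable_taint_after_init is hypothetical, and I have not confirmed this is the recommended pattern.

```python
def enable_taint_after_init(panda):
    """Register a callback that enables taint2 once the machine is initialized.

    Hypothetical wrapper for illustration; `panda` is assumed to be a
    pandare.Panda instance (or anything with the same two attributes).
    """
    @panda.cb_after_machine_init
    def _enable_taint(cpu):
        # Runs after machine init instead of at script import time,
        # which is where the direct call segfaulted for me.
        panda.taint_enable()

    return _enable_taint


# Intended usage with a real instance (requires pandare and a guest image):
#   from pandare import Panda
#   panda = Panda(generic='x86_64')
#   enable_taint_after_init(panda)
#   panda.run()
```

The only difference from my modified script is when panda.taint_enable runs: inside the callback rather than directly before panda.run.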