rainbow
Unable to trace glibc functions
Hi guys,
I am facing some issues when trying to trace binaries with calls to glibc functions. I have written a C toy example:
#include <string.h>

int main()
{
    char str[50];
    memset(str, 0x0, 50);
    return 0;
}
and the corresponding rainbow script (very simple as well):
from rainbow.generics import rainbow_x64
e = rainbow_x64()
e.load("memset_ex1", typ=".elf")
e.mem_trace = 1
e.trace_regs = 1
e.function_calls = 1
e.start(e.functions["main"], 0)
The resulting output can be read in this file: output_rainbow_memset.txt
I remember having similar issues with Unicorn when glibc was not mapped (a hook that skipped the calling instruction was enough), but I assume the other binaries in your examples also call glibc functions. Am I missing something?
Thanks in advance.
Hi,
Indeed, this is the same problem you would get with Unicorn. You have two options.
The first is to hook the calls you are interested in and reimplement them in Python. For example, assuming I have my calling convention right:
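e.load("memset_ex1", typ=".elf")

def my_memset(emu):
    # 'emu' is the whole rainbow instance
    addr = emu['rdi']          # 1st argument: destination pointer
    value = emu['rsi'] & 0xFF  # 2nd argument: memset only uses the low byte
    length = emu['rdx']        # 3rd argument: byte count
    emu[addr:addr+length] = value
    emu['rax'] = addr          # memset returns the destination pointer
    return True  # Tell Unicorn/Rainbow to return to the call site

e.stubbed_functions['memset'] = my_memset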
This is useful if you need to change the behaviour of a function like in this example, but painful if you have more complex library calls to implement.
The second option is, as you mentioned, mapping the library into the emulator's memory, and then fixing the relocations by hand. I have not experimented yet with this solution.
What about simply skipping the call to memset in this case (i.e., not tracing it), as you can do when using Unicorn directly:
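from unicorn.x86_const import UC_X86_REG_RIP

instructions_skip_list = [addr_of_memset]
def my_hook(mu, address, size, user_data):
    # Jump over any instruction whose address is in the skip list
    if address in instructions_skip_list:
        mu.reg_write(UC_X86_REG_RIP, address + size)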
Do you have an API for doing that? (If not, it could be an interesting feature: in many cases these kinds of operations are just noise, so skipping them would help reduce the length of traces.)
Yes, you can do it like this (pretty much like the reimplementation option, only this time the function does nothing but return):
def bypass(emu):
    return True

e.stubbed_functions['memset'] = bypass
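If several library calls are just noise in the trace, the same bypass can be installed for all of them in a loop. The following is only a sketch: `bypass_all` is a hypothetical helper (not part of rainbow), and the stand-in object exists only so the snippet runs without rainbow installed. Keep in mind that skipping a call with side effects (memcpy, for instance) changes what the emulated program computes.

```python
from types import SimpleNamespace

def bypass(emu):
    # Do nothing and tell the emulator to return to the call site.
    return True

def bypass_all(emu, names):
    """Install the do-nothing stub for every name in `names` (hypothetical helper)."""
    for name in names:
        emu.stubbed_functions[name] = bypass

# Stand-in object so the sketch runs on its own;
# with rainbow, pass the real emulator instance instead.
fake_emu = SimpleNamespace(stubbed_functions={})
bypass_all(fake_emu, ["memset", "memcpy", "puts"])
```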
@snx90 Did you manage to do what you wanted?
Hi @yhql, yes I did. I followed your approach which allowed me to trace simple libc functions. Not a perfect solution, but it works for my current tasks.
Hi, I tried the same approach, inspired by the examples (using a bypass function), but the stub does not seem to be applied when the function is emulated. I get the following error:
5FF4 add byte ptr [rax], al ;
5FF6 add byte ptr [rax], al ;
5FF8 add byte ptr [rax], al ;
5FFA add byte ptr [rax], al ;
Exception ignored on calling ctypes callback function: <bound method Uc._hook_mem_invalid_cb of <unicorn.unicorn.Uc object at 0x7f1f3664a490>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/unicorn/unicorn.py", line 443, in _hook_mem_invalid_cb
    return cb(self, access, address, size, value, data)
  File "/home/user/.local/lib/python3.8/site-packages/rainbow-1.0-py3.8.egg/rainbow/rainbow.py", line 371, in unmapped_hook
Exception: Unmapped fetch at 0x6000 (Emu stopped in 5ffc)
Is it a known problem?
Do you have an example on how to map the library as suggested in the "second option"?
Hi, sorry, I am not sure what is happening from the trace you are showing. Can you post an excerpt of the Python code you use to set up the execution? Sadly, I don't have an example of the second option yet.
Here is the code:
## definition of bypass function
device = rainbow_x64(sca_mode=False, local_vars=globals())
device.load("tracer.elf", typ=".elf", verbose=True)
## code to set arguments and everything else
device.stubbed_functions["memset"] = bypass
device.start(device.functions["encrypt_ecb"], 0)
From what I see in the trace, execution runs the code at the address of the memset function, so the bypass is not applied. Indeed, looking at device.stubbed_functions, the entry for memset is not set, although it works like a charm for non-imported symbols. I am using the latest versions of rainbow and Unicorn.
Is "memset" also in device.function_names.keys()? If not, the hook will not actually work.
If that's the case that might mean the ELF parser did not register this symbol, which should be fixable.
keys() only shows the addresses, but memset is not in the values() either.
lief 0.10.1 is installed
True ! My bad. So this seems like the loader does not fetch all external symbols.
I'll try to reproduce and find a fix.
In the meantime, you can probably try to add "memset" manually into device.functions and device.function_names, knowing its address.
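A minimal sketch of that workaround, under the assumptions suggested by this thread: device.functions maps a name to an address (as in device.functions["encrypt_ecb"]) and device.function_names maps an address to a name. `register_function` is a hypothetical helper, the addresses are made up, and the stand-in object is only there so the snippet runs without rainbow. The real address would have to come from the disassembly, for instance the PLT entry that the call targets.

```python
from types import SimpleNamespace

def register_function(device, name, address):
    """Record `name` at `address` in both lookup tables (hypothetical helper)."""
    device.functions[name] = address        # assumed layout: name -> address
    device.function_names[address] = name   # assumed layout: address -> name

# Stand-in for a loaded emulator; 0x1040 and 0x1200 are made-up addresses.
fake_device = SimpleNamespace(
    functions={"encrypt_ecb": 0x1200},
    function_names={0x1200: "encrypt_ecb"},
)
register_function(fake_device, "memset", 0x1040)
```

Whether the stub then fires depends on how rainbow matches addresses against these tables, so this should be checked against a short trace.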
thanks a lot!
Do you have a mangled symbol for memset instead in the list of functions (like memset@@GLIBC_X.X.X)?
EDIT: The trace you get is most likely because the symbol is present but defined at address 0, since it is a dynamic relocation. Execution therefore runs through a stretch of zero bytes (each pair decodes as add byte ptr [rax], al) until it reaches an address that is no longer mapped.
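One way to check for this situation is to look for entries whose name matches the function, plain or with a version suffix, and see which address they resolve to. This is only a sketch: `find_symbol` is a hypothetical helper, the table below is made up, and it assumes the name-to-address layout of device.functions.

```python
def find_symbol(functions, base_name):
    """Return {name: address} for entries named `base_name`, plain or versioned."""
    matches = {}
    for name, address in functions.items():
        plain = name.split("@", 1)[0]   # "memset@@GLIBC_X.X.X" -> "memset"
        if plain == base_name:
            matches[name] = address
    return matches

# Made-up table for illustration; with rainbow this would be device.functions.
example = {"main": 0x1139, "memset@@GLIBC_X.X.X": 0, "encrypt_ecb": 0x1200}
found = find_symbol(example, "memset")
unresolved = [name for name, address in found.items() if address == 0]
```

Any name that ends up in `unresolved` is a symbol the loader knows about but has placed at address 0, which would match the run of add byte ptr [rax], al instructions in the trace.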
No.
I've memset@@GLIBC_X.X.X.