SIGSEGV when using asan in aarch64 qemu mode
IMPORTANT
- You have verified that the issue is present in the current main branch: Yes
$ git log | head -n 1
commit 453d733a3562dcea290265dafec1908832f97658
Describe the bug
I first hit this issue while reproducing the results of the android fuzzer in libafl_qemu_artifact. When I added --features asan to the fuzzer's build, it crashed and the log showed:
qemu: QEMU internal SIGSEGV {code=MAPERR, addr=0x1555d554de02}
Segmentation fault(core dumped)
I debugged this issue with gdb-multiarch and found that it is caused by a failed dereference of a shadow memory address:
0x5555557307b5 <libafl_qemu::modules::usermode::asan::AsanGiovese::fake_syscall+341> lea rax, [rip + 0x8ac3e4] RAX => 0x555555fdcba0 (guest_base) ◂— 0
0x5555557307bc <libafl_qemu::modules::usermode::asan::AsanGiovese::fake_syscall+348> mov rcx, qword ptr [rax] RCX, [guest_base] => 0
0x5555557307bf <libafl_qemu::modules::usermode::asan::AsanGiovese::fake_syscall+351> xor eax, eax EAX => 0
0x5555557307c1 <libafl_qemu::modules::usermode::asan::AsanGiovese::fake_syscall+353> nop word ptr cs:[rax + rax]
0x5555557307d0 <libafl_qemu::modules::usermode::asan::AsanGiovese::fake_syscall+368> lea rdx, [rcx + rbx] RDX => 0xaaaaaaaaf010 ◂— 0
0x5555557307d4 <libafl_qemu::modules::usermode::asan::AsanGiovese::fake_syscall+372> sar rdx, 3
► 0x5555557307d8 <libafl_qemu::modules::usermode::asan::AsanGiovese::fake_syscall+376> mov byte ptr [rdx + 0x7fff8000], 0 <Cannot dereference [0x1555d554de02]>
0x5555557307df <libafl_qemu::modules::usermode::asan::AsanGiovese::fake_syscall+383> add rbx, 8
This is in the function "libafl_qemu::modules::usermode::asan::AsanGiovese::unpoison" (presumably inlined into fake_syscall, which is the symbol the disassembly shows), defined in libafl_qemu/src/modules/usermode/asan.rs:
pub fn unpoison(qemu: Qemu, addr: GuestAddr, n: usize) -> bool {
unsafe {
let n = n as isize;
let mut start = addr;
let end = start.wrapping_add(n as GuestAddr);
while start < end {
let h = qemu.g2h::<*const c_void>(start) as isize;
let shadow_addr = ((h >> 3) as *mut i8).offset(SHADOW_OFFSET);
► *shadow_addr = 0;
start = (start).wrapping_add(8);
}
true
}
}
In my case, the original start addr is 0xaaaaaaaaf010, n is 0x158, and the end addr is 0xaaaaaaaaf168. When it executes (h >> 3), 0xaaaaaaaaf010 becomes 0x155555555e02. SHADOW_OFFSET is 0x7fff8000, so shadow_addr is 0x1555d554de02. Neither 0x155555555e02 nor 0x1555d554de02 is addressable:
pwndbg>x/x 0x155555555e02
0x155555555e02:
Cannot access memory at address 0x155555555e02
This happens in libafl 0.11.2. I also tried 0.13.2 and the problem is still there.
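The arithmetic above can be replayed outside the fuzzer. The following is a minimal sketch, not the LibAFL code: it assumes guest_base is 0 (as the register dump shows), uses unsigned arithmetic instead of the signed shift in the original, and the helper names `shadow_addr` and `shadow_bytes_touched` are made up for illustration.

```rust
/// Shadow offset quoted in the report.
const SHADOW_OFFSET: u64 = 0x7fff_8000;

/// Map a host address to its shadow byte: one shadow byte covers 8 bytes.
/// (Hypothetical helper; mirrors `(h >> 3) + SHADOW_OFFSET` from the report.)
fn shadow_addr(host_addr: u64) -> u64 {
    (host_addr >> 3).wrapping_add(SHADOW_OFFSET)
}

/// Count the shadow bytes written when walking [start, start + n)
/// in 8-byte steps, as the loop in `unpoison` does.
fn shadow_bytes_touched(start: u64, n: u64) -> u64 {
    let end = start.wrapping_add(n);
    let mut cur = start;
    let mut count = 0;
    while cur < end {
        count += 1;
        cur = cur.wrapping_add(8);
    }
    count
}

fn main() {
    let start = 0xaaaa_aaaa_f010_u64; // start address from the crash
    println!("shadow index: {:#x}", start >> 3);
    println!("shadow addr:  {:#x}", shadow_addr(start));
    println!("shadow bytes: {}", shadow_bytes_touched(start, 0x158));
}
```

The first shadow address it prints matches the faulting address in the QEMU log (0x1555d554de02), which suggests the very first write of the loop is the one that faults.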
-------------------------------------8<----------------------------------
I saw the similar issue 2579, so I tried the example fuzzer qemu_launcher on the latest main (the commit shown at the beginning). With --features=x86_64, asan works well:
pwndbg> p/x end
$3= 0x7ffff5b004a8
pwndbg> p/x start
$4= 0x7ffff5b002a0
pwndbg> p n
$5 =<optimized out>
pwndbg> p/x end-start
$6 = 0x208
==============
0x7ffff5b002a0 >> 3 = 0xffffeb60054
==============
pwndbg> x/x 0xffffeb60054
0xffffeb60054: 0x00000000
The start addr is 0x7ffff5b002a0. After the right shift it becomes 0xffffeb60054, and this address is addressable.
But with --features=aarch64, asan crashes for the same reason, just in a different code location:
pwndbg> set args "--input" "./corpus" "--output" "/home/LibAFL/fuzzers/binary_only/qemu_launcher/target/aarch64/output/" "--cores" "0-7" "--asan-cores" "0-3" "--cmplog-cores" "2-5" "--verbose" "--" "/home/LibAFL/fuzzers/binary_only/qemu_launcher/target/aarch64/libpng-harness-aarch64"
pwndbg> r
Starting program: /home/LibAFL/fuzzers/binary_only/qemu_launcher/target/aarch64/release/qemu_launcher-aarch64 "--input" "./corpus" "--output" "/home/LibAFL/fuzzers/binary_only/qemu_launcher/target/aarch64/output/" "--cores" "0-7" "--asan-cores" "0-3" "--cmplog-cores" "2-5" "--verbose" "--" "/home/LibAFL/fuzzers/binary_only/qemu_launcher/target/aarch64/libpng-harness-aarch64"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7800640 (LWP 3793330)]
Thread 1 "qemu_launcher-a" received signal SIGSEGV, Segmentation fault.
libafl_qemu::modules::usermode::asan::AsanModule::read_8 (self=0x55555923c428, pc=<optimized out>, addr=<optimized out>) at /home/LibAFL/libafl_qemu/src/modules/usermode/asan.rs:868
868 if self.enabled() && AsanGiovese::is_invalid_access_8(qemu, addr) {
Warning: the current language does not match this frame.
LEGEND: STACK | HEAP | CODE | DATA | WX | RODATA
───────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]───────────────────────────────
RAX 0x55555923c400 ◂— 0
RBX 0xaaaaaab0ff60 —▸ 0x7ffff79fdb58 ◂— 0xadb771622a56ae00
RCX 0xaaaaaab0ff60 —▸ 0x7ffff79fdb58 ◂— 0xadb771622a56ae00
RDX 0xaaaaaaaa1788 ◂— 0xf9400001f947b000
RDI 0x55555923ded0 ◂— 0
RSI 0x155555561fec
R8 0xaaaaaab0ff60 —▸ 0x7ffff79fdb58 ◂— 0xadb771622a56ae00
R9 0x55555564efb0 (libafl_qemu::modules::usermode::asan::trace_read8_asan) ◂— mov rax, qword ptr [rdi + 0x120]
R10 0
R11 0xffffedbffcd ◂— 0
R12 0x7ffff79fdb58 ◂— 0xadb771622a56ae00
R13 0xaaaaaaaa1fa0 ◂— 0x2a0003e152800000
R14 0x7fffe8000100 (code_gen_buffer+211) ◂— mov ebx, dword ptr [rbp - 0x10] /* 0xce8c0fdb85f05d8b */
R15 0x7fffe8000040 (code_gen_buffer+19) —▸ 0xaaaaaaaa176c ◂— 0xa9017bfdd10303ff
RBP 0x5555591df880 ◂— 0
RSP 0x7fffffffa3b8 —▸ 0x7fffe8000273 (code_gen_buffer+582) ◂— mov rbx, qword ptr [rbp + 0x40] /* 0x49e38b4c405d8b48 */
RIP 0x55555564efce (libafl_qemu::modules::usermode::asan::trace_read8_asan+30) ◂— cmp byte ptr [rsi + 0x7fff8000], 0
────────────────────────────────────────[ DISASM / x86-64 / set emulate on ]────────────────────────────────────────
► 0x55555564efce <libafl_qemu::modules::usermode::asan::trace_read8_asan+30> cmp byte ptr [rsi + 0x7fff8000], 0
0x55555564efd5 <libafl_qemu::modules::usermode::asan::trace_read8_asan+37> je libafl_qemu::modules::usermode::asan::trace_read8_asan+90 <libafl_qemu::modules::usermode::asan::trace_read8_asan+90>
0x55555564efd7 <libafl_qemu::modules::usermode::asan::trace_read8_asan+39> sub rsp, 0x28
0x55555564efdb <libafl_qemu::modules::usermode::asan::trace_read8_asan+43> mov rdi, qword ptr [rax + 0x48]
0x55555564efdf <libafl_qemu::modules::usermode::asan::trace_read8_asan+47> mov qword ptr [rsp + 0x10], rcx
0x55555564efe4 <libafl_qemu::modules::usermode::asan::trace_read8_asan+52> mov qword ptr [rsp + 0x18], 8
0x55555564efed <libafl_qemu::modules::usermode::asan::trace_read8_asan+61> mov qword ptr [rsp + 8], 2
0x55555564eff6 <libafl_qemu::modules::usermode::asan::trace_read8_asan+70> lea rax, [rsp + 8]
0x55555564effb <libafl_qemu::modules::usermode::asan::trace_read8_asan+75> mov rsi, rdx
0x55555564effe <libafl_qemu::modules::usermode::asan::trace_read8_asan+78> mov rdx, rax
0x55555564f001 <libafl_qemu::modules::usermode::asan::trace_read8_asan+81> call libafl_qemu::modules::usermode::asan::AsanGiovese::report_or_crash <libafl_qemu::modules::usermode::asan::AsanGiovese::report_or_crash>
─────────────────────────────────────────────────[ SOURCE (CODE) ]──────────────────────────────────────────────────
In file: /home/LibAFL/libafl_qemu/src/modules/usermode/asan.rs:868
863 self.rt.report_or_crash(qemu, pc, AsanError::Read(addr, 4));
864 }
865 }
866
867 pub fn read_8(&mut self, qemu: Qemu, pc: GuestAddr, addr: GuestAddr) {
► 868 if self.enabled() && AsanGiovese::is_invalid_access_8(qemu, addr) {
869 self.rt.report_or_crash(qemu, pc, AsanError::Read(addr, 8));
870 }
871 }
872
873 pub fn read_n(&mut self, qemu: Qemu, pc: GuestAddr, addr: GuestAddr, size: usize) {
─────────────────────────────────────────────────────[ STACK ]──────────────────────────────────────────────────────
00:0000│ rsp 0x7fffffffa3b8 —▸ 0x7fffe8000273 (code_gen_buffer+582) ◂— mov rbx, qword ptr [rbp + 0x40] /* 0x49e38b4c405d8b48 */
01:0008│ 0x7fffffffa3c0 —▸ 0x5555591b9218 (tcg_init_ctx+2008) —▸ 0x6201010203 ◂— 0
02:0010│ 0x7fffffffa3c8 ◂— 0
03:0018│ 0x7fffffffa3d0 —▸ 0x5555591ba210 (tcg_init_ctx+6096) —▸ 0x800101000c ◂— 0
04:0020│ 0x7fffffffa3d8 ◂— 0x1530
05:0028│ 0x7fffffffa3e0 ◂— 0x5030 /* '0P' */
06:0030│ 0x7fffffffa3e8 ◂— 0
07:0038│ 0x7fffffffa3f0 ◂— 7
───────────────────────────────────────────────────[ BACKTRACE ]────────────────────────────────────────────────────
► 0 0x55555564efce libafl_qemu::modules::usermode::asan::trace_read8_asan+30
1 0x55555564efce libafl_qemu::modules::usermode::asan::trace_read8_asan+30
2 0x7fffe8000273 code_gen_buffer+582
3 0x555555b92470 cpu_tb_exec+80
4 0x555555b93055 cpu_exec_loop.constprop+805
5 0x555555b93055 cpu_exec_loop.constprop+805
6 0x555555b93639 cpu_exec_setjmp.isra+41
7 0x555555b936cb cpu_exec+107
───────────────────────────────────────────────[ THREADS (2 TOTAL) ]────────────────────────────────────────────────
► 1 "qemu_launcher-a" stopped: 0x55555564efce <libafl_qemu::modules::usermode::asan::trace_read8_asan+30>
2 "qemu_launcher-a" stopped: 0x7ffff7b1e88d <syscall+29>
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Here it crashes when accessing [rsi + 0x7fff8000], which looks like the same problem as above.
pwndbg> x/x (0x155555561fec+0x7fff8000)
0x1555d5559fec: Cannot access memory at address 0x1555d5559fec
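As a cross-check, the faulting shadow address can be mapped back to the address being checked. This is a small sketch under the same assumptions as the report (shadow offset 0x7fff8000, guest_base 0); `app_addr_from_shadow` is a hypothetical helper, not a LibAFL function.

```rust
/// Shadow offset quoted in the report.
const SHADOW_OFFSET: u64 = 0x7fff_8000;

/// Invert the shadow mapping: return the 8-byte-aligned address whose
/// shadow byte lives at `shadow`. Assumes `shadow >= SHADOW_OFFSET`.
fn app_addr_from_shadow(shadow: u64) -> u64 {
    (shadow - SHADOW_OFFSET) << 3
}

fn main() {
    let rsi = 0x1555_5556_1fec_u64; // RSI at the faulting instruction
    let faulting = rsi + SHADOW_OFFSET;
    println!("faulting shadow address: {:#x}", faulting);
    println!("checked address:         {:#x}", app_addr_from_shadow(faulting));
}
```

The recovered address equals the value held in RBX, RCX and R8 in the register dump, so the crash comes from checking an ordinary access to an address in the 0xaaaa... region rather than from a corrupted pointer.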
=====================================================
0x00005555556445fd <+13>: lea rsi,[rip+0x2b44bc4] # 0x5555581891c8 <guest_base>
0x0000555555644604 <+20>: mov rsi,QWORD PTR [rsi]
0x0000555555644607 <+23>: add rsi,rcx
0x000055555564460a <+26>: sar rsi,0x3
=> 0x000055555564460e <+30>: cmp BYTE PTR [rsi+0x7fff8000],0x0
I am new to qasan, so I am still trying to figure out why this happens. Could you offer some help with this issue? Thank you very much!
To Reproduce
- Steps to reproduce the android fuzzer behavior:
I followed the instructions in libafl_qemu_artifact exactly.
- Steps to reproduce the qemu_launcher behavior:
git clone https://github.com/AFLplusplus/LibAFL.git
cd LibAFL/fuzzers/binary_only/qemu_launcher
export LLVM_CONFIG="llvm-config-15"
export QEMU_LD_PREFIX=/path/to/aarch64-linux-gnu/
cargo make aarch64
I modified Makefile.toml to add the feature simplemgr for the sake of clarity.
- Steps to debug qemu_launcher:
gdb-multiarch target/aarch64/release/qemu_launcher-aarch64
pwndbg> set args --input ./corpus/ --output target/aarch64/output/ --cores 0-1 --asan-cores 0 --cmplog-cores 1 -- target/aarch64/libpng-harness-aarch64
My environment info:
lsb_release -a && \
arch && \
llvm-config --version && \
rustup toolchain list && \
rustc -V
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
x86_64
14.0.0 (I export LLVM_CONFIG=llvm-config-15 when building the fuzzer's project)
stable-x86_64-unknown-linux-gnu (default)
nightly-x86_64-unknown-linux-gnu
rustc 1.80.1 (3f5fd8dd4 2024-08-06)
Expected behavior
The fuzzer works with asan on aarch64 as it does on x86_64.
About the android fuzzer, I found something strange. The address of my harnessDecode is 0xaaaaaaaabb70 (harnessDecode @ 0xaaaaaaaabb70). Another developer was able to run the android fuzzer with asan successfully, and his harnessDecode address starts with 0x7fff: harnessDecode @ 0x7ffff7fb4b68.
I think this might be the key to the matter, because in the x86_64 qemu_launcher case the addresses are laid out similarly to his.
Using pmap:
# his process space
pmap 89530 |grep harness
00007ffff7fb3000 4K r---- harness
00007ffff7fb4000 8K r---- harness
00007ffff7fb6000 4K r---- harness
00007ffff7fb7000 4K rw--- harness
=============================
# my process space
pmap 1809467 |grep harness
0000aaaaaaaab000 4K r---- harness
0000aaaaaaaad000 4K r---- harness
I am very confused about this...
thank you for the detailed report. i just saw you closed the issue, is it because your problem is solved?
No, I haven't fully solved it. I closed the issue because my colleague was able to run the android fuzzer with the aarch64 architecture and asan, so I guessed that my SIGSEGV is caused by something wrong in my setup rather than a bug in the project. But I'd be grateful if you could help.
So far, the only difference I have noticed is the address mapping of the harness on my colleague's machine and on mine:
# his process space
pmap 89530 |grep harness
00007ffff7fb3000 4K r---- harness
00007ffff7fb4000 8K r---- harness
00007ffff7fb6000 4K r---- harness
00007ffff7fb7000 4K rw--- harness
=============================
# my process space
pmap 1809467 |grep harness
0000aaaaaaaab000 4K r---- harness
0000aaaaaaaad000 4K r---- harness
Sorry for closing and reopening the issue; this is my first time submitting one :)
I debugged further and found that on my colleague's machine the mmap syscall returns an address starting with 0x7fff, while on mine it returns 0xaaaaaaaab000. The requested address is the same in both cases, 0xaaaaaaaab000, because this is ELF_ET_DYN_BASE as defined in elf.h.
__GI___mmap64 (addr=addr@entry=0xaaaaaaaab000, len=len@entry=16384, prot=prot@entry=0, flags=flags@entry=16418, fd=fd@entry=-1, offset=0) at ../sysdeps/unix/sysv/linux/mmap64.c:47
---------------------------------------8<-----------------------------------
__GI___mmap64 (addr=0xaaaaaaaab000, len=len@entry=20480, prot=prot@entry=0, flags=flags@entry=16418, fd=fd@entry=-1, offset=offset@entry=0) at ../sysdeps/unix/sysv/linux/mmap64.c:47
This mmap happens in init_qemu_with_asan:
#0 mmap_h_eq_g (offset=<optimized out>, fd=-1, page_flags=8, flags=16418, host_prot=0, len=16384, start=187649984475136) at ../linux-user/mmap.c:566
#1 target_mmap__locked (offset=<optimized out>, fd=-1, page_flags=8, flags=16418, target_prot=0, len=16384, start=187649984475136) at ../linux-user/mmap.c:894
#2 target_mmap (start=<optimized out>, len=16384, len@entry=12296, target_prot=target_prot@entry=0, flags=16418, fd=fd@entry=-1, offset=offset@entry=0) at ../linux-user/mmap.c:949
#3 0x0000555555a084a0 in load_elf_image (image_name=0x555555fdcbc0 <real_exec_path> "libafl_qemu_artifacts/android_fuzzer/harness", src=src@entry=0x555555fdca20 <bprm+1024>, info=info@entry=0x555555fdca80 <libafl_image_info>, ehdr=ehdr@entry=0x7fffffffca90, pinterp_name=pinterp_name@entry=0x7fffffffc850) at ../linux-user/elfload.c:3412
#4 0x0000555555a08e64 in load_elf_binary (bprm=bprm@entry=0x555555fdc620 <bprm>, info=info@entry=0x555555fdca80 <libafl_image_info>) at ../linux-user/elfload.c:3868
#5 0x0000555555a0b3ab in loader_exec (fdexec=fdexec@entry=3, filename=<optimized out>, argv=argv@entry=0x555555ffa9b0, envp=envp@entry=0x55555605c860, regs=regs@entry=0x7fffffffcca0, infop=infop@entry=0x555555fdca80 <libafl_image_info>, bprm=<optimized out>) at ../linux-user/linuxload.c:163
#6 0x0000555555a0c877 in qemu_user_init (argc=6, argv=0x555555ff2ca0, envp=<optimized out>) at ../linux-user/main.c:1007
#7 0x000055555562a8dd in libafl_qemu::qemu::Qemu::init (args=..., env=...) at /src/qemu/mod.rs:557
#8 libafl_qemu::modules::usermode::asan::init_qemu_with_asan (args=0x7fffffffd1e0, env=...) at /src/modules/usermode/asan.rs:719
#9 android_fuzzer::main () at src/main.rs:236
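The numeric arguments in these traces are easier to read once decoded. The sketch below assumes the usual Linux flag values (MAP_SHARED 0x01, MAP_PRIVATE 0x02, MAP_FIXED 0x10, MAP_ANONYMOUS 0x20, MAP_NORESERVE 0x4000); `decode_mmap_flags` is a hypothetical helper written for this illustration.

```rust
// Linux mmap flag values (assumed; shared by x86_64 and aarch64).
const MAP_SHARED: u32 = 0x01;
const MAP_PRIVATE: u32 = 0x02;
const MAP_FIXED: u32 = 0x10;
const MAP_ANONYMOUS: u32 = 0x20;
const MAP_NORESERVE: u32 = 0x4000;

/// Return the names of the known flag bits set in `flags`.
fn decode_mmap_flags(flags: u32) -> Vec<&'static str> {
    let table = [
        (MAP_SHARED, "MAP_SHARED"),
        (MAP_PRIVATE, "MAP_PRIVATE"),
        (MAP_FIXED, "MAP_FIXED"),
        (MAP_ANONYMOUS, "MAP_ANONYMOUS"),
        (MAP_NORESERVE, "MAP_NORESERVE"),
    ];
    let mut names = Vec::new();
    for &(bit, name) in table.iter() {
        if flags & bit != 0 {
            names.push(name);
        }
    }
    names
}

fn main() {
    // flags=16418 and start=187649984475136 are taken from the backtrace.
    println!("flags 16418 = {:#x} = {:?}", 16418, decode_mmap_flags(16418));
    println!("start = {:#x}", 187_649_984_475_136_u64);
}
```

Under these assumptions the request is a private anonymous mapping without MAP_FIXED, so the start address is only a hint that the kernel may or may not honor. That would be consistent with the two machines returning different addresses for the same request.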
I know that mmap may behave differently on different systems, though I know little about the details. What happened on my machine shows that the address 0xaaaaaaaab000 can be allocated successfully. In that case qasan's unpoison algorithm does not seem to work, because the shadow address computed from the right shift is not mapped and cannot be dereferenced.
I tried this code on two machines and got different results.
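#include <stdio.h>
#include <sys/mman.h>
int main(void){
    /* Same arguments as the QEMU call: prot=0 (PROT_NONE), flags=16418 (0x4022). */
    void* ptr = mmap((void*)0xaaaaaaaab000, 16384, PROT_NONE, 16418, -1, 0);
    if (ptr == MAP_FAILED) { perror("mmap"); return 1; }
    printf("%p\n", ptr);
    return 0;
}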
On the server I previously used to run the android fuzzer, it prints 0xaaaaaaaab000. On another one, it prints 0x7f165e17b000. I think this is the key reason for this issue.
Do you think the libafl implementation should take this case into account? If not, I will close this issue.
it makes sense to me that you get the segfault at least since shadow memory is designed to work with memory in the [0x10007fff8000, 0x7fffffffffff] range (for high addresses), and you get mapped above the max address.
not sure exactly why you get mapped so high in memory compared to others, your environment looks pretty standard.
we can fix it by adding another memory range, but ideally we should determine why it happens imho.
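To make the range argument concrete, here is a minimal sketch of a guarded shadow lookup. It only models the high-memory range quoted above and the shadow offset from the report; `checked_shadow_addr` is a hypothetical name, the low-memory range is left out, and this is not how LibAFL implements the check.

```rust
/// Shadow offset quoted in the report.
const SHADOW_OFFSET: u64 = 0x7fff_8000;
/// High application memory range quoted in the thread.
const HIGH_MEM_START: u64 = 0x1000_7fff_8000;
const HIGH_MEM_END: u64 = 0x7fff_ffff_ffff;

/// Return the shadow address only for host addresses inside the modelled
/// high-memory range; anything else is reported as uncovered.
fn checked_shadow_addr(host: u64) -> Option<u64> {
    if (HIGH_MEM_START..=HIGH_MEM_END).contains(&host) {
        Some((host >> 3) + SHADOW_OFFSET)
    } else {
        None
    }
}

fn main() {
    // Addresses taken from the x86_64 run and the aarch64 crash.
    for host in [0x7fff_f5b0_02a0_u64, 0xaaaa_aaaa_f010_u64] {
        match checked_shadow_addr(host) {
            Some(shadow) => println!("{:#x} -> shadow {:#x}", host, shadow),
            None => println!("{:#x} -> outside the modelled range", host),
        }
    }
}
```

With these constants the x86_64 address gets a shadow byte, while the 0xaaaa... address falls above the range and gets none, which matches the observed segfault.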
Yes, I have tried to figure out why my server is able to map such a high address, but I haven't found the reason yet. I have tried 3 machines, and only this larger one exhibits the behavior. I have also asked the person who administers this server. If I get any useful information, I will share it here as soon as possible.
ok thanks. i tried to check online for this address (0xaaaaaaaaa000) but nothing interesting so far.