OOM in Linux Mode
Hello, thanks for the great tool!
I am trying to reproduce a bug in libtiff 4.0.4 in Linux mode, but I can't manage to take a proper snapshot of my VM because of an out-of-memory (OOM) error.
Target
I downloaded and compiled libtiff with the following commands:
wget https://download.osgeo.org/libtiff/tiff-4.0.4.tar.gz && \
tar -xzvf tiff-4.0.4.tar.gz && \
rm tiff-4.0.4.tar.gz && \
cd tiff-4.0.4 && \
CC=clang \
CXX=clang++ \
CFLAGS='-ggdb -fsanitize=address' \
CXXFLAGS='-ggdb -fsanitize=address' \
./configure --disable-shared --prefix=$PWD/build && \
make -j $(nproc) && \
make install
I then created the following GDB QEMU script:
import sys, os
# import fuzzing breakpoint
from gdb_fuzzbkpt import *
target_dir = "libtiff"
# address to break on, found using gdb
# break_address = "snapshot_here"
break_address = "TIFFClientOpen"
# name of the file in which to break
file_name = "tiffinfo"
# create the breakpoint for the executable specified
FuzzBkpt(target_dir, break_address, file_name, sym_path=file_name)
Environment
Tested on the main branch, version 0.5.5:
➜ git log --name-status HEAD^..HEAD
commit a231e0a26cee29b0abc51466934f8796f89d2892 (HEAD -> main, tag: v0.5.5, origin/main, origin/HEAD)
Author: Axel Souchet <[email protected]>
Date: Sat May 25 21:26:29 2024 -0700
Update README.md
M README.md
I created two scripts to simplify the snapshotting process:
linux_mode/libtiff/snapshot_client.sh:
#!/usr/bin/env bash
set -euo pipefail
QEMU_SNAPSHOT="../qemu_snapshot"
TARGET_VM="$QEMU_SNAPSHOT/target_vm"
TIFF_DIR="tiff-4.0.4"
# Compile tiffinfo
make -C $TIFF_DIR -j "$(nproc)"
make -C $TIFF_DIR install
TIFFINFO="$TIFF_DIR/build/bin/tiffinfo"
# Copy binary to pwd so GDB can read symbols from it
cp "$TIFFINFO" .
TIFFINFO="$PWD/tiffinfo"
# Copy binary and inputs to target_vm
pushd $TARGET_VM || exit
./scp.sh "$TIFFINFO"
popd || exit
# Run WTF client
$QEMU_SNAPSHOT/gdb_client.sh
linux_mode/libtiff/snapshot_server.sh:
#!/usr/bin/env bash
set -euo pipefail
QEMU_SNAPSHOT="../qemu_snapshot"
# Compile WTF
pushd ../../src/build/ || exit
./build-release.sh
popd || exit
# Run WTF server
$QEMU_SNAPSHOT/gdb_server.sh
Taking the snapshot
Launching the server first and then running our program, we can see the breakpoint on our function being hit:
root@linux:~# ./tiffinfo -D -j -c -r -s -w logluv-3c-16b.tiff
[ 1618.924402] traps: tiffinfo[246] trap int3 ip:5555556c30b9 sp:7fffffffe940 error
Trace/breakpoint trap
But when repeating the operation with the client attached:
root@linux:~# ./tiffinfo -D -j -c -r -s -w logluv-3c-16b.tiff
[ 111.043808] tiffinfo invoked oom-killer: gfp_mask=0x140dca(GFP_HIGHUSER_MOVABLE|
[ 111.044902] CPU: 0 UID: 0 PID: 222 Comm: tiffinfo Not tainted 6.12.0-rc1 #1
[ 111.045527] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16
[ 111.046235] Call Trace:
[ 111.046470] <TASK>
[ 111.046601] dump_stack_lvl+0x53/0x70
[ 111.046866] dump_header+0x4b/0x3a0
[ 111.047157] ? do_try_to_free_pages+0x2aa/0x460
[ 111.047532] ? ___ratelimit+0xa7/0x110
[ 111.047881] oom_kill_process+0x2ee/0x4b0
[ 111.048289] out_of_memory+0xec/0x700
[ 111.048528] __alloc_pages_noprof+0xdfc/0xfb0
[ 111.048786] alloc_pages_mpol_noprof+0x47/0xf0
[ 111.049047] vma_alloc_folio_noprof+0x6c/0xc0
[ 111.049301] __handle_mm_fault+0x75f/0xce0
[ 111.049695] handle_mm_fault+0xc7/0x1f0
[ 111.050023] __get_user_pages+0x20f/0x1010
[ 111.050346] populate_vma_page_range+0x77/0xc0
[ 111.050725] __mm_populate+0xfc/0x190
[ 111.051020] __do_sys_mlockall+0x199/0x1e0
[ 111.051359] do_syscall_64+0x9e/0x1a0
[ 111.051590] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 111.051883] RIP: 0033:0x5555556c30b4
[ 111.052093] Code: 8b 45 ec 50 53 51 52 55 57 56 41 50 41 51 41 52 41 53 41 54 41
[ 111.053341] RSP: 002b:00007fffffffe890 EFLAGS: 00000202 ORIG_RAX: 00000000000000
[ 111.053926] RAX: ffffffffffffffda RBX: 00005555556e42f0 RCX: 00005555556c30b4
[ 111.054331] RDX: 0000000000000003 RSI: 00005555557688c0 RDI: 0000000000000003
[ 111.054945] RBP: 00007fffffffe980 R08: 00005555556e4170 R09: 00005555556e4230
[ 111.055561] R10: 00005555556e4540 R11: 0000000000000202 R12: 0000000000000000
[ 111.056095] R13: 00007fffffffec80 R14: 00005555557c3170 R15: 00007ffff7ffd020
[ 111.056591] </TASK>
[ 111.056773] Mem-Info:
[ 111.056913] active_anon:44 inactive_anon:7900 isolated_anon:0
[ 111.056913] active_file:13 inactive_file:11 isolated_file:0
[ 111.056913] unevictable:489214 dirty:8 writeback:0
[ 111.056913] slab_reclaimable:1172 slab_unreclaimable:3721
[ 111.056913] mapped:743 shmem:86 pagetables:1261
[ 111.056913] sec_pagetables:0 bounce:0
[ 111.056913] kernel_misc_reclaimable:0
[ 111.056913] free:1193 free_pcp:583 free_cma:0
[ 111.059726] Node 0 active_anon:176kB inactive_anon:31600kB active_file:52kB inac
[ 111.061417] Node 0 DMA free:0kB boost:0kB min:40kB low:52kB high:64kB reserved_h
[ 111.063324] lowmem_reserve[]: 0 1958 0 0
[ 111.063710] Node 0 DMA32 free:4772kB boost:0kB min:5640kB low:7644kB high:9648kB
[ 111.065838] lowmem_reserve[]: 0 0 0 0
[ 111.066119] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
[ 111.066729] Node 0 DMA32: 80*4kB (UME) 98*8kB (UE) 27*16kB (UME) 15*32kB (UE) 3*
[ 111.067832] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages
[ 111.068520] 837 total pagecache pages
[ 111.068738] 0 pages in swap cache
[ 111.068934] Free swap = 0kB
[ 111.069119] Total swap = 0kB
[ 111.069294] 524158 pages RAM
[ 111.069549] 0 pages HighMem/MovableOnly
[ 111.069906] 17929 pages reserved
[ 111.070142] Tasks state (memory values in pages):
[ 111.070429] [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem
[ 111.071062] [ 80] 0 80 8234 640 256 383 1
[ 111.071964] [ 103] 0 103 10134 2997 2690 307 0
[ 111.072754] [ 139] 0 139 1005 325 32 293 0
[ 111.073534] [ 160] 0 160 1435 464 192 272 0
[ 111.074374] [ 189] 0 189 723 291 0 291 0
[ 111.075200] [ 190] 0 190 723 291 0 291 0
[ 111.076269] [ 191] 0 191 723 291 0 291 0
[ 111.077214] [ 192] 0 192 723 291 0 291 0
[ 111.077934] [ 193] 0 193 723 291 0 291 0
[ 111.078545] [ 194] 0 194 723 291 0 291 0
[ 111.079183] [ 195] 0 195 1194 407 96 311 0
[ 111.080069] [ 196] 0 196 3857 656 320 336 0
[ 111.080872] [ 212] 0 212 1152 417 128 289 0
[ 111.081682] [ 222] 0 222 5368723193 489212 488480 732
[ 111.082382] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_al
[ 111.083392] Out of memory: Killed process 222 (tiffinfo) total-vm:21474892772kB,
[ 111.088055] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc'
[ 111.089002] clocksource: 'kvm-clock' wd_nsec: 503935302 wd
[ 111.089844] clocksource: 'tsc' cs_nsec: 699662209 cs_now:
[ 111.090637] clocksource: Clocksource 'tsc' skewed 19572690
[ 111.091889] clocksource: 'kvm-clock' (not 'tsc') is curren
[ 111.092586] tsc: Marking TSC unstable due to clocksource watchdog
Killed
I tried increasing the VM's allocated memory, but the OOM just takes longer to appear. The client just hangs:
➜ ./snapshot_client.sh
...
~/dev/wtf/linux_mode/qemu_snapshot/target_vm ~/dev/wtf/linux_mode/libtiff
tiffinfo 100% 3599KB 69.1MB/s 00:00
~/dev/wtf/linux_mode/libtiff
Reading symbols from ../qemu_snapshot/target_vm/linux/vmlinux...
Remote debugging using localhost:1234
native_irq_disable () at ./arch/x86/include/asm/irqflags.h:37
37 asm volatile("cli": : :"memory");
add symbol table from file "tiffinfo" at
.text_addr = 0x5555555974d0
Removing 'regs.json' file if it exists...
Hardware assisted breakpoint 1 at 0x5555556c30d0: file tif_open.c, line 93.
Using '/home/abel/dev/wtf/targets/libtiff' as target directory
mkdir '/home/abel/dev/wtf/targets/libtiff'
mkdir '/home/abel/dev/wtf/targets/libtiff/crashes'
mkdir '/home/abel/dev/wtf/targets/libtiff/inputs'
mkdir '/home/abel/dev/wtf/targets/libtiff/outputs'
mkdir '/home/abel/dev/wtf/targets/libtiff/state'
Continuing.
In right process? True
Calling mlockall
Saving 67 bytes at 0x5555556c308d
This may be related to my binary, but on my host the same command runs without any problem. I have no idea how to debug this issue; could you maybe guide me? Let me know if I forgot some context/details.
Thanks !
First of all, thank you for trying out the tool and for filing a very detailed issue 🥳
Tagging @jasocrow (one of the co-authors of the Linux mode) in case you've seen this before / you get what's going on.
I'll take a look in the next few days - hopefully we can figure out what's going on :)
Cheers
In the case where you don't increase the VM memory, the OOM gets triggered before the breakpoint gets hit? Also, this bit of output seems potentially interesting; do you know where @rip is pointing to?
[ 111.051883] RIP: 0033:0x5555556c30b4
[ 111.052093] Code: 8b 45 ec 50 53 51 52 55 57 56 41 50 41 51 41 52 41 53 41 54 41
In the next log, it looks like you do hit the breakpoint, but it seems to hang after the mlockall which is done by https://github.com/0vercl0k/wtf/blob/main/linux_mode/qemu_snapshot/gdb_fuzzbkpt.py#L436. What happens here is that code is injected to call mlockall right before where your breakpoint is set; when this executes, you land on your breakpoint a second time, at which point the shellcode is removed / the original bytes restored.
It looks like mlockall doesn't complete or something weird is happening 🤔
I guess one thing you can try is to manually add a mlockall call in your C++ target right before your breakpoint is executed and see what happens when you launch it in the VM. Maybe you can even place your breakpoint on a function that will NOT be executed, just so that you can check whether the syscall also hangs in that context.
Does this make sense?
Cheers
Thank you for your time. It seems that the OOM gets triggered before the breakpoint:
➜ ./snapshot_client.sh
...
Hardware assisted breakpoint 1 at 0x5555556c30d0: file tif_open.c, line 93.
...
So our breakpoint is at 0x5555556c30d0 and the RIP at the crash is 0x5555556c30b4.
0x5555556c30b4 - 0x5555556c30d0 = -0x1c = -28
Looking at the process mappings and the symbol mapping:
(gdb) info proc mappings
process 219
Mapped address spaces:
Start Addr End Addr Size Offset Perms objfile
0x555555554000 0x555555597000 0x43000 0x0 r--p /root/tiffinfo <---- In this segment
0x555555597000 0x555555752000 0x1bb000 0x43000 r-xp /root/tiffinfo
0x555555752000 0x5555557c3000 0x71000 0x1fe000 r--p /root/tiffinfo
0x5555557c3000 0x5555557dd000 0x1a000 0x26f000 rw-p /root/tiffinfo
0x5555557dd000 0x555556132000 0x955000 0x0 rw-p
0x7ffff7fc5000 0x7ffff7fc9000 0x4000 0x0 r--p [vvar]
0x7ffff7fc9000 0x7ffff7fcb000 0x2000 0x0 r-xp [vdso]
0x7ffff7fcb000 0x7ffff7fcc000 0x1000 0x0 r--p /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
0x7ffff7fcc000 0x7ffff7ff1000 0x25000 0x1000 r-xp /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
0x7ffff7ff1000 0x7ffff7ffb000 0xa000 0x26000 r--p /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
0x7ffff7ffb000 0x7ffff7fff000 0x4000 0x30000 rw-p /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
0x7ffffffde000 0x7ffffffff000 0x21000 0x0 rw-p [stack]
0xffffffffff600000 0xffffffffff601000 0x1000 0x0 --xp [vsyscall]
(gdb) info symbol 0x5555556c30b4
_TIFFgetMode + 356 in section .text of /root/tiffinfo
This is called in the TIFFClientOpen function:
TIFF *
TIFFClientOpen(
const char *name, const char *mode,
thandle_t clientdata,
TIFFReadWriteProc readproc,
TIFFReadWriteProc writeproc,
TIFFSeekProc seekproc,
TIFFCloseProc closeproc,
TIFFSizeProc sizeproc,
TIFFMapFileProc mapproc,
TIFFUnmapFileProc unmapproc)
{
static const char module[] = "TIFFClientOpen";
TIFF *tif;
int m;
const char *cp;
...
m = _TIFFgetMode(mode, module); <----- HERE
if (m == -1)
goto bad2;
tif = (TIFF *)_TIFFmalloc((tmsize_t)(sizeof(TIFF) + strlen(name) + 1));
if (tif == NULL)
{
TIFFErrorExt(clientdata, module, "%s: Out of memory (TIFF structure)", name);
goto bad2;
}
...
The _TIFFgetMode function looks like the following, and there is no memory allocation:
int _TIFFgetMode(const char *mode, const char *module)
{
int m = -1;
switch (mode[0])
{
case 'r':
m = O_RDONLY;
if (mode[1] == '+')
m = O_RDWR;
break;
case 'w':
case 'a':
m = O_RDWR | O_CREAT;
if (mode[0] == 'w')
m |= O_TRUNC;
break;
default:
TIFFErrorExt(0, module, "\"%s\": Bad mode", mode);
break;
}
return (m);
}
Calling mlockall
I added the following to my target and changed the breakpoint to something that will not run.
printf("Calling mlockall\n");
mlockall(3);
With both the server and the client running, I ran:
root@linux:~# ./tiffinfo -D -j -c -r -s -w logluv-3c-16b.tiff
Calling mlockall
TIFF Directory at offset 0x10 (16)
...
Nothing happened on either end. I used the same arguments as in qemu_snapshot/gdb_fuzzbkpt.py.
All right - it'll probably be easier if I make a repro environment to experiment a bit / see if I run into the same issue. Will follow your instructions.
Cheers
Okay, I've successfully set up an environment and I'm able to see what you're seeing - thanks again for the detailed instructions!
Will update this issue when / once I know more.
Cheers
I've spent a bit of time on this but I still don't understand what the issue is. I've seen the same thing as you; somehow, artificially inserting a mlockall call & running tiffinfo w/o the client attached doesn't trigger the OOM 🤔
Here are a few other things I've done:
- In gdb_fuzzbkpt.py I've also changed the `self.mlock = False` to `self.mlock = True` in the ctor and I can confirm that the rest executes as expected:
In right process? True
In the QEMU tab, press Ctrl+C, run the `cpu` command
Detected cpu registers dumped to 'regs.json'
Connecting to Qemu monitor at localhost:55555
Connected
Instructing Qemu to dump physical memory into 'raw'
Done
Converting raw file 'raw' to dump file '/home/over/wtf/targets/libtiff/state/mem.dmp'
Done
mv 'regs.json' '/home/over/wtf/targets/libtiff/state/regs.json'
mv 'symbol-store.json' '/home/over/wtf/targets/libtiff/state/symbol-store.json'
Snapshotting complete
Breakpoint 1, TIFFClientOpen (name=<error reading variable: Cannot access memory at address 0x7ffffffff018>, mode=<error reading variable: Cannot access memo
- I also thought it could be because of ASAN's (large / sparse?) shadow memory, but I still trigger the same issue w/o instrumenting the binary with ASAN.
At this point I wonder if there isn't a bug in the shellcode that gets injected into the target; I'm going to spend a bit more time looking into that.
Cheers
Thanks for the detailed report. I haven't come across this issue before. I'll plan to take a look at this soon.
Yesterday I tried inserting an int3 right after the syscall is done, but it doesn't seem to return. Actually, in the OOM call stack we can see that it triggers while calling mlockall 🤦🏽♂️:
[ 111.046235] Call Trace:
[ 111.046470] <TASK>
[ 111.046601] dump_stack_lvl+0x53/0x70
[ 111.046866] dump_header+0x4b/0x3a0
[ 111.047157] ? do_try_to_free_pages+0x2aa/0x460
[ 111.047532] ? ___ratelimit+0xa7/0x110
[ 111.047881] oom_kill_process+0x2ee/0x4b0
[ 111.048289] out_of_memory+0xec/0x700
[ 111.048528] __alloc_pages_noprof+0xdfc/0xfb0
[ 111.048786] alloc_pages_mpol_noprof+0x47/0xf0
[ 111.049047] vma_alloc_folio_noprof+0x6c/0xc0
[ 111.049301] __handle_mm_fault+0x75f/0xce0
[ 111.049695] handle_mm_fault+0xc7/0x1f0
[ 111.050023] __get_user_pages+0x20f/0x1010
[ 111.050346] populate_vma_page_range+0x77/0xc0
[ 111.050725] __mm_populate+0xfc/0x190
[ 111.051020] __do_sys_mlockall+0x199/0x1e0
So at this point, it seems unlikely that the shellcode injection is to blame... back to the drawing board, I think.
Cheers
Okay, so after checking the disassembly of the mlockall(3) call compiled into tiffinfo, I realized ASAN intercepts it and basically turns it into a no-op (at least it would appear so on my build):
// Linux kernel has a bug that leads to kernel deadlock if a process
// maps TBs of memory and then calls mlock().
static void MlockIsUnsupported() {
static atomic_uint8_t printed;
if (atomic_exchange(&printed, 1, memory_order_relaxed))
return;
VPrintf(1, "%s ignores mlock/mlockall/munlock/munlockall\n",
SanitizerToolName);
}
INTERCEPTOR(int, mlock, const void *addr, uptr len) {
MlockIsUnsupported();
return 0;
}
😅
Do you see the same?
(gdb) disass mlockall
Dump of assembler code for function mlockall:
0x0000000000096b10 <+0>: push rax
0x0000000000096b11 <+1>: mov al,0x1
0x0000000000096b13 <+3>: xchg BYTE PTR [rip+0x1ed7bf],al # 0x2842d8 <MlockIsUnsupported()::printed>
0x0000000000096b19 <+9>: test al,al
0x0000000000096b1b <+11>: je 0x96b21 <mlockall+17>
0x0000000000096b1d <+13>: xor eax,eax
0x0000000000096b1f <+15>: pop rcx
0x0000000000096b20 <+16>: ret
0x0000000000096b21 <+17>: lea rax,[rip+0x27dad0] # 0x3145f8 <__sanitizer::current_verbosity>
0x0000000000096b28 <+24>: cmp DWORD PTR [rax],0x0
0x0000000000096b2b <+27>: je 0x96b1d <mlockall+13>
0x0000000000096b2d <+29>: lea rax,[rip+0x1c4714] # 0x25b248 <__sanitizer::SanitizerToolName>
0x0000000000096b34 <+36>: mov rsi,QWORD PTR [rax]
0x0000000000096b37 <+39>: lea rdi,[rip+0x15a7c7] # 0x1f1305
0x0000000000096b3e <+46>: xor eax,eax
0x0000000000096b40 <+48>: call 0xe0590 <__sanitizer::Printf(char const*, ...)>
0x0000000000096b45 <+53>: xor eax,eax
0x0000000000096b47 <+55>: pop rcx
0x0000000000096b48 <+56>: ret
If I insert a raw syscall instead..:
TIFFClientOpen(
const char* name, const char* mode,
thandle_t clientdata,
TIFFReadWriteProc readproc,
TIFFReadWriteProc writeproc,
TIFFSeekProc seekproc,
TIFFCloseProc closeproc,
TIFFSizeProc sizeproc,
TIFFMapFileProc mapproc,
TIFFUnmapFileProc unmapproc
) {
printf("Calling mlockall!\n");
// mlockall(3);
syscall(0x97, 3);
printf("Done\n");
..I get consistent behavior with gdb attached or not in the QEMU environment (YAY):
root@linux:~# ./tiffinfo -D -j -c -r -s -w logluv-3c-16b.tiff
Calling mlockall!
[ 1131.810780] tiffinfo invoked oom-killer: gfp_mask=0x140dca(GFP_HIGHUSER_MOVABLE|__GFP_COMP|__GFP_ZERO), order=0, oom_score_adj=0
[ 1131.813550] CPU: 0 UID: 0 PID: 305 Comm: tiffinfo Not tainted 6.12.0 #1
[ 1131.815577] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 1131.817966] Call Trace:
[ 1131.818425] <TASK>
[ 1131.819045] dump_stack_lvl+0x55/0x70
[ 1131.820101] dump_header+0x4b/0x3a0
[ 1131.821132] ? get_page_from_freelist+0x6b6/0x1000
[ 1131.822398] ? ___ratelimit+0x9b/0x110
[ 1131.823516] oom_kill_process+0x2f3/0x4c0
[ 1131.824582] out_of_memory+0xed/0x6b0
[ 1131.825448] __alloc_pages_noprof+0xe66/0x1010
[ 1131.826217] alloc_pages_mpol_noprof+0x43/0xf0
[ 1131.827085] vma_alloc_folio_noprof+0x61/0xc0
[ 1131.828062] __handle_mm_fault+0x687/0xcd0
[ 1131.829145] handle_mm_fault+0xca/0x220
[ 1131.829924] __get_user_pages+0x273/0xfa0
[ 1131.830822] populate_vma_page_range+0x78/0xc0
[ 1131.831946] __mm_populate+0x75/0x1a0
[ 1131.833369] __do_sys_mlockall+0x1a2/0x1e0
[ 1131.834793] do_syscall_64+0x9e/0x1a0
[ 1131.835702] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 1131.837144] RIP: 0033:0x7ffff7dc1799
[ 1131.838046] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 08
[ 1131.843240] RSP: 002b:00007fffffffdf98 EFLAGS: 00000202 ORIG_RAX: 0000000000000097
[ 1131.845040] RAX: ffffffffffffffda RBX: 00007fffffffe000 RCX: 00007ffff7dc1799
[ 1131.846519] RDX: 0000000000000000 RSI: 00007ffff7cbf080 RDI: 0000000000000003
[ 1131.848630] RBP: 00007fffffffe7f0 R08: 0000000000000003 R09: 0000000041b58ab3
[ 1131.850573] R10: 0000000000000010 R11: 0000000000000202 R12: 0000000000000000
[ 1131.852079] R13: 00007fffffffec80 R14: 00005555556d24f0 R15: 00007ffff7ffd020
[ 1131.853475] </TASK>
[ 1131.854074] Mem-Info:
[ 1131.854683] active_anon:44 inactive_anon:7906 isolated_anon:0
[ 1131.854683] active_file:11 inactive_file:11 isolated_file:0
[ 1131.854683] unevictable:489405 dirty:0 writeback:1
[ 1131.854683] slab_reclaimable:1174 slab_unreclaimable:3673
[ 1131.854683] mapped:897 shmem:86 pagetables:1288
[ 1131.854683] sec_pagetables:0 bounce:0
[ 1131.854683] kernel_misc_reclaimable:0
[ 1131.854683] free:1350 free_pcp:239 free_cma:0
[ 1131.863596] Node 0 active_anon:176kB inactive_anon:31624kB active_file:44kB inactive_file:44kB unevictable:1957620kB isolated(anon):0kB isolated(file):0ks
[ 1131.869982] Node 0 DMA free:0kB boost:0kB min:40kB low:52kB high:64kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_B
[ 1131.877514] lowmem_reserve[]: 0 1958 0 0
[ 1131.878352] Node 0 DMA32 free:5400kB boost:0kB min:5640kB low:7644kB high:9648kB reserved_highatomic:0KB active_anon:176kB inactive_anon:31624kB active_fB
[ 1131.885478] lowmem_reserve[]: 0 0 0 0
[ 1131.886489] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[ 1131.889327] Node 0 DMA32: 144*4kB (UE) 45*8kB (UE) 7*16kB (UME) 10*32kB (UM) 5*64kB (U) 5*128kB (UM) 0*256kB 0*512kB 1*1024kB (M) 1*2048kB (M) 0*4096kB =B
[ 1131.893612] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 1131.895580] 1008 total pagecache pages
[ 1131.896235] 0 pages in swap cache
[ 1131.896764] Free swap = 0kB
[ 1131.897566] Total swap = 0kB
[ 1131.898463] 524158 pages RAM
[ 1131.899329] 0 pages HighMem/MovableOnly
[ 1131.900431] 17929 pages reserved
[ 1131.901335] Tasks state (memory values in pages):
[ 1131.902794] [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[ 1131.906516] [ 81] 0 81 8233 686 256 429 1 81920 0 -250 systemd-journal
[ 1131.910169] [ 104] 0 104 10136 3037 2690 347 0 90112 0 -1000 systemd-udevd
[ 1131.913168] [ 141] 0 141 1005 388 32 356 0 32768 0 0 cron
[ 1131.916016] [ 163] 0 163 1435 495 192 303 0 40960 0 0 dhclient
[ 1131.918817] [ 192] 0 192 723 322 0 322 0 32768 0 0 agetty
[ 1131.921616] [ 193] 0 193 723 322 0 322 0 32768 0 0 agetty
[ 1131.923956] [ 194] 0 194 723 322 0 322 0 32768 0 0 agetty
[ 1131.926595] [ 195] 0 195 723 322 0 322 0 32768 0 0 agetty
[ 1131.928819] [ 196] 0 196 723 322 0 322 0 32768 0 0 agetty
[ 1131.931176] [ 197] 0 197 723 322 0 322 0 32768 0 0 agetty
[ 1131.933971] [ 198] 0 198 1169 470 96 374 0 36864 0 0 login
[ 1131.936607] [ 199] 0 199 3857 669 320 349 0 57344 0 -1000 sshd
[ 1131.939964] [ 215] 0 215 1152 448 128 320 0 36864 0 0 bash
[ 1131.942525] [ 305] 0 305 5368720269 489376 488480 896 0 4259840 0 0 tiffinfo
[ 1131.944545] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,task=tiffinfo,pid=305,uid=0
[ 1131.947315] Out of memory: Killed process 305 (tiffinfo) total-vm:21474881076kB, anon-rss:1953920kB, file-rss:3584kB, shmem-rss:0kB, UID:0 pgtables:4160k0
Killed
So at least now we have the same behavior w/ & w/o gdb, which makes more sense; making progress!
@jasocrow have you tried running the Linux mode on ASAN instrumented binaries by any chance?
Cheers
Just curious @jasocrow, how important is it to call mlockall inside the process? I don't believe the VM image uses any swap, and I suspect that Address Sanitizer allocates a fair bit of memory for its shadow pages that is never expected to physically reside anywhere, which is likely the main reason why they intercept all mlock-related calls (since mlocking TBs of memory will most certainly OOM the machine).
I created a quick example using LLVM's address sanitizer in clang:
#include <stdio.h>
#include <stdlib.h>
void do_crash_test(char* input) {
if (input[0] == 'e') {
if (input[1] == 's') {
if (input[2] == 'c') {
*(char*)NULL = 1;
}
}
}
if (input[0] == '1') {
if (input[1] == '2') {
if (input[2] == '3') {
char* x = malloc(11);
x[12] = 1;
}
}
}
}
void end_crash_test() { printf("End crash test.\n"); }
int main(int argc, char* argv[]) {
char* buf = NULL;
size_t cbBuf = 10;
ssize_t cbRead = 0;
buf = (char*)calloc(1, cbBuf);
if (!buf) {
printf("calloc failed.\n");
goto END;
}
printf("Enter some input.\n");
cbRead = getline(&buf, &cbBuf, stdin);
if (-1 == cbRead) {
perror("getline failure: ");
goto END;
}
do_crash_test(buf);
end_crash_test();
END:
if (buf) {
free(buf);
}
}
compiled using:
root@553688c88245:~/wtf/linux_mode/crash_test# clang -v
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Candidate multilib: .;@m64
Selected multilib: .;@m64
root@553688c88245:~/wtf/linux_mode/crash_test# clang -fsanitize=address -g test.c
test.c:12:17: warning: indirection of non-volatile null pointer will be deleted, not trap [-Wnull-dereference]
*(char*)NULL = 1;
^~~~~~~~~~~~
test.c:12:17: note: consider using __builtin_trap() or qualifying pointer with 'volatile'
1 warning generated.
root@553688c88245:~/wtf/linux_mode/crash_test#
Running that code above through the gdb scripts, calling mlockall does indeed OOM the machine.
However, when I remove the mlockall code from stop() inside gdb_fuzzbkpt.py:
#if not self.mlock:
#    self.call_mlockall()
#    self.mlock = True
#    return False
#self.restore_orig_bytes()
The snapshot passes and the fuzzer finds both the null deref and OOB-write:
root@553688c88245:~/wtf/targets/linux_crash_test/outputs# xxd crash-dbb2ce9ff9c616ed4fbf9808d238a7f0
00000000: 3132 3333 3333 610a 123333a.
root@553688c88245:~/wtf/targets/linux_crash_test/outputs# xxd crash-5a9080d654dcc4b6d9111d25a01b4258
00000000: 6573 6373 7373 0a escsss.
root@553688c88245:~/wtf/targets/linux_crash_test/outputs#
I also thought it could be because of ASAN because of the (large / sparse?) shadow memory but I still trigger the same issue w/o instrumenting the binary with ASAN.
I'm not sure how much non-committed (over-committed?) memory libtiff allocates, but if it's also unreasonably large to force into physical memory via mlock then I'd expect you'd likely see the same issue? idk. I'm also keen on getting ASAN to work reliably here :)
¯\_(ツ)_/¯
best, defparam
Yeah, I'm kind of wondering if we shouldn't just disable mlock when ASAN is detected 🤔
I personally haven't played too much with the Linux stuff so I'd leave it to the people that have used it more to know if it's the right call.
Thanks for sharing your experiments @defparam 🙏🏽
Cheers
Sorry for the late response. I haven't tried fuzzing using ASAN.
mlock has been necessary for the fuzzing I've done. Without calling mlock, we were seeing page faults during fuzzing that wtf couldn't handle, since wtf does not have a disk emulator.
Quick strategic thought on this; when I have time I may test the idea. Instead of mlockall, is it possible to write a function that traverses all process memory, locking only the pages that are non-sparse? If possible, that should in theory fix the OOM AND allow ASAN with all real data in RAM.
Lemme know if you give this a try - I am curious to hear about the results & how you achieved that :)
Cheers