heap-buffer-overflow in valkey-check-aof
I noticed this leak when running integration/aof tests with ASAN:
==1433==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xffffa8958595 at pc 0xffffac1d649c bp 0xffffc838d120 sp 0xffffc838c8f8
READ of size 3 at 0xffffa8958595 thread T0
#0 0xffffac1d649b (/lib64/libasan.so.4+0x5c49b)
#1 0x62b1b3 in consumeNewline local/redis/src/valkey-check-aof.c:64
#2 0x62b1b3 in readString local/redis/src/valkey-check-aof.c:117
#3 0x62b50f in processRESP local/redis/src/valkey-check-aof.c:144
#4 0x6549df in checkSingleAof local/redis/src/valkey-check-aof.c:276
#5 0x655397 in check...
AOF appendonly.aof.1.incr.aof format error
AOF analyzed: filename=appendonly.aof.1.incr.aof, size=36, ok_up_to=33, ok_up_to_line=8, diff=3
This will shrink the AOF appendonly.aof.1.incr.aof from 36 bytes, with 3 bytes, to 33 bytes
Continue? [y/N]: Successfully truncated AOF appendonly.aof.1.incr.aof
All AOF files and manifest are valid
=================================================================
==1433==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xffffa8958595 at pc 0xffffac1d649c bp 0xffffc838d120 sp 0xffffc838c8f8
READ of size 3 at 0xffffa8958595 thread T0
#0 0xffffac1d649b (/lib64/libasan.so.4+0x5c49b)
#1 0x62b1b3 in consumeNewline local/redis/src/valkey-check-aof.c:64
#2 0x62b1b3 in readString local/redis/src/valkey-check-aof.c:117
#3 0x62b50f in processRESP local/redis/src/valkey-check-aof.c:144
#4 0x6549df in checkSingleAof local/redis/src/valkey-check-aof.c:276
#5 0x655397 in checkMultiPartAof local/redis/src/valkey-check-aof.c:497
#6 0x6556c3 in redis_check_aof_main local/redis/src/valkey-check-aof.c:558
#7 0xb420b3 in redisMain local/redis/src/server.c:9352
#8 0xffffabcf5da3 in __libc_start_main (/lib64/libc.so.6+0x1fda3)
#9 0x5b802f (local/redis/src/valkey-check-aof+0x5b802f)
0xffffa8958595 is located 0 bytes to the right of 5-byte region [0xffffa8958590,0xffffa8958595)
allocated by thread T0 here:
#0 0xffffac255db3 in __interceptor_malloc (/lib64/libasan.so.4+0xdbdb3)
#1 0xb3372f in ztrymalloc_usable_internal.lto_priv.636 local/redis/src/zmalloc.c:158
#2 0xb337bf in zmalloc local/redis/src/zmalloc.c:190
#3 0x62afff in readString local/redis/src/valkey-check-aof.c:111
#4 0x62b50f in processRESP local/redis/src/valkey-check-aof.c:144
#5 0x6549df in checkSingleAof local/redis/src/valkey-check-aof.c:276
#6 0x655397 in checkMultiPartAof local/redis/src/valkey-check-aof.c:497
#7 0x6556c3 in redis_check_aof_main local/redis/src/valkey-check-aof.c:558
#8 0xb420b3 in redisMain local/redis/src/server.c:9352
#9 0xffffabcf5da3 in __libc_start_main (/lib64/libc.so.6+0x1fda3)
#10 0x5b802f (local/redis/src/valkey-check-aof+0x5b802f)
SUMMARY: AddressSanitizer: heap-buffer-overflow (/lib64/libasan.so.4+0x5c49b)
Shadow bytes around the buggy address:
0x200ff512b060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff512b070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff512b080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff512b090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff512b0a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x200ff512b0b0: fa fa[05]fa fa fa fd fa fa fa fd fa fa fa fd fa
0x200ff512b0c0: fa fa fd fd fa fa fd fa fa fa fd fa fa fa fd fa
0x200ff512b0d0: fa fa fd fa fa fa fd fd fa fa fd fa fa fa fd fa
0x200ff512b0e0: fa fa fd fd fa fa fd fd fa fa fd fa fa fa fd fa
0x200ff512b0f0: fa fa fd fa fa fa fd fd fa fa fd fa fa fa fd fa
0x200ff512b100: fa fa 06 fa fa fa 00 fa fa fa 00 00 fa fa fd fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb.
The issue is this line https://github.com/valkey-io/valkey/blob/unstable/src/valkey-check-aof.c#L66, which is reading more than the intended 2 bytes. Although we specify a 2-byte comparison, strncmp can read at register width (typically 4 bytes) for performance optimization, which means it may access memory beyond our allocated buffer. Since we don't null-terminate the buffer until after the consumeNewline function executes on line 122, this over-reading can access unallocated memory addresses and trigger a heap-buffer-overflow. https://stackoverflow.com/questions/38878195/does-this-usage-of-strncmp-contain-an-out-of-bounds-read
This was fixed locally by replacing that line with byte-by-byte comparison rather than using a strncmp.
I can work on this if assigned
Title is incorrect. This isn't a "leak", it's a heap-buffer-overflow
Hi Murad. I found this one interesting and looked into it a bit. While it's clear that there's an ASAN report here, and the changed code looks valid, I'm having difficulty buying into the explanation.
Reviewing the code itself, it's pretty clear that the buffer is allocated and that the comparison is within the allocated space. I don't see any path where this could be incorrect. At least to my eyes, this appears to be a false positive. But why?
From the traceback, we can see that strncmp is being referenced from within libasan.so - so this is a custom version of strncmp which would avoid tricky performance optimizations. This appears to be the proper code reference, https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libsanitizer/sanitizer_common/sanitizer_libc.cpp;h=9318066afed20c088f0ee0ece176a6e5d14a0876;hb=1161fc635450b529159a3804c24d85d94ecb672b#l137
From that code reference, there should be no way that we would access an additional byte, past the 2 bytes - regardless if there is an embedded null or not.
The only thing I can think is that maybe the compiler is seeing the access pattern inside strncmp and generating optimized code that accesses in words - and maybe asan's code instrumentation is catching that?
In any case though, your change is logically correct.
Hi @JimB123 . Thanks a lot for your comment. When I first encountered the issue I assumed it's due to compiler reading the string at register-width, I didn't even notice that strncmp was called from ASAN: #0 0xffffac1d649b (/lib64/libasan.so.4+0x5c49b)
My understanding is that this line in the stack trace doesn’t necessarily mean that strncmp itself was replaced by ASAN — it just indicates that ASAN detected the out-of-bounds access during runtime instrumentation. But i'm not too experienced with c++ or ASAN, so I wanted to clarify whether you are sure that strncmp is being replaced by ASAN's custom function?
I'm certainly not an expert in this area. But everything I'm reading indicates that YES, ASAN will intercept calls to strncmp.
It looks like this file is defining the interceptors: https://github.com/gcc-mirror/gcc/blob/ca23873aed881901a967372e61ee64634d634970/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc#L271