redisraft icon indicating copy to clipboard operation
redisraft copied to clipboard

Fuzzing redisraft crash when polling client connections

Open ds-testing-user opened this issue 2 years ago • 0 comments

Following directions received on the main redis repo. Opening an issue here since we could not replicate the issue without the raft module - https://github.com/redis/redis/issues/12679

The bug occurs when closing a client connection on a 3 node redis cluster. The trace is below

logs for node: 3

----- Stdout -----
1189251:C 17 Oct 2023 18:54:37.144 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1189251:C 17 Oct 2023 18:54:37.144 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1189251:C 17 Oct 2023 18:54:37.144 * Redis version=255.255.255, bits=64, commit=e04ebdb8, modified=1, pid=1189251, just started
1189251:C 17 Oct 2023 18:54:37.144 * Configuration loaded
1189251:M 17 Oct 2023 18:54:37.144 * monotonic clock: POSIX clock_gettime
1189251:M 17 Oct 2023 18:54:37.145 * Running mode=standalone, port=5083.
1189251:M 17 Oct 2023 18:54:37.149 * <raft> RedisRaft version 255.255.255 [c259f3d2]
1189251:M 17 Oct 2023 18:54:37.153 * <raft> Initializing test network
1189251:M 17 Oct 2023 18:54:37.153 * <raft> Created test network client
1189251:M 17 Oct 2023 18:54:37.158 * <raft> Raft module loaded, state is 'loading'
1189251:M 17 Oct 2023 18:54:37.158 * Module 'raft' loaded from /Fuzzing/redisraft-fuzzing/redisraft.so
1189251:M 17 Oct 2023 18:54:37.158 * Server initialized
1189251:M 17 Oct 2023 18:54:37.158 . The AOF directory appendonlydir doesn't exist
1189251:M 17 Oct 2023 18:54:37.158 * Ready to accept connections tcp
1189251:M 17 Oct 2023 18:54:37.158 . 0 clients connected (0 replicas), 1746688 bytes in use
1189251:M 17 Oct 2023 18:54:37.229 - Accepted 127.0.0.1:46798
1189251:M 17 Oct 2023 18:54:37.229 # Missing implement of connection type tls
1189251:M 17 Oct 2023 18:54:37.237 - Accepted 127.0.0.1:46804
1189251:M 17 Oct 2023 18:54:37.258 * <raft> Loading: Redis loading complete, snapshot NOT LOADED
1189251:M 17 Oct 2023 18:54:37.258 * <raft> Cluster Membership: term:0 index:0 nodes: id=3,voting=0,active=1,addr=-
1189251:M 17 Oct 2023 18:54:37.259 * <raft> Test network starting message polling thread
1189251:M 17 Oct 2023 18:54:37.259 * <raft> Raft state after loading log: log_count=6, first_idx=1, current_idx=6, last_applied_idx=0
1189251:M 17 Oct 2023 18:54:37.259 . <raft> Raft term=1, vote=-1


=== REDIS BUG REPORT START: Cut & paste starting from here ===
1189251:M 17 Oct 2023 18:54:37.259 # Redis 255.255.255 crashed by signal: 11, si_code: 1
1189251:M 17 Oct 2023 18:54:37.259 # Accessing address: 0x670000000000
1189251:M 17 Oct 2023 18:54:37.259 # Crashed running the instruction at: 0x562def40b8c8

------ STACK TRACE ------
EIP:
/Fuzzing/redis/src/redis-server 0.0.0.0:5083(clientsCronResizeQueryBuffer+0x8)[0x562def40b8c8]

Backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x13140)[0x7f91683e8140]
/Fuzzing/redis/src/redis-server 0.0.0.0:5083(clientsCronResizeQueryBuffer+0x8)[0x562def40b8c8]
/Fuzzing/redis/src/redis-server 0.0.0.0:5083(clientsCron+0x16c)[0x562def40c10c]
/Fuzzing/redis/src/redis-server 0.0.0.0:5083(serverCron+0x445)[0x562def40cc65]
/Fuzzing/redis/src/redis-server 0.0.0.0:5083(+0x81705)[0x562def3ff705]
/Fuzzing/redis/src/redis-server 0.0.0.0:5083(aeMain+0x1c8)[0x562def409cb8]
/Fuzzing/redis/src/redis-server 0.0.0.0:5083(main+0x3cd)[0x562def3fee2d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x7f9168224d0a]
/Fuzzing/redis/src/redis-server 0.0.0.0:5083(_start+0x2a)[0x562def3ff5aa]

------ REGISTERS ------
1189251:M 17 Oct 2023 18:54:37.260 # 
RAX:0000000000000000 RBX:00007f91664c8130
RCX:0000670000000001 RDX:0000000000000000
RDI:00007f91664c8130 RSI:0000000000000002
RBP:0000018b3e8f8dcb RSP:00007ffd6d71dad0
R8 :0000000000000380 R9 :0000000000000000
R10:0000000000000000 R11:0000000000000000
R12:00000000652ebc4d R13:0000000000000005
R14:0000562def6e8aa0 R15:00007f91664c8130
RIP:0000562def40b8c8 EFL:0000000000010246
CSGSFS:002b000000000033
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dadf) -> 0000000000000001
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dade) -> 00007ffd6d71db90
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dadd) -> 0000562def40cc65
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dadc) -> 000006ffa0a05700
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dadb) -> 000000000000000a
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dada) -> 0000000000000001
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad9) -> 0000000000000000
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad8) -> 0000000000000042
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad7) -> 0000562def6e8aa0
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad6) -> 000006ffa0a05700
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad5) -> 000000000003f53f
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad4) -> 00000000652ebc4d
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad3) -> 000000006d71db18
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad2) -> 0000000000001000
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad1) -> 0000562def40c10c
1189251:M 17 Oct 2023 18:54:37.260 # (00007ffd6d71dad0) -> 0000000000000001

------ INFO OUTPUT ------

----- Stderr -----




logs for node: 1

----- Stdout -----
1189468:C 17 Oct 2023 18:54:37.420 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1189468:C 17 Oct 2023 18:54:37.420 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1189468:C 17 Oct 2023 18:54:37.420 * Redis version=255.255.255, bits=64, commit=e04ebdb8, modified=1, pid=1189468, just started
1189468:C 17 Oct 2023 18:54:37.420 * Configuration loaded
1189468:M 17 Oct 2023 18:54:37.421 * monotonic clock: POSIX clock_gettime
1189468:M 17 Oct 2023 18:54:37.421 * Running mode=standalone, port=5081.
1189468:M 17 Oct 2023 18:54:37.425 * <raft> RedisRaft version 255.255.255 [c259f3d2]
1189468:M 17 Oct 2023 18:54:37.429 * <raft> Initializing test network
1189468:M 17 Oct 2023 18:54:37.430 * <raft> Created test network client
1189468:M 17 Oct 2023 18:54:37.434 * <raft> Raft module loaded, state is 'loading'
1189468:M 17 Oct 2023 18:54:37.434 * Module 'raft' loaded from /Fuzzing/redisraft-fuzzing/redisraft.so
1189468:M 17 Oct 2023 18:54:37.434 * Server initialized
1189468:M 17 Oct 2023 18:54:37.434 . The AOF directory appendonlydir doesn't exist
1189468:M 17 Oct 2023 18:54:37.434 * Ready to accept connections tcp
1189468:M 17 Oct 2023 18:54:37.434 . 0 clients connected (0 replicas), 1746688 bytes in use

----- Stderr -----




logs for node: 2

----- Stdout -----
1188305:C 17 Oct 2023 18:54:35.799 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1188305:C 17 Oct 2023 18:54:35.799 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1188305:C 17 Oct 2023 18:54:35.799 * Redis version=255.255.255, bits=64, commit=e04ebdb8, modified=1, pid=1188305, just started
1188305:C 17 Oct 2023 18:54:35.799 * Configuration loaded
1188305:M 17 Oct 2023 18:54:35.799 * monotonic clock: POSIX clock_gettime
1188305:M 17 Oct 2023 18:54:35.800 * Running mode=standalone, port=5082.
1188305:M 17 Oct 2023 18:54:35.804 * <raft> RedisRaft version 255.255.255 [c259f3d2]
1188305:M 17 Oct 2023 18:54:35.808 * <raft> Initializing test network
1188305:M 17 Oct 2023 18:54:35.809 * <raft> Created test network client
1188305:M 17 Oct 2023 18:54:35.819 * <raft> Raft module loaded, state is 'loading'
1188305:M 17 Oct 2023 18:54:35.819 * Module 'raft' loaded from /Fuzzing/redisraft-fuzzing/redisraft.so
1188305:M 17 Oct 2023 18:54:35.820 * Server initialized
1188305:M 17 Oct 2023 18:54:35.820 . The AOF directory appendonlydir doesn't exist
1188305:M 17 Oct 2023 18:54:35.820 * Ready to accept connections tcp
1188305:M 17 Oct 2023 18:54:35.820 . 0 clients connected (0 replicas), 1746656 bytes in use
1188305:M 17 Oct 2023 18:54:35.832 - Accepted 127.0.0.1:46372
1188305:M 17 Oct 2023 18:54:35.832 # Missing implement of connection type tls
1188305:M 17 Oct 2023 18:54:35.920 * <raft> Loading: Redis loading complete, snapshot NOT LOADED
1188305:M 17 Oct 2023 18:54:35.920 * <raft> Cluster Membership: term:0 index:0 nodes: id=2,voting=0,active=1,addr=-
1188305:M 17 Oct 2023 18:54:35.921 * <raft> Test network starting message polling thread
1188305:M 17 Oct 2023 18:54:35.921 * <raft> Cluster Membership: term:0 index:4 nodes: id=2,voting=0,active=1,addr=- id=1,voting=1,active=1,addr=localhost:5081
1188305:M 17 Oct 2023 18:54:35.922 * <raft> Cluster Membership: term:0 index:4 nodes: id=2,voting=0,active=1,addr=- id=1,voting=1,active=1,addr=localhost:5081 id=3,voting=0,active=1,addr=localhost:5083
1188305:M 17 Oct 2023 18:54:35.922 * <raft> Raft state after loading log: log_count=4, first_idx=1, current_idx=4, last_applied_idx=0
1188305:M 17 Oct 2023 18:54:35.922 . <raft> Raft term=1, vote=-1
1188305:M 17 Oct 2023 18:54:36.825 * <raft> State change: Election starting, node is now a pre-candidate, term 1
1188305:M 17 Oct 2023 18:54:36.825 * <raft> Cluster Membership: term:1 index:6 nodes: id=2,voting=0,active=1,addr=- id=1,voting=1,active=1,addr=localhost:5081 id=3,voting=1,active=1,addr=localhost:5083
1188305:M 17 Oct 2023 18:54:36.835 - Client closed connection id=8 addr=127.0.0.1:46372 laddr=127.0.0.1:5082 fd=11 name= age=1 idle=1 flags=N db=0 sub=0 psub=0 ssub=0 multi=-1 qbuf=0 qbuf-free=20474 argv-mem=0 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=0 omem=0 tot-mem=22400 events=r cmd=auth user=default redir=-1 resp=2 lib-name= lib-ver=
1188305:M 17 Oct 2023 18:54:37.137 - Accepted 127.0.0.1:46376
1188305:M 17 Oct 2023 18:54:37.228 * <raft> State change: Election starting, node is now a pre-candidate, term 1
1188305:M 17 Oct 2023 18:54:37.228 * <raft> Cluster Membership: term:1 index:6 nodes: id=2,voting=0,active=1,addr=- id=1,voting=1,active=1,addr=localhost:5081 id=3,voting=1,active=1,addr=localhost:5083
1188305:M 17 Oct 2023 18:54:37.387 - Client closed connection id=9 addr=127.0.0.1:46376 laddr=127.0.0.1:5082 fd=12 name= age=0 idle=0 flags=N db=0 sub=0 psub=0 ssub=0 multi=-1 qbuf=0 qbuf-free=20474 argv-mem=0 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=0 omem=0 tot-mem=22400 events=r cmd=auth user=default redir=-1 resp=2 lib-name= lib-ver=

----- Stderr -----

We have been able to replicate the issue on different machines

ds-testing-user avatar Oct 20 '23 09:10 ds-testing-user