[CRASH] Immediately after adding active instances in the multi master KeyDB cluster the MASTER <-> REPLICA sync loading in memory failing

Open dipanjang opened this issue 3 years ago • 1 comments

Crash report

Paste the complete crash log between the quotes below. Please include a few lines from the log preceding the crash report to provide some context.

=== KEYDB BUG REPORT START: Cut & paste starting from here ===
36757:36770:S 29 Jun 2022 20:20:36.338 # KeyDB 6.3.1 crashed by signal: 11, si_code: 1
36757:36770:S 29 Jun 2022 20:20:36.338 # Accessing address: 0xffffffffffffffff
36757:36770:S 29 Jun 2022 20:20:36.338 # Crashed running the instruction at: 0x513dd0

------ STACK TRACE ------
EIP:
src/keydb-server *:6451(sdscatlen+0x10) [0x513dd0]

Backtrace:
/lib64/libpthread.so.0(+0xf630) [0x7f7e4c54f630]
src/keydb-server *:6451(sdscatlen+0x10) [0x513dd0]
src/keydb-server *:6451(replicaReplayCommand(client*)+0x59c) [0x4e1c3c]
src/keydb-server *:6451(call(client*, int)+0xa1) [0x4974b1]
src/keydb-server *:6451(processCommand(client*, int)+0x904) [0x4985c4]
src/keydb-server *:6451(processCommandAndResetClient(client*, int)+0x65) [0x4cb2e5]
src/keydb-server *:6451(processInputBuffer(client*, bool, int)+0x174) [0x4d0ea4]
src/keydb-server *:6451(replicaReplayCommand(client*)+0x60c) [0x4e1cac]
src/keydb-server *:6451(call(client*, int)+0xa1) [0x4974b1]
src/keydb-server *:6451(processCommand(client*, int)+0x904) [0x4985c4]
src/keydb-server *:6451(processCommandAndResetClient(client*, int)+0x65) [0x4cb2e5]
src/keydb-server *:6451(processInputBuffer(client*, bool, int)+0x174) [0x4d0ea4]
src/keydb-server *:6451(processClients()+0xc9) [0x4d1029]
src/keydb-server *:6451() [0x490aa2]
src/keydb-server *:6451(beforeSleep(aeEventLoop*)+0x13e) [0x45789e]
src/keydb-server *:6451(aeProcessEvents+0xe8) [0x4509f8]
src/keydb-server *:6451(aeMain+0x37) [0x458f27]
src/keydb-server *:6451(workerThreadMain(void*)+0x73) [0x4915e3]
/lib64/libpthread.so.0(+0x7ea5) [0x7f7e4c547ea5]
/lib64/libc.so.6(clone+0x6d) [0x7f7e4c2709fd]

------ REGISTERS ------
36757:36770:S 29 Jun 2022 20:20:36.339 # 
RAX:0000000000000002 RBX:00007f7e2f481cc0
RCX:0000000000000000 RDX:0000000000000118
RDI:0000000000000000 RSI:00007f7e1b0b7005
RBP:00007f7e2f944700 RSP:00007f7e38bfbf40
R8 :0000000000000000 R9 :0000000000001000
R10:0000000000000010 R11:0000000000000000
R12:0000000000000118 R13:00007f7e1b0b7005
R14:0000000000000002 R15:00007f7e2f94899c
RIP:0000000000513dd0 EFL:0000000000010202
CSGSFS:0000000000000033
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf4f) -> 0000000000000013
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf4e) -> 00007f7e2f481cc0
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf4d) -> 00007f7e2f481d80
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf4c) -> 00007f7e2f481cc0
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf4b) -> 181aff1239200083
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf4a) -> 00007f7e36800f48
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf49) -> 00007f7e2f481d80
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf48) -> 00007f7e208bc101
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf47) -> 00007f7e36800f48
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf46) -> 00007f7e2f481cc0
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf45) -> 00000000004e1c3c
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf44) -> 0000000000000000
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf43) -> 00007f7e3687b0c8
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf42) -> 00007f7e2f944700
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf41) -> 00007f7e2f481cc0
36757:36770:S 29 Jun 2022 20:20:36.339 # (00007f7e38bfbf40) -> 00007f7e2f481cc0

------ INFO OUTPUT ------
# Server
redis_version:6.3.1
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:f73b1d3a7c06fcff
redis_mode:standalone
os:Linux 3.10.0-1160.31.1.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:7.3.1
process_id:36757
process_supervised:no
run_id:89e232dcde5794da2f171e980fd2b1b067e7e004
tcp_port:6451
server_time_usec:1656514236339024
uptime_in_seconds:949
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:12347068
executable:/npci/redis/KeyDB-6.3.1-6451/src/keydb-server
config_file:/npci/redis/KeyDB-6.3.1-6451/./keydb.conf

# Clients
connected_clients:22
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:14240
client_recent_max_output_buffer:167832265
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
current_client_thread:0
thread_0_clients:16
thread_1_clients:10

# Memory
used_memory:320811320
used_memory_human:305.95M
used_memory_rss:350347264
used_memory_rss_human:334.12M
used_memory_peak:341718944
used_memory_peak_human:325.89M
used_memory_peak_perc:93.88%
used_memory_overhead:790697093
used_memory_startup:2095232
used_memory_dataset:18446744073239665843
used_memory_dataset_perc:5787829665792.00%
allocator_allocated:321529736
allocator_active:336027648
allocator_resident:347021312
total_system_memory:33717776384
total_system_memory_human:31.40G
used_memory_lua:36864
used_memory_lua_human:36.00K
used_memory_scripts:264
used_memory_scripts_human:264B
number_of_cached_scripts:1
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.05
allocator_frag_bytes:14497912
allocator_rss_ratio:1.03
allocator_rss_bytes:10993664
rss_overhead_ratio:1.01
rss_overhead_bytes:3325952
mem_fragmentation_ratio:1.09
mem_fragmentation_bytes:29496824
mem_not_counted_for_evict:268435456
mem_replication_backlog:268435456
mem_clients_slaves:494489343
mem_clients_normal:451502
mem_aof_buffer:0
mem_allocator:jemalloc-5.2.1
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
storage_provider:none

# Persistence
loading:0
current_cow_size:358035456
current_cow_size_age:19
current_fork_perc:0.00
current_save_keys_processed:286721
current_save_keys_total:0
rdb_changes_since_last_save:436921
rdb_bgsave_in_progress:0
rdb_last_save_time:1656514217
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:1
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:358162432
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0

# Stats
total_connections_received:1893
total_commands_processed:42224543
instantaneous_ops_per_sec:66578
total_net_input_bytes:7802421274
total_net_output_bytes:8520561660
instantaneous_input_kbps:24833.63
instantaneous_output_kbps:67699.41
rejected_connections:0
sync_full:23
sync_partial_ok:0
sync_partial_err:20
expired_keys:61
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:11708
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
total_forks:0
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:216051
total_error_replies:227693
dump_payload_sanitizations:0
total_reads_processed:1041707
total_writes_processed:642702
instantaneous_lock_contention:1
avg_lock_contention:0.375000
storage_provider_read_hits:0
storage_provider_read_misses:0

# Replication
role:active-replica
master_global_link_status:down
connected_masters:2
master_host:10.20.39.9
master_port:6451
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_read_repl_offset:6349164799
slave_repl_offset:6349164799
master_link_down_since_seconds:0
master_1_host:10.20.39.10
master_1_port:6451
master_1_link_status:down
master_1_last_io_seconds_ago:-1
master_1_sync_in_progress:0
slave_read_repl_offset:5543703933
slave_repl_offset:5543703933
master_1_link_down_since_seconds:0
master_2_host:10.21.18.200
master_2_port:6451
master_2_link_status:up
master_2_last_io_seconds_ago:0
master_2_sync_in_progress:0
slave_read_repl_offset:3800448233
slave_repl_offset:3800448137
master_3_host:10.21.18.201
master_3_port:6451
master_3_link_status:down
master_3_last_io_seconds_ago:-1
master_3_sync_in_progress:0
slave_read_repl_offset:-1
slave_repl_offset:-1
master_3_link_down_since_seconds:0
master_4_host:10.21.18.202
master_4_port:6451
master_4_link_status:up
master_4_last_io_seconds_ago:0
master_4_sync_in_progress:0
slave_read_repl_offset:3532549988
slave_repl_offset:3532534568
slave_priority:100
slave_read_only:0
replica_announced:1
connected_slaves:4
slave0:ip=10.20.39.10,port=6451,state=online,offset=5281844708,lag=2
slave1:ip=10.21.18.202,port=6451,state=online,offset=5270035428,lag=1
slave2:ip=10.21.18.200,port=6451,state=online,offset=5255882828,lag=2
slave3:ip=10.21.18.201,port=6451,state=online,offset=5330595941,lag=1
master_failover_state:no-failover
master_replid:e5e0f0df683a45c33030fdd497efc161831fb1f6
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:5473660805
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:268435456
repl_backlog_first_byte_offset:5205225350
repl_backlog_histlen:268435456

# CPU
used_cpu_sys:44.442939
used_cpu_user:265.858285
used_cpu_sys_children:0.000000
used_cpu_user_children:0.000000
server_threads:2
long_lock_waits:365
used_cpu_sys_main_thread:31.280235
used_cpu_user_main_thread:193.669989

# Modules

# Commandstats
cmdstat_setex:calls=24902,usec=270999,usec_per_call=10.88,rejected_calls=733,failed_calls=0
cmdstat_replconf:calls=1856,usec=2286,usec_per_call=1.23,rejected_calls=59,failed_calls=0
cmdstat_get:calls=11708,usec=83448,usec_per_call=7.13,rejected_calls=0,failed_calls=0
cmdstat_rreplay:calls=33713728,usec=110971925,usec_per_call=3.29,rejected_calls=0,failed_calls=108025
cmdstat_replping:calls=0,usec=0,usec_per_call=0.00,rejected_calls=30,failed_calls=0
cmdstat_publish:calls=0,usec=0,usec_per_call=0.00,rejected_calls=1844,failed_calls=0
cmdstat_client:calls=0,usec=0,usec_per_call=0.00,rejected_calls=1831,failed_calls=0
cmdstat_psync:calls=23,usec=19123,usec_per_call=831.43,rejected_calls=7,failed_calls=0
cmdstat_multi:calls=99968,usec=18247,usec_per_call=0.18,rejected_calls=0,failed_calls=0
cmdstat_KEYDB.MVCCRESTORE:calls=8113128,usec=26850822,usec_per_call=3.31,rejected_calls=107833,failed_calls=0
cmdstat_set:calls=145174,usec=1109808,usec_per_call=7.64,rejected_calls=190,failed_calls=0
cmdstat_exec:calls=99968,usec=779190,usec_per_call=7.79,rejected_calls=0,failed_calls=23
cmdstat_ping:calls=477,usec=138,usec_per_call=0.29,rejected_calls=1634,failed_calls=0
cmdstat_info:calls=12,usec=3297,usec_per_call=274.75,rejected_calls=1844,failed_calls=0
cmdstat_evalsha:calls=11708,usec=554184,usec_per_call=47.33,rejected_calls=0,failed_calls=0
cmdstat_subscribe:calls=0,usec=0,usec_per_call=0.00,rejected_calls=1829,failed_calls=0
cmdstat_auth:calls=1891,usec=47454,usec_per_call=25.09,rejected_calls=0,failed_calls=1831

# Errorstats
errorstat_ERR:count=108005
errorstat_EXECABORT:count=23
errorstat_LOADING:count=108763
errorstat_NOAUTH:count=9071
errorstat_WRONGPASS:count=1831

# Cluster
cluster_enabled:0

# Keyspace
db0:keys=299978,expires=299978,avg_ttl=171625672,cached_keys=299978

# KeyDB
mvcc_depth:0

------ CLIENT LIST OUTPUT ------
id=8 addr=10.20.39.2:33574 laddr=10.20.39.8:6451 fd=22 name= age=948 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20512 events=r cmd=publish user=default redir=-1
id=40 addr=10.20.39.3:44324 laddr=10.20.39.8:6451 fd=29 name= age=939 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20520 events=r cmd=setex user=default redir=-1
id=42 addr=10.20.39.2:34066 laddr=10.20.39.8:6451 fd=30 name= age=939 idle=2 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20520 events=r cmd=setex user=default redir=-1
id=1936 addr=10.21.18.201:46399 laddr=10.20.39.8:6451 fd=82 name= age=19 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=121588681 tot-mem=121609193 events=rw cmd=replconf user=default redir=-1
id=1984 addr=10.21.18.202:6451 laddr=10.20.39.8:28070 fd=0 name= age=0 idle=0 flags=M db=0 sub=0 psub=0 multi=-1 qbuf=17 qbuf-free=40937 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61480 events=r cmd=rreplay user=(superuser) redir=-1
id=1964 addr=10.21.18.200:6451 laddr=10.20.39.8:24170 fd=84 name= age=9 idle=0 flags=M db=0 sub=0 psub=0 multi=-1 qbuf=30 qbuf-free=40924 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61480 events=r cmd=rreplay user=(superuser) redir=-1
id=43 addr=10.20.39.2:34068 laddr=10.20.39.8:6451 fd=31 name= age=939 idle=939 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=auth user=default redir=-1
id=44 addr=10.20.39.3:44330 laddr=10.20.39.8:6451 fd=32 name= age=939 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61472 events=r cmd=setex user=default redir=-1
id=45 addr=10.20.39.2:34072 laddr=10.20.39.8:6451 fd=34 name= age=939 idle=939 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=auth user=default redir=-1
id=46 addr=10.20.39.2:34074 laddr=10.20.39.8:6451 fd=35 name= age=939 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61472 events=r cmd=setex user=default redir=-1
id=47 addr=10.20.39.2:34071 laddr=10.20.39.8:6451 fd=36 name= age=939 idle=939 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=auth user=default redir=-1
id=48 addr=10.20.39.2:34070 laddr=10.20.39.8:6451 fd=33 name= age=939 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61472 events=r cmd=setex user=default redir=-1
id=49 addr=10.20.39.3:44332 laddr=10.20.39.8:6451 fd=39 name= age=939 idle=939 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=auth user=default redir=-1
id=50 addr=10.20.39.4:34158 laddr=10.20.39.8:6451 fd=43 name= age=939 idle=939 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=auth user=default redir=-1
id=51 addr=10.20.39.4:34166 laddr=10.20.39.8:6451 fd=37 name= age=939 idle=939 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=auth user=default redir=-1
id=52 addr=10.20.39.3:44334 laddr=10.20.39.8:6451 fd=44 name= age=939 idle=939 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=auth user=default redir=-1
id=53 addr=10.20.39.4:34162 laddr=10.20.39.8:6451 fd=45 name= age=939 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20520 events=r cmd=setex user=default redir=-1
id=54 addr=10.20.39.3:44336 laddr=10.20.39.8:6451 fd=38 name= age=939 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20520 events=r cmd=setex user=default redir=-1
id=55 addr=10.20.39.4:34164 laddr=10.20.39.8:6451 fd=40 name= age=939 idle=939 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=auth user=default redir=-1
id=56 addr=10.20.39.4:34159 laddr=10.20.39.8:6451 fd=41 name= age=939 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20520 events=r cmd=setex user=default redir=-1
id=1905 addr=10.21.18.202:11710 laddr=10.20.39.8:6451 fd=75 name= age=30 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=179482908 tot-mem=179544372 events=rw cmd=replconf user=default redir=-1
id=1857 addr=10.20.39.10:31084 laddr=10.20.39.8:6451 fd=74 name= age=49 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=186242028 tot-mem=186303492 events=rw cmd=replconf user=default redir=-1
id=57 addr=10.20.39.3:44338 laddr=10.20.39.8:6451 fd=42 name= age=939 idle=939 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=auth user=default redir=-1
id=58 addr=10.20.39.4:34168 laddr=10.20.39.8:6451 fd=46 name= age=939 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61472 events=r cmd=setex user=default redir=-1
id=1933 addr=10.21.18.200:43680 laddr=10.20.39.8:6451 fd=78 name= age=20 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=205733786 tot-mem=205795250 events=rw cmd=replconf user=default redir=-1
id=7 addr=10.20.39.4:33648 laddr=10.20.39.8:6451 fd=21 name= age=948 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=61464 events=r cmd=publish user=default redir=-1

------ CURRENT CLIENT INFO ------
id=1450 addr=?:0 laddr=?:0 fd=-1 name= age=227 idle=227 flags=M db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=634 argv-mem=343 obl=0 oll=0 omem=0 tot-mem=21503 events= cmd=rreplay user=(superuser) redir=-1
argv[0]: 'RREPLAY'
argv[1]: 'e75b77e5-54ca-4e6d-b5bd-17091a652356'
argv[2]: '*5
$7
RREPLAY
$36
b875070c-feb0-4968-b5c3-c9e3605813ea
$179
*5
$17
KEYDB.MVCCRESTORE
$46
UPI_TXN_ID_XXXqmqiJk0a4qTGGICAJ5S5LIoV52LsUZV7
$19
1736980452632166407
$13
1656686445774
$45
'
argv[3]: '0'
argv[4]: '1736981060037640323'

------ MODULES INFO OUTPUT ------

------ FAST MEMORY TEST ------
36757:36770:S 29 Jun 2022 20:20:36.340 # main thread terminated
36757:36770:S 29 Jun 2022 20:20:36.340 # Bio thread for job type #0 terminated
36757:36770:S 29 Jun 2022 20:20:36.340 # Bio thread for job type #1 terminated
36757:36770:S 29 Jun 2022 20:20:36.340 # Bio thread for job type #2 terminated

Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.

------ DUMPING CODE AROUND EIP ------
Symbol: sdscatlen (base: 0x513dc0)
Module: src/keydb-server *:6451 (base 0x400000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x513dc0 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
36757:36770:S 29 Jun 2022 20:20:36.340 # dump of function (hexdump of 144 bytes):
415541544989d455534989f54883ec080fb657ff89d083e0070f85b1000000c0ea030fb6da4c89e6e813efffff4885c04889c5743a488d3c184c89e24c89ee4c01e3e839c3f2ff0fb645ff83e0073c0577180fb6c0ff24c5c8ec5f000f1f40008d04dd000000008845ffc6441d00004883c4084889e85b5d415c415dc30f1f0048895defc6441d0000ebe40f1f440000
Function at 0x512d00 is sdsMakeRoomFor
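
A few numbers in the report above can be cross-checked against each other. The sketch below is only a reading aid, not a diagnosis: it takes the hex dump, the register values, and the INFO fields exactly as pasted, and the decoding of the four bytes at the crash address is done by hand, so treat it as an assumption. If that decoding is right, the faulting instruction reads one byte at `rdi - 1`, and with RDI equal to 0 that wraps to the reported address 0xffffffffffffffff, which would be consistent with `sdscatlen` receiving a NULL string. The same wraparound arithmetic also reproduces the odd `used_memory_dataset` value.

```python
# Hex dump of sdscatlen as pasted in the report (base address 0x513dc0).
DUMP_HEX = (
    "415541544989d455534989f54883ec08"
    "0fb657ff89d083e0070f85b1000000c0"
    "ea030fb6da4c89e6e813efffff4885c0"
    "4889c5743a488d3c184c89e24c89ee4c"
    "01e3e839c3f2ff0fb645ff83e0073c05"
    "77180fb6c0ff24c5c8ec5f000f1f4000"
    "8d04dd000000008845ffc6441d000048"
    "83c4084889e85b5d415c415dc30f1f00"
    "48895defc6441d0000ebe40f1f440000"
)
BASE = 0x513DC0      # "Symbol: sdscatlen (base: 0x513dc0)"
CRASH_IP = 0x513DD0  # "Crashed running the instruction at: 0x513dd0"
RDI = 0x0            # from the REGISTERS section

code = bytes.fromhex(DUMP_HEX)
offset = CRASH_IP - BASE
faulting = code[offset:offset + 4]

# Assumption (hand-decoded): 0f b6 57 ff is `movzx edx, byte ptr [rdi-1]`.
# With RDI = 0 the load address wraps around the 64-bit address space.
fault_addr = (RDI - 1) % 2**64

# The byte at offset 0x28 is 0xe8, a near call with a 32-bit relative target.
call_off = 0x28
rel32 = int.from_bytes(code[call_off + 1:call_off + 5], "little", signed=True)
call_target = BASE + call_off + 5 + rel32

# INFO fields: the dataset figure matches (used_memory - overhead) mod 2**64.
used_memory = 320811320
used_memory_overhead = 790697093
wrapped = (used_memory - used_memory_overhead) % 2**64

print(len(code), hex(offset), faulting.hex())  # 144 0x10 0fb657ff
print(hex(fault_addr))                         # 0xffffffffffffffff
print(hex(call_target))                        # 0x512d00
print(wrapped)                                 # 18446744073239665843
```

The computed call target agrees with the report's own line "Function at 0x512d00 is sdsMakeRoomFor", which gives some confidence that the dump and base address line up. The `used_memory_dataset` figure is then most likely an unsigned underflow (overhead larger than total used memory) rather than a real dataset size.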

Additional information

  1. OS distribution and version
  2. Steps to reproduce (if any)

We started three KeyDB instances (the application partitions the data across them) in a multi-master setup with active replication, and loaded some data through the application. We then started a second, similar set of three instances, each configured with replicaof for all six instances (the former three and the new three), and loaded some data into these three as well. Replication was fine and all the instances were running. We then changed the first three instances to connect to all six instances.

At this point one node of the first set crashed. We reproduced the crash by repeating the same steps; any one of the nodes may fail.
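
For anyone trying to reproduce this, the topology described above can be written down as per-node configuration. The following is a minimal sketch, not the reporter's actual setup: the split of addresses into a first and second set is an assumption inferred from the INFO and CLIENT LIST output, and the helper name `mesh_config` is hypothetical. It emits the `active-replica`, `multi-master`, and `replicaof` directives for one node given the peers it should connect to.

```python
PORT = 6451

# Assumption: grouping inferred from the addresses in the crash report.
FIRST_SET = ["10.20.39.8", "10.20.39.9", "10.20.39.10"]
SECOND_SET = ["10.21.18.200", "10.21.18.201", "10.21.18.202"]


def mesh_config(node, peers, port=PORT):
    """Return keydb.conf lines making `node` an active replica of every peer."""
    lines = ["active-replica yes", "multi-master yes"]
    # A node does not replicate from itself, so skip its own address.
    lines += [f"replicaof {peer} {port}" for peer in peers if peer != node]
    return "\n".join(lines)


# Stage 1: the first set replicates only among itself.
stage1 = mesh_config("10.20.39.8", FIRST_SET)

# Stage 3: the same node is changed to connect to all six instances.
stage3 = mesh_config("10.20.39.8", FIRST_SET + SECOND_SET)

print(stage1)
print("---")
print(stage3)
```

In this reading, the crash followed the move from the stage 1 fragment to the stage 3 fragment on the first set. Whether the change was made in the config file or at runtime is not stated in the report.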

MASTER <-> REPLICA sync: Loading DB in memory
36757:36770:S 29 Jun 2022 20:04:51.108 * Loading RDB produced by version 6.3.1
36757:36770:S 29 Jun 2022 20:04:51.108 * RDB age 2 seconds
36757:36770:S 29 Jun 2022 20:04:51.108 * RDB memory usage when created 32.61 Mb
36757:36770:S 29 Jun 2022 20:04:51.397 # == WARNING == This replica is rejecting a command from its master: '-LOADING KeyDB is loading the dataset in memory' after processing the command 'set'
36757:36770:S 29 Jun 2022 20:04:51.397 # Latest backlog is: '"345530-2ed0-4834-a34e-bbf16ceb5f8a\r\n$179\r\n5\r\n$17\r\nKEYDB.MVCCRESTORE\r\n$46\r\nUPI_TXN_ID_XXX7SOR8YId7G1RqllUOorvqDeTUB5CscWB\r\n$19\r\n1736979893517811715\r\n$13\r\n1656685912562\r\n$45\r\n\x00\x00\xfa\x0bmvcc-tstamp\x131736979893517811715\t\x00\x1a\b\xd3o7\\xaf\xb1\r\n\r\n$1\r\n0\r\n$19\r\n1736980081038852130\r\n"'
36757:36770:S 29 Jun 2022 20:04:51.397 # == CRITICAL == This replica is sending an error to its master: '-EXECABORT Transaction discarded because of previous errors.' after processing the command 'exec'
36757:36770:S 29 Jun 2022 20:04:51.397 # Latest backlog is: '"345530-2ed0-4834-a34e-bbf16ceb5f8a\r\n$179\r\n5\r\n$17\r\nKEYDB.MVCCRESTORE\r\n$46\r\nUPI_TXN_ID_XXX7SOR8YId7G1RqllUOorvqDeTUB5CscWB\r\n$19\r\n1736979893517811715\r\n$13\r\n1656685912562\r\n$45\r\n\x00\x00\xfa\x0bmvcc-tstamp\x131736979893517811715\t\x00\x1a\b\xd3o7\\xaf\xb1\r\n\r\n$1\r\n0\r\n$19\r\n1736980081038852130\r\n"'
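
The backlog in these log lines, like argv[2] in the crash report, is a command in the Redis wire protocol (RESP): an array header `*N`, then N bulk strings each introduced by `$len`. In the crash report one RREPLAY carries another RREPLAY as an argument, which in turn carries a KEYDB.MVCCRESTORE. The sketch below shows one way to unpack such nesting when reading a capture. The function names are hypothetical and the inner `SET` command is a made-up stand-in, because the real payload in the report is truncated.

```python
def encode_command(args):
    """Encode a list of byte strings as a RESP array of bulk strings."""
    out = b"*%d\r\n" % len(args)
    for arg in args:
        out += b"$%d\r\n%s\r\n" % (len(arg), arg)
    return out


def parse_command(buf, pos=0):
    """Parse one RESP array of bulk strings; return (args, next_position)."""
    if buf[pos:pos + 1] != b"*":
        raise ValueError("expected RESP array header")
    end = buf.index(b"\r\n", pos)
    count = int(buf[pos + 1:end])
    pos = end + 2
    args = []
    for _ in range(count):
        if buf[pos:pos + 1] != b"$":
            raise ValueError("expected bulk string header")
        end = buf.index(b"\r\n", pos)
        length = int(buf[pos + 1:end])
        start = end + 2
        # Slice by declared length so binary payloads with CR/LF survive.
        args.append(buf[start:start + length])
        pos = start + length + 2  # skip the payload and its trailing CRLF
    return args, pos


# Hypothetical example shaped like the argv list in the crash report.
inner = encode_command([b"SET", b"hypothetical-key", b"value"])
outer = encode_command([
    b"RREPLAY",
    b"e75b77e5-54ca-4e6d-b5bd-17091a652356",
    inner,
    b"0",
    b"1736981060037640323",
])

args, _ = parse_command(outer)
inner_args, _ = parse_command(args[2])
print(args[0], inner_args)
```

Parsing by declared length rather than by splitting on line breaks matters here, since the MVCCRESTORE payload in the log is binary and begins with NUL bytes.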

dipanjang, Jun 30 '22 17:06

Another point: disk persistence was turned off on all instances with save "".

dipanjang, Jun 30 '22 19:06

Is anyone still experiencing this issue? Knowing that would help us understand how to prioritize it.

msotheeswaran-sc, Mar 10 '23 20:03

Closing, as there has been no response in 30 days.

msotheeswaran-sc, Apr 12 '23 20:04