cluster failover fails with "replication unset master fail on node-ERR:16,msg:Lock wait timeout"
Description
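The cluster has three masters and three slaves, and each shard stores about 20 GB of data. Running cluster failover [force|takeover] on a slave fails to complete the manual failover.
1) Raising the lockdbxwaittimeout parameter, for example to 500, makes the switchover succeed. Is this parameter related to the amount of stored data?
2) When the data set is smaller than the GB level, the switchover succeeds with the default parameters.
3) Even with lockdbxwaittimeout raised, and with no offset lag between master and slave, the whole switchover takes about 5 minutes.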
Expected Behavior
Manual failover completes quickly and successfully.
Current Behavior
Manual failover fails with "Manual failover timed out".
Possible Solution
Increasing the lockdbxwaittimeout parameter makes the switchover succeed.
Steps to Reproduce (for bugs)
Context
Your Environment
- Operating System and version: CentOS Linux release 7.3.1611 (Core)
- Machine Specifications: 16-core CPU, 64 GB RAM, 3.6 TB SSD
- Tendis Version: 2.3.4
- Tendis Configuration: lockdbxwaittimeout 180; lockwaittimeout 3600
- IO/Network used:
- Link to your project:
Logs:
E0615 18:40:12.742552 22832 cluster_manager.cpp:1530] Manual failover user request accepted.
E0615 18:40:12.805608 22857 cluster_manager.cpp:5005] Received replication offset for paused master manual failover: 57021795 57021795
E0615 18:40:12.833029 22858 cluster_manager.cpp:2194] All master replication stream processed, manual failover can start.
E0615 18:40:12.833102 22858 cluster_manager.cpp:2787] Start of election delayed for 0 milliseconds (rank #0, offset 57021795).
E0615 18:40:12.933202 22858 cluster_manager.cpp:2174] mf_can_start is non-zero
E0615 18:40:12.933250 22858 cluster_manager.cpp:2840] Starting a failover election for epoch 14.
E0615 18:40:13.033845 22858 cluster_manager.cpp:2174] mf_can_start is non-zero
E0615 18:40:13.033880 22858 cluster_manager.cpp:2855] Failover election won: I'm the new master.
E0615 18:40:13.033890 22858 cluster_manager.cpp:2859] configEpoch set to 14 after successful failover
E0615 18:43:13.034068 22858 cluster_manager.cpp:2655] replication unset master fail on node-ERR:16,msg:Lock wait timeout
E0615 18:43:13.034097 22858 cluster_manager.cpp:4302] replace master fail-ERR:16,msg:Lock wait timeout
E0615 18:43:13.037346 22833 server_entry.cpp:325] sessid:4085335 cmd:binlog_heartbeat 0 1623753614002, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.039944 22832 server_entry.cpp:325] sessid:4085338 cmd:binlog_heartbeat 4 1623753614002, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.040509 22830 server_entry.cpp:325] sessid:4085340 cmd:binlog_heartbeat 7 1623753614002, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.041263 22831 server_entry.cpp:325] sessid:4085342 cmd:binlog_heartbeat 8 1623753614002, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.041405 22833 repl.cpp:494] applyRepllog failed,mode:0 err:
E0615 18:43:13.041426 22831 repl.cpp:494] applyRepllog failed,mode:0 err:
E0615 18:43:13.041512 22831 server_entry.cpp:325] sessid:4085334 cmd:applybinlogsv2 1 [78856] 180 0, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.041530 22833 server_entry.cpp:325] sessid:4085343 cmd:applybinlogsv2 9 [106521] 235 0, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.041640 22833 repl.cpp:494] applyRepllog failed,mode:0 err:
E0615 18:43:13.041667 22833 server_entry.cpp:325] sessid:4085341 cmd:applybinlogsv2 6 [94700] 212 0, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.041685 22831 repl.cpp:494] applyRepllog failed,mode:0 err:
E0615 18:43:13.041800 22831 server_entry.cpp:325] sessid:4085336 cmd:applybinlogsv2 3 [108351] 241 0, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.041828 22833 repl.cpp:494] applyRepllog failed,mode:0 err:
E0615 18:43:13.041867 22833 server_entry.cpp:325] sessid:4085337 cmd:applybinlogsv2 2 [116055] 256 0, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.041966 22830 repl.cpp:494] applyRepllog failed,mode:0 err:
E0615 18:43:13.042052 22830 server_entry.cpp:325] sessid:4085339 cmd:applybinlogsv2 5 [115562] 256 0, error:-ERR:6,msg:sessionId not match
E0615 18:43:13.134183 22858 cluster_manager.cpp:2127] Manual failover timed out.
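The failure log shows how long the promotion waited before giving up: take the time between "Failover election won" and the first "Lock wait timeout". The following is a minimal sketch, assuming the glog-style prefix seen in these logs (level letter plus MMDD, then HH:MM:SS.microseconds); the helper name `glog_time` is hypothetical and not part of Tendis.

```python
from datetime import datetime


def glog_time(line: str) -> datetime:
    """Parse the time-of-day field of a glog-style line, e.g. 'E0615 18:40:13.033880 ...'."""
    # Field 0 is the level letter plus MMDD ('E0615'); field 1 is HH:MM:SS.micros.
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


# Two lines copied verbatim from the failure log.
won = ("E0615 18:40:13.033880 22858 cluster_manager.cpp:2855] "
       "Failover election won: I'm the new master.")
timeout = ("E0615 18:43:13.034068 22858 cluster_manager.cpp:2655] "
           "replication unset master fail on node-ERR:16,msg:Lock wait timeout")

# Elapsed time between winning the election and the lock wait timing out.
gap_seconds = (glog_time(timeout) - glog_time(won)).total_seconds()
print(round(gap_seconds, 3))
```

The gap is almost exactly 180 seconds, which matches the lockdbxwaittimeout 180 listed under Tendis Configuration. This suggests the new master waited the full lock timeout before the failover was abandoned.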
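Hello,
The default lockdbxwaittimeout is 1. If this timeout is reported, it means that while the slave was being promoted to master, the binlog it was applying could not finish within 1 s, so the lock wait timed out. With the parameter raised to 500 the switchover succeeds, which indicates the apply then had enough time.
In addition, this timeout may indicate that the slave was applying a fairly large binlog at that moment, or that the slave was responding slowly.
Please provide your current configuration parameters; you can get them with config get * and info all.
Also, please provide the slave's logs from a successful switchover after adjusting lockdbxwaittimeout.
What operations was the master running at that time? In particular, check:
- whether there are big keys
- whether there are operations with O(N) time complexity; for details see http://tendis.cn/#/Tendisplus/%E6%95%B4%E4%BD%93%E4%BB%8B%E7%BB%8D/redis%E5%85%BC%E5%AE%B9%E6%80%A7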
info all output
# Server
redis_version:2.3.4-rocksdb-v5.13.4
redis_git_sha1:552a4365
redis_git_dirty:23
redis_build_id:4869811118804139172
redis_mode:cluster
TENDIS_DEBUG:OFF
os:Linux 3.10.0-514.21.1.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:asio
gcc_version:5:5:0
process_id:16293
tcp_port:10005
uptime_in_seconds:582532
uptime_in_days:6
config_file:/home/deploy/tendis/tendis_10005/tendis.conf
# Clients
connected_clients:23
# Memory
used_memory:-1
used_memory_human:-1
used_memory_rss:16907182080
used_memory_rss_human:16510920kB
used_memory_peak:-1
used_memory_peak_human:-1
total_system_memory:-1
total_system_memory_human:-1
used_memory_lua:-1
used_memory_vir:28040253440
used_memory_vir_human:27383060kB
used_memory_vir_peak_human:27383060kB
used_memory_rss_peak_human:17585584kB
# Persistence
loading:-1
rdb_changes_since_last_save:-1
rdb_bgsave_in_progress:-1
rdb_last_save_time:-1
rdb_last_bgsave_status:-1
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
aof_enabled:-1
aof_rewrite_in_progress:-1
aof_rewrite_scheduled:-1
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:-1
aof_last_write_status:-1
# Stats
total_connections_received:11677
total_connections_released:11666
total_commands_processed:81234099
instantaneous_ops_per_sec:9
total_commands_cost(ns):10819726677725
total_commands_workpool_queue_cost(ns):1334726646753
total_commands_workpool_execute_cost(ns):7684238724562
total_commands_send_packet_cost(ns):1800761306410
total_commands_execute_cost(ns):7189228875731
avg_commands_cost(ns):133191
avg_commands_workpool_queue_cost(ns):16430
avg_commands_workpool_execute_cost(ns):94593
avg_commands_send_packet_cost(ns):22167
avg_commands_execute_cost(ns):88500
commands_in_queue:1
commands_executed_in_workpool:164361732
total_stricky_packets:11259684
total_invalid_packets:0
total_net_input_bytes:31662599202
total_net_output_bytes:914338978
instantaneous_input_kbps:0.493164
instantaneous_output_kbps:0.705078
rejected_connections:0
sync_full:10
sync_partial_ok:120
sync_partial_err:0
keyspace_hits:32054000
keyspace_misses:28001674
keyspace_wrong_versionep:0
scheduleNum:11677
# Replication
role:slave
master_host:xxxxxxxx
master_port:xxxxxxxx
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:57265181
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:57265181
rocksdb0_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=0,state=online,fullsync_succ_times=1,binlog_pos=5719621,lag=0
rocksdb1_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=1,state=online,fullsync_succ_times=1,binlog_pos=5725145,lag=0
rocksdb2_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=2,state=online,fullsync_succ_times=1,binlog_pos=5745738,lag=0
rocksdb3_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=3,state=online,fullsync_succ_times=1,binlog_pos=5724495,lag=0
rocksdb4_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=4,state=online,fullsync_succ_times=1,binlog_pos=5733199,lag=0
rocksdb5_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=5,state=online,fullsync_succ_times=1,binlog_pos=5720659,lag=0
rocksdb6_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=6,state=online,fullsync_succ_times=1,binlog_pos=5729625,lag=0
rocksdb7_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=7,state=online,fullsync_succ_times=1,binlog_pos=5720845,lag=0
rocksdb8_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=8,state=online,fullsync_succ_times=1,binlog_pos=5725794,lag=0
rocksdb9_master:ip=xxxxxxxx,port=xxxxxxxx,src_store_id=9,state=online,fullsync_succ_times=1,binlog_pos=5720060,lag=0
# BinlogInfo
rocksdb0:min=5307091,save=5307091,BLWM=5719621,BHWM=5719622,remain=412530
rocksdb1:min=5204131,save=5204131,BLWM=5725145,BHWM=5725146,remain=521014
rocksdb2:min=5365631,save=5365631,BLWM=5745738,BHWM=5745739,remain=380107
rocksdb3:min=5305151,save=5305151,BLWM=5724495,BHWM=5724496,remain=419344
rocksdb4:min=5244361,save=5244361,BLWM=5733199,BHWM=5733200,remain=488838
rocksdb5:min=5366591,save=5366591,BLWM=5720659,BHWM=5720660,remain=354068
rocksdb6:min=5304261,save=5304261,BLWM=5729625,BHWM=5729626,remain=425364
rocksdb7:min=5367701,save=5367701,BLWM=5720845,BHWM=5720846,remain=353144
rocksdb8:min=5244501,save=5244501,BLWM=5725794,BHWM=5725795,remain=481293
rocksdb9:min=5284781,save=5284781,BLWM=5720060,BHWM=5720061,remain=435279
# CPU
used_cpu_sys:6057.07
used_cpu_user:10249.52
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
# CommandStats
cmdstat_applybinlogsv2:calls=164,usec=39423,usec_per_call=240.384
cmdstat_binlog_heartbeat:calls=1449,usec=8554,usec_per_call=5.90338
cmdstat_cluster:calls=69,usec=16326,usec_per_call=236.609
cmdstat_command:calls=11,usec=4982,usec_per_call=452.909
cmdstat_config:calls=10750,usec=770497,usec_per_call=71.6741
cmdstat_dbsize:calls=12,usec=352818711,usec_per_call=2.94016e+07
cmdstat_del:calls=21820216,usec=1140344503,usec_per_call=52.2609
cmdstat_exists:calls=2664720,usec=99660952,usec_per_call=37.4002
cmdstat_expire:calls=51690,usec=2554127,usec_per_call=49.4124
cmdstat_get:calls=1,usec=107,usec_per_call=107
cmdstat_hset:calls=37,usec=2897,usec_per_call=78.2973
cmdstat_info:calls=19999,usec=10637849,usec_per_call=531.919
cmdstat_lpush:calls=4317,usec=440760,usec_per_call=102.099
cmdstat_monitor:calls=1,usec=3,usec_per_call=3
cmdstat_pexpire:calls=21500189,usec=1301949446,usec_per_call=60.5553
cmdstat_ping:calls=12482,usec=17740,usec_per_call=1.42125
cmdstat_rpush:calls=12909449,usec=802126020,usec_per_call=62.1348
cmdstat_scan:calls=3,usec=4706,usec_per_call=1568.67
cmdstat_set:calls=20498065,usec=944278313,usec_per_call=46.0667
cmdstat_setex:calls=579524,usec=26125866,usec_per_call=45.0816
cmdstat_setnx:calls=47374,usec=3079128,usec_per_call=64.9962
cmdstat_show:calls=2,usec=16,usec_per_call=8
cmdstat_zadd:calls=1194779,usec=148183959,usec_per_call=124.026
cmdstat_unseen:calls=512,num=2
# Cluster
cluster_enabled:1
# Keyspace
db0:keys=0,expires=0,avg_ttl=0
# Backup
backup-count:0
last-backup-time:0
current-backup-running:no
# Dataset
rocksdb.kvstore-count:10
rocksdb.total-sst-files-size:12320440984
rocksdb.binlogcf-sst-files-size:9078363405
rocksdb.live-sst-files-size:12320440984
rocksdb.estimate-live-data-size:9010630971
rocksdb.estimate-num-keys:40121441
rocksdb.total-memory:13070006404
rocksdb.cur-size-all-mem-tables:342056
rocksdb.estimate-table-readers-mem:185287868
rocksdb.blockcache.capacity:12884901888
rocksdb.blockcache.usage:12884376480
rocksdb.blockcache.pinnedusage:1920
rocksdb.mem-table-flush-pending:0
rocksdb.estimate-pending-compaction-bytes:0
rocksdb.compaction-pending:0
rocksdb.number.iter.skip:400717717
rocksdb.compaction-filter-count:360041943
rocksdb.compaction-kv-expired-count:172118
# Compaction
current-compaction-status:stopped
time-since-lastest-compaction:582532
current-compaction-dbid:
# IndexManager
total_expire_keys:182584
deleting_expire_keys:0
scanner_matrix:inQueue 0,executing 0,executed 582500,queueTime 1340715191966454021ns,executeTime 52149849824846ns
deleter_matrix:inQueue 0,executing 0,executed 25541,queueTime 3037383436988209ns,executeTime 3032003897030ns
scanpoint_0:21-06-15 20:23:42
scanpoint_1:21-06-15 20:23:42
scanpoint_2:21-06-15 20:23:42
scanpoint_3:21-06-15 20:23:44
scanpoint_4:21-06-15 20:23:40
scanpoint_5:21-06-15 20:23:43
scanpoint_6:21-06-15 20:23:43
scanpoint_7:21-06-15 20:23:43
scanpoint_8:21-06-15 20:23:42
scanpoint_9:21-06-15 20:23:34
scanpoint:21-06-15 20:23:34
# Levelstats
rocksdb0.level-0:bytes=13845649,num_entries=141168,num_deletions=129179,num_files=2
rocksdb0.level-5:bytes=135335411,num_entries=228255,num_deletions=4929,num_files=2
rocksdb0.level-6:bytes=975138616,num_entries=3905163,num_deletions=0,num_files=15
rocksdb1.level-0:bytes=18160078,num_entries=148379,num_deletions=129408,num_files=3
rocksdb1.level-5:bytes=135350763,num_entries=229120,num_deletions=4828,num_files=2
rocksdb1.level-6:bytes=975317794,num_entries=3903197,num_deletions=0,num_files=15
rocksdb2.level-0:bytes=26430975,num_entries=163869,num_deletions=130920,num_files=3
rocksdb2.level-5:bytes=202999068,num_entries=341131,num_deletions=7297,num_files=3
rocksdb2.level-6:bytes=909533883,num_entries=3796710,num_deletions=0,num_files=16
rocksdb3.level-0:bytes=17152907,num_entries=147794,num_deletions=130493,num_files=3
rocksdb3.level-5:bytes=493903850,num_entries=831749,num_deletions=8882,num_files=8
rocksdb3.level-6:bytes=807583356,num_entries=3310355,num_deletions=0,num_files=14
rocksdb4.level-0:bytes=21373974,num_entries=154377,num_deletions=129769,num_files=3
rocksdb4.level-5:bytes=497189107,num_entries=838384,num_deletions=6011,num_files=8
rocksdb4.level-6:bytes=803401703,num_entries=3294505,num_deletions=0,num_files=13
rocksdb5.level-0:bytes=15214262,num_entries=144489,num_deletions=130213,num_files=2
rocksdb5.level-5:bytes=422903489,num_entries=715401,num_deletions=8976,num_files=7
rocksdb5.level-6:bytes=833049008,num_entries=3432333,num_deletions=0,num_files=14
rocksdb6.level-0:bytes=19199007,num_entries=148978,num_deletions=128348,num_files=3
rocksdb6.level-5:bytes=493442124,num_entries=830092,num_deletions=8180,num_files=8
rocksdb6.level-6:bytes=989111218,num_entries=3309175,num_deletions=0,num_files=16
rocksdb7.level-0:bytes=14109816,num_entries=142039,num_deletions=129474,num_files=2
rocksdb7.level-5:bytes=202993361,num_entries=342855,num_deletions=4868,num_files=3
rocksdb7.level-6:bytes=906918135,num_entries=3787475,num_deletions=0,num_files=15
rocksdb8.level-0:bytes=17306678,num_entries=146487,num_deletions=128859,num_files=3
rocksdb8.level-5:bytes=412784317,num_entries=692014,num_deletions=8667,num_files=7
rocksdb8.level-6:bytes=835937305,num_entries=3453181,num_deletions=0,num_files=14
rocksdb9.level-0:bytes=14777164,num_entries=143911,num_deletions=130277,num_files=2
rocksdb9.level-5:bytes=135338013,num_entries=229099,num_deletions=4825,num_files=2
rocksdb9.level-6:bytes=974639953,num_entries=3900993,num_deletions=0,num_files=16
# RocksdbBgError
rocksdb_bg_error_count:0
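The CommandStats section of the output above is one place to start when checking for slow, O(N)-style commands. The following is a minimal sketch, assuming the `cmdstat_<name>:calls=...,usec=...,usec_per_call=...` line format shown above; the function name `slow_commands` and the 1-second threshold are hypothetical choices for illustration.

```python
import re

# Matches lines such as 'cmdstat_set:calls=20498065,usec=944278313,usec_per_call=46.0667'.
CMDSTAT = re.compile(
    r"cmdstat_(\w+):calls=(\d+),usec=(\d+),usec_per_call=([0-9.eE+\-]+)"
)


def slow_commands(info_text: str, threshold_usec: float = 1_000_000.0):
    """Return (command, usec_per_call) pairs whose average cost exceeds the threshold."""
    slow = []
    for line in info_text.splitlines():
        m = CMDSTAT.match(line.strip())
        if m is None:
            continue  # Skips headers and lines in another shape, e.g. cmdstat_unseen.
        per_call = float(m.group(4))
        if per_call > threshold_usec:
            slow.append((m.group(1), per_call))
    # Most expensive command first.
    return sorted(slow, key=lambda item: item[1], reverse=True)


# A few lines copied from the CommandStats section above.
sample = """# CommandStats
cmdstat_dbsize:calls=12,usec=352818711,usec_per_call=2.94016e+07
cmdstat_del:calls=21820216,usec=1140344503,usec_per_call=52.2609
cmdstat_scan:calls=3,usec=4706,usec_per_call=1568.67
cmdstat_set:calls=20498065,usec=944278313,usec_per_call=46.0667
cmdstat_unseen:calls=512,num=2"""

print(slow_commands(sample))
```

On these lines only dbsize crosses the threshold, averaging roughly 29 seconds per call over 12 calls. Whether any of those calls overlapped with the failover cannot be told from these counters alone, so this only narrows down what to check.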
config output
1) "aof-enabled"
2) "no"
3) "aof-psync-num"
4) "500"
5) "bind"
6) "\"0.0.0.0\""
7) "binlog-send-batch"
8) "256"
9) "binlog-send-bytes"
10) "16777216"
11) "binlog-using-defaultcf"
12) "no"
13) "binlogdelrange"
14) "10"
15) "binlogfilesecs"
16) "7200"
17) "binlogfilesizemb"
18) "512"
19) "binlogratelimitmb"
20) "120"
21) "checkkeytypeforsetcmd"
22) "no"
23) "chunksize"
24) "16384"
25) "cluster-enabled"
26) "yes"
27) "cluster-migration-barrier"
28) "1"
29) "cluster-migration-binlog-iters"
30) "60"
31) "cluster-migration-distance"
32) "10000"
33) "cluster-migration-rate-limit"
34) "100"
35) "cluster-migration-slots-num-per-task"
36) "100"
37) "cluster-node-timeout"
38) "15000"
39) "cluster-require-full-coverage"
40) "no"
41) "cluster-single-node"
42) "no"
43) "cluster-slave-no-failover"
44) "no"
45) "cluster-slave-validity-factor"
46) "10"
47) "compactrange-after-deleterange"
48) "no"
49) "daemon"
50) "yes"
51) "databases"
52) "16"
53) "delcntindexmgr"
54) "10000"
55) "deletefilesinrange-for-binlog"
56) "no"
57) "deljobcntindexmgr"
58) "1"
59) "dir"
60) "\"/home/deploy/tendis/tendis_10005/db\""
61) "domain-enabled"
62) "no"
63) "dumpdir"
64) "\"/home/deploy/tendis/tendis_10005/dump\""
65) "executorthreadnum"
66) "4"
67) "executorworkpoolsize"
68) "4"
69) "force-recovery"
70) "0"
71) "fullpushthreadnum"
72) "4"
73) "fullreceivethreadnum"
74) "4"
75) "garbage-delete-size"
76) "30"
77) "garbagedeletethreadnum"
78) "1"
79) "generallog"
80) "yes"
81) "incrpushthreadnum"
82) "4"
83) "jeprof-auto-dump"
84) "yes"
85) "keysdefaultlimit"
86) "100"
87) "kvstorecount"
88) "10"
89) "lockdbxwaittimeout"
90) "500"
91) "lockwaittimeout"
92) "3600"
93) "logdir"
94) "\"/home/deploy/tendis/tendis_10005/log\""
95) "loglevel"
96) "\"debug\""
97) "logrecyclethreadnum"
98) "4"
99) "lua-time-limit"
100) "5000"
101) "luastatemaxidletime"
102) "3600000"
103) "masterauth"
104) "\"\""
105) "maxbinlogkeepnum"
106) "100"
107) "maxclients"
108) "10000"
109) "migrate-gc-enabled"
110) "yes"
111) "migrate-snapshot-key-num"
112) "100000"
113) "migrate-snapshot-retry-num"
114) "1000"
115) "migratereceivethreadnum"
116) "4"
117) "migratesenderthreadnum"
118) "4"
119) "minbinlogkeepsec"
120) "604800"
121) "netbatchsize"
122) "1048576"
123) "netbatchtimeoutsec"
124) "60"
125) "netiothreadnum"
126) "2"
127) "noexpire"
128) "no"
129) "pausetimeindexmgr"
130) "10"
131) "pidfile"
132) "\"/home/deploy/tendis/tendis_10005/tmp/tendisplus.pid\""
133) "port"
134) "10005"
135) "proto-max-bulk-len"
136) "536870912"
137) "requirepass"
138) "\"\""
139) "rocks.blockcache_num_shard_bits"
140) "6"
141) "rocks.blockcache_strict_capacity_limit"
142) "no"
143) "rocks.blockcachemb"
144) "12288"
145) "rocks.compress_type"
146) "\"snappy\""
147) "rocks.disable_wal"
148) "no"
149) "rocks.flush_log_at_trx_commit"
150) "no"
151) "rocks.level0_compress_enabled"
152) "no"
153) "rocks.level1_compress_enabled"
154) "no"
155) "rocks.wal_dir"
156) "\"/home/deploy/tendis/tendis_10005/wal\""
157) "save-min-binlogid"
158) "yes"
159) "scancntindexmgr"
160) "1000"
161) "scandefaultlimit"
162) "100"
163) "scandefaultmaxiteratetimes"
164) "1000"
165) "scanjobcntindexmgr"
166) "1"
167) "slave-migrate-enabled"
168) "no"
169) "slavebinlogkeepnum"
170) "1"
171) "slowlog"
172) "\"/home/deploy/tendis/tendis_10005/log\""
173) "slowlog-file-enabled"
174) "yes"
175) "slowlog-flush-interval"
176) "1000"
177) "slowlog-log-slower-than"
178) "10000"
179) "slowlog-max-len"
180) "10240"
181) "storage"
182) "\"rocks\""
183) "timeoutsecbinlogwaitrsp"
184) "60"
185) "truncatebinlogintervalms"
186) "2000"
187) "truncatebinlognum"
188) "20000"
189) "version-increase"
190) "yes"
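The reply above is a flat list in which odd-numbered entries are parameter names and even-numbered entries are their values. The following is a minimal sketch, assuming the numbered `N) "text"` layout shown above, that pairs them into a dictionary so individual settings such as lockdbxwaittimeout are easy to look up; the function name `parse_config_get` is hypothetical.

```python
import re

# One numbered reply line, e.g. '89) "lockdbxwaittimeout"'.
ENTRY = re.compile(r'\s*\d+\)\s+"(.*)"\s*$')


def parse_config_get(output: str) -> dict:
    """Pair up the alternating name/value lines of a flattened config reply."""
    items = []
    for line in output.splitlines():
        m = ENTRY.match(line)
        if m:
            # Undo the escaping applied to quotes inside a value.
            items.append(m.group(1).replace('\\"', '"'))
    # Even positions hold names, odd positions hold the matching values.
    return dict(zip(items[0::2], items[1::2]))


# Lines copied from the output above (raw string keeps the backslashes as-is).
sample = r'''5) "bind"
6) "\"0.0.0.0\""
89) "lockdbxwaittimeout"
90) "500"
91) "lockwaittimeout"
92) "3600"'''

cfg = parse_config_get(sample)
print(cfg["lockdbxwaittimeout"], cfg["lockwaittimeout"])
```

Note that this dump reports lockdbxwaittimeout as 500, so it was taken after the parameter was raised from the 180 listed under Your Environment.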
Slave log for a successful switchover
E0615 20:00:43.207991 22832 cluster_manager.cpp:1523] Forced failover user request accepted.
E0615 20:00:43.239794 22858 cluster_manager.cpp:2174] mf_can_start is non-zero
E0615 20:00:43.239850 22858 cluster_manager.cpp:2787] Start of election delayed for 0 milliseconds (rank #0, offset 57069173).
E0615 20:00:43.339946 22858 cluster_manager.cpp:2174] mf_can_start is non-zero
E0615 20:00:43.339990 22858 cluster_manager.cpp:2840] Starting a failover election for epoch 20.
E0615 20:00:43.440798 22858 cluster_manager.cpp:2174] mf_can_start is non-zero
E0615 20:00:43.440837 22858 cluster_manager.cpp:2855] Failover election won: I'm the new master.
E0615 20:00:43.440848 22858 cluster_manager.cpp:2859] configEpoch set to 20 after successful failover
E0615 20:05:36.614210 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:0 node:xxxxxxxx state:send_bulk_success binlogPos:5700022 startTime:2971653275 endTime:2971653275
E0615 20:05:36.614284 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:1 node:xxxxxxxx state:send_bulk_success binlogPos:5705346 startTime:2971653275 endTime:2971653275
E0615 20:05:36.614292 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:2 node:xxxxxxxx state:send_bulk_success binlogPos:5725777 startTime:2971653275 endTime:2971653275
E0615 20:05:36.614300 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:3 node:xxxxxxxx state:send_bulk_success binlogPos:5704664 startTime:2971653275 endTime:2971653275
E0615 20:05:36.614306 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:4 node:xxxxxxxx state:send_bulk_success binlogPos:5713705 startTime:2971653275 endTime:2971653275
E0615 20:05:36.614312 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:5 node:xxxxxxxx state:send_bulk_success binlogPos:5700703 startTime:2971653275 endTime:2971653275
E0615 20:05:36.614320 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:6 node:xxxxxxxx state:send_bulk_success binlogPos:5710146 startTime:2971653275 endTime:2971653275
E0615 20:05:36.614326 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:7 node:xxxxxxxx state:send_bulk_success binlogPos:5700914 startTime:2971653275 endTime:2971653275
E0615 20:05:36.614332 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:8 node:xxxxxxxx state:send_bulk_success binlogPos:5706131 startTime:2971653275 endTime:2971653275
E0615 20:05:36.614339 22855 repl_manager.cpp:593] timeout, _fullPushStatus erase,storeId:9 node:xxxxxxxx state:send_bulk_success binlogPos:5700309 startTime:2971653275 endTime:2971653275
E0615 20:30:27.380627 22857 cluster_manager.cpp:2502] Failover auth granted to aa9eae11adefd68847b12d4bd0da89d752019198 for epoch 22
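Operations on the master
1. There are no big keys; the test data is mostly of string type, with a large number of keys.
2. Only failover was being tested; the main operations were ping, cluster nodes, and info replication.
3. There were no writes; at failover time the master and slave offsets were identical.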