
Master and slave servers suddenly stopped syncing and did not recover for a long time; only a restart fixed it

Open epubreader opened this issue 5 years ago • 6 comments

For reasons unknown, the master and slave servers suddenly stopped syncing and did not recover for a long time; only restarting fixed it. Master server log:

| I0623 18:01:52.907841    22 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 71, ip_port: 10.0.1.150:35924
| I0623 18:02:02.309511     1 pika_server.cc:273] Goodbye...
| I0623 18:01:56.026744     1 pika_dispatch_thread.cc:27] dispatch thread 140276433671936 exit!!!
| I0623 18:01:56.036780     1 pika_server.cc:113] Delete slave success
| I0623 18:02:08.311497     1 pika_dispatch_thread.cc:27] dispatch thread 139996992329472 exit!!!
| I0623 18:01:56.997107     1 pika_auxiliary_thread.cc:17] PikaAuxiliary thread 140276416886528 exit!!!
| I0623 18:02:08.314826     1 pika_auxiliary_thread.cc:17] PikaAuxiliary thread 139996975544064 exit!!!
| sh: line 0: kill: -: arguments must be process or job IDs
| sh: line 0: kill: -: arguments must be process or job IDs
| I0623 18:02:08.319499     1 pika_rsync_service.cc:35] PikaRsyncService exit!!!
| I0623 18:02:08.319536     1 pika_monitor_thread.cc:28] PikaMonitorThread 139998511958528 exit!!!
| I0623 18:02:08.323971     1 pika_server.cc:132] PikaServer 139998511958528 exit!!!
| I0623 18:02:08.324070     1 pika_repl_client.cc:38] PikaReplClient exit!!!
| I0623 18:02:08.324097     1 pika_repl_server.cc:31] PikaReplServer exit!!!
| path : /pika/conf/pika.conf
| -----------Pika server 3.2.9 ----------
| -----------Pika config list----------
|  1 port 9221
|  2 thread-num 5
|  3 thread-pool-size 12
|  4 sync-thread-num 6
|  5 log-path ./log/
|  6 db-path ./db/
|  7 write-buffer-size 268435456
|  8 timeout 60
|  9 requirepass 1111
| 10 masterauth 1111
| 11 userpass 2222
| 12 userblacklist
| 13 instance-mode classic
| 14 databases 1
| 15 default-slot-num 1024
| 16 dump-prefix
| 17 dump-path ./dump/
| 18 dump-expire 0
| 19 pidfile ./pika.pid
| 20 maxclients 20000
| 21 target-file-size-base 20971520
| 22 expire-logs-days 7
| 23 expire-logs-nums 10
| 24 root-connection-num 2
| 25 slowlog-write-errorlog no
| 26 slowlog-log-slower-than 10000
| 27 slowlog-max-len 128
| 28 db-sync-path ./dbsync/
| 29 db-sync-speed -1
| 30 slave-priority 100
| 31 server-id 1
| 32 sync-window-size 9000
| 33 max-conn-rbuf-size 268435456
| 34 write-binlog yes
| 35 binlog-file-size 104857600
| 36 max-cache-statistic-keys 0
| 37 small-compaction-threshold 5000
| 38 max-write-buffer-size 10737418240
| 39 max-client-response-size 1073741824
| 40 compression snappy
| 41 max-background-flushes 1
| 42 max-background-compactions 2
| 43 max-cache-files 5000
| 44 max-bytes-for-level-multiplier 10
| -----------Pika config end----------
| I0623 18:01:57.233522     1 pika_rsync_service.cc:35] PikaRsyncService exit!!!
| I0623 18:01:57.233603     1 pika_monitor_thread.cc:28] PikaMonitorThread 140278029359616 exit!!!
| I0623 18:01:57.632009     1 pika_server.cc:132] PikaServer 140278029359616 exit!!!
| I0623 18:01:57.638080     1 pika_repl_client.cc:38] PikaReplClient exit!!!
| I0623 18:01:57.638273     1 pika_repl_server.cc:31] PikaReplServer exit!!!
| path : /pika/conf/pika.conf
| -----------Pika server 3.2.9 ----------
| -----------Pika config list----------
|  1 port 9221
|  2 thread-num 5
|  3 thread-pool-size 12
|  4 sync-thread-num 6
|  5 log-path ./log/
|  6 db-path ./db/
|  7 write-buffer-size 268435456
|  8 timeout 60
|  9 requirepass 1111
| 10 masterauth 1111
| 11 userpass 2222
| 12 userblacklist
| 13 instance-mode classic
| 14 databases 1
| 15 default-slot-num 1024
| 16 dump-prefix
| 17 dump-path ./dump/
| 18 dump-expire 0
| 19 pidfile ./pika.pid
| 20 maxclients 20000
| 21 target-file-size-base 20971520
| 22 expire-logs-days 7
| 23 expire-logs-nums 10
| 24 root-connection-num 2
| 25 slowlog-write-errorlog no
| 26 slowlog-log-slower-than 10000
| 27 slowlog-max-len 128
| 28 db-sync-path ./dbsync/
| 29 db-sync-speed -1
| 30 slave-priority 100
| 31 server-id 1
| 32 sync-window-size 9000
| 33 max-conn-rbuf-size 268435456
| 34 write-binlog yes
| 35 binlog-file-size 104857600
| 36 max-cache-statistic-keys 0
| 37 small-compaction-threshold 5000
| 38 max-write-buffer-size 10737418240
| 39 max-client-response-size 1073741824
| 40 compression snappy
| 41 max-background-flushes 1
| 42 max-background-compactions 2
| 43 max-cache-files 5000
| 44 max-bytes-for-level-multiplier 10
| -----------Pika config end----------

Slave server log:

| 38 small-compaction-threshold 5000
| I0528 03:09:56.015735    50 pika_partition.cc:617] db0 Success purge 1
| 39 max-write-buffer-size 10737418240
| 40 max-client-response-size 1073741824
| I0530 03:40:00.199074     6 pika_repl_client_thread.cc:21] ReplClient Close conn, fd=92, ip_port=pika:11221
| 41 compression snappy
| W0530 03:40:00.199463     6 pika_repl_client_thread.cc:31] Master conn disconnect : pika:11221 try reconnect
| 42 max-background-flushes 1
| W0530 03:40:00.217085    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| 43 max-background-compactions 2
| 44 max-cache-files 5000
| W0530 03:40:03.321048    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:06.423389    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| 45 max-bytes-for-level-multiplier 10
| -----------Pika config end----------
| W0530 03:40:09.526119    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:12.628903    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:15.731853    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:18.834913    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:21.937832    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:25.039788    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:28.142222    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:31.245103    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:34.348099    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| W0530 03:40:37.449506    49 pika_repl_client.cc:114] Failed to connect master, Master (pika:9221), try reconnect
| I0530 03:40:40.551244    49 pika_repl_client.cc:145] Try Send Meta Sync Request to Master (pika:9221)
| I0530 03:40:40.555188     9 pika_server.cc:543] Mark try connect finish
| I0530 03:40:40.555253     9 pika_repl_client_conn.cc:139] Finish to handle meta sync response
| I0530 03:40:40.652730    10 pika_repl_client_conn.cc:220] Partition: db0 TrySync Ok
| I0531 06:51:09.089020    50 pika_partition.cc:617] db0 Success purge 1
| I0602 18:29:21.031162     6 pika_repl_client_thread.cc:37] ReplClient Timeout conn, fd=67, ip_port=pika:11221
| W0602 18:29:21.031205     6 pika_repl_client_thread.cc:48] Master conn timeout : pika:11221 try reconnect
| I0602 18:29:21.122483    49 pika_repl_client.cc:145] Try Send Meta Sync Request to Master (pika:9221)
| W0602 18:29:21.125631    11 pika_repl_client_conn.cc:100] Meta Sync Failed: Slave AlreadyExist
| W0602 18:29:21.125690    11 pika_server.cc:761] Sync error, set repl_state to PIKA_REPL_ERROR
| I0602 18:29:21.125988     6 pika_repl_client_thread.cc:21] ReplClient Close conn, fd=67, ip_port=pika:11221
| I0603 00:03:18.296692     1 pika.cc:98] Catch Signal 15, cleanup...
| I0603 00:03:18.296782     1 pika_server.cc:273] Goodbye...
| I0603 00:03:24.651252     1 pika_dispatch_thread.cc:27] dispatch thread 139759930283776 exit!!!
| I0603 00:03:25.453156     1 pika_auxiliary_thread.cc:17] PikaAuxiliary thread 139759703811840 exit!!!
| sh: line 0: kill: -: arguments must be process or job IDs
| I0603 00:03:25.480860     1 pika_rsync_service.cc:35] PikaRsyncService exit!!!
| I0603 00:03:25.481288     1 pika_monitor_thread.cc:28] PikaMonitorThread 139761466087936 exit!!!
| I0603 00:03:25.509163     1 pika_server.cc:132] PikaServer 139761466087936 exit!!!
| I0603 00:03:25.509490     1 pika_repl_client.cc:38] PikaReplClient exit!!!
| I0603 00:03:25.509539     1 pika_repl_server.cc:31] PikaReplServer exit!!!

epubreader · Jun 29 '20 12:06

Does the master have logs from around 06-02 18:29?

kernelai · Jun 30 '20 02:06

Ran into this again today. Server version: pikadb/pika:v3.2.9.

Slave server log:

vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | I1103 15:05:48.621208 50 pika_partition.cc:617] db0 Success purge 1
vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | I1106 10:49:08.022241 50 pika_partition.cc:617] db0 Success purge 1
vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | I1110 02:16:35.936123 50 pika_partition.cc:617] db0 Success purge 1
vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | I1114 05:12:28.891971 50 pika_partition.cc:617] db0 Success purge 1
vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | I1118 01:19:40.028442 6 pika_repl_client_thread.cc:37] ReplClient Timeout conn, fd=364, ip_port=pika:11221
vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | W1118 01:19:40.028730 6 pika_repl_client_thread.cc:48] Master conn timeout : pika:11221 try reconnect
vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | I1118 01:19:40.105703 49 pika_repl_client.cc:145] Try Send Meta Sync Request to Master (pika:9221)
vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | W1118 01:19:40.108279 11 pika_repl_client_conn.cc:100] Meta Sync Failed: Slave AlreadyExist
vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | W1118 01:19:40.108314 11 pika_server.cc:761] Sync error, set repl_state to PIKA_REPL_ERROR
vudswggb2xcp@Ubuntu-1910-eoan-64-minimal | I1118 01:19:40.108405 6 pika_repl_client_thread.cc:21] ReplClient Close conn, fd=364, ip_port=pika:11221
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | 37 max-cache-statistic-keys 0
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | 38 small-compaction-threshold 5000
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | 39 max-write-buffer-size 10737418240
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | 40 max-client-response-size 1073741824
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | 41 compression snappy
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | 42 max-background-flushes 1
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | 43 max-background-compactions 2
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | 44 max-cache-files 5000
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | 45 max-bytes-for-level-multiplier 10
cx30kyrklc14@Ubuntu-1910-eoan-64-minimal | -----------Pika config end----------

Master server log:

I0901 04:10:02.364224 1 pika.cc:187] Server at: /pika/conf/pika.conf
I0901 04:10:02.372625 1 pika_server.cc:167] Using Networker Interface: eth2
I0901 04:10:02.373453 1 pika_server.cc:210] host: 172.21.0.3 port: 9221
I0901 04:10:02.373473 1 pika_server.cc:87] Worker queue limit is 4100
I0901 04:10:07.839260 1 pika_partition.cc:92] db0 DB Success
I0901 04:10:07.839316 1 pika_binlog.cc:106] Binlog: Find the exist file.
I0901 04:10:07.840185 1 pika_server.cc:264] Pika Server going to start
I0901 04:10:08.240557 20 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.1.106, Slave port:9222
I0901 04:10:08.240618 20 pika_server.cc:745] Add New Slave, 10.0.1.106:9222
I0901 04:10:08.340098 21 pika_repl_server_conn.cc:109] Receive Trysync, Slave ip: 10.0.1.106, Slave port:9222, Partition: db0, filenum: 700, pro_offset: 27570382
I0901 04:10:08.340250 21 pika_rm.cc:163] Add Slave Node, partition: (db0:0), ip_port: 10.0.1.106:9222
I0901 04:10:08.340271 21 pika_repl_server_conn.cc:175] Partition: db0 TrySync Success, Session: 0
I0902 15:24:30.369750 50 pika_partition.cc:617] db0 Success purge 1
I0905 09:55:44.243384 50 pika_partition.cc:617] db0 Success purge 1
I0909 01:49:26.009101 50 pika_partition.cc:617] db0 Success purge 1
I0913 03:19:02.033335 50 pika_partition.cc:617] db0 Success purge 1
I0914 18:31:28.627552 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.2:64764
I0914 18:36:22.788204 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 835, ip_port: 10.0.0.136:65218
I0915 00:08:44.055868 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.2:64245
I0915 00:13:41.264639 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 835, ip_port: 10.0.0.136:64645
I0915 05:33:52.884670 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.2:65425
I0915 05:38:44.045670 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.136:64637
I0916 04:24:18.839614 50 pika_partition.cc:617] db0 Success purge 1
I0919 00:47:33.991093 50 pika_partition.cc:617] db0 Success purge 1
I0920 13:47:14.317665 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.136:61104
I0921 06:43:36.434998 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.136:63803
I0922 04:08:51.768266 50 pika_partition.cc:617] db0 Success purge 1
I0922 13:44:23.716714 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.2:64282
I0922 14:15:03.829587 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.136:63064
I0923 10:56:43.835459 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.2:64075
I0923 11:27:29.668740 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.136:63656
I0924 05:37:49.914567 50 pika_partition.cc:617] db0 Success purge 1
I0924 09:44:42.913715 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:63327
I0924 09:44:42.914286 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 849, ip_port: 10.0.0.62:64958
I0924 09:44:51.920141 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 850, ip_port: 10.0.0.62:64231
I0924 09:44:51.920186 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 851, ip_port: 10.0.0.62:64814
I0924 09:44:54.921677 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 852, ip_port: 10.0.0.62:64170
I0924 10:31:59.685631 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.136:65463
I0924 10:55:42.596652 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 89, ip_port: 10.0.0.2:65021
I0927 01:21:04.879813 50 pika_partition.cc:617] db0 Success purge 1
I1001 00:36:49.808722 50 pika_partition.cc:617] db0 Success purge 1
I1002 07:49:59.762763 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.62:63398
I1002 07:54:29.941581 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.62:63187
I1002 07:54:32.944775 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 850, ip_port: 10.0.0.62:63019
I1002 08:05:39.383591 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.62:63967
I1002 09:22:06.217175 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.62:63524
I1002 09:26:27.372673 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.62:62665
I1002 09:26:30.374841 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 850, ip_port: 10.0.0.62:61438
I1002 09:37:33.785661 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.62:62389
I1002 12:45:33.122915 50 pika_partition.cc:617] db0 Success purge 1
I1005 01:30:45.272097 50 pika_partition.cc:617] db0 Success purge 1
I1006 19:54:09.535692 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.136:62629
I1008 00:54:01.476917 50 pika_partition.cc:617] db0 Success purge 1
I1008 14:27:19.351547 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:65241
I1008 14:34:07.620779 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:63139
I1009 07:14:20.206580 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:63634
I1009 07:20:17.413683 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:64393
I1010 10:05:02.469635 50 pika_partition.cc:617] db0 Success purge 1
I1014 05:36:45.834579 50 pika_partition.cc:617] db0 Success purge 1
I1016 15:07:46.814539 50 pika_partition.cc:617] db0 Success purge 1
I1019 02:42:48.491452 50 pika_partition.cc:617] db0 Success purge 1
I1022 01:32:14.208393 50 pika_partition.cc:617] db0 Success purge 1
I1024 07:50:27.507771 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:60005
I1025 03:37:31.192142 50 pika_partition.cc:617] db0 Success purge 1
I1028 23:21:14.693290 50 pika_partition.cc:617] db0 Success purge 1
I1101 00:46:31.760932 50 pika_partition.cc:617] db0 Success purge 1
I1101 21:15:52.458124 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.136:48995
I1101 21:17:35.445365 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.136:49325
I1103 15:05:44.642551 50 pika_partition.cc:617] db0 Success purge 1
I1105 00:09:44.079906 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.2:63669
I1105 13:19:51.560529 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 867, ip_port: 10.0.0.62:57476
I1105 13:19:51.564136 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.62:57469
I1105 13:19:51.573916 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 866, ip_port: 10.0.0.62:57473
I1105 13:19:52.558544 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 868, ip_port: 10.0.0.62:57472
I1105 13:20:00.543669 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 869, ip_port: 10.0.0.62:57471
I1105 16:31:18.228132 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 834, ip_port: 10.0.0.2:55384
I1105 16:31:18.275168 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 866, ip_port: 10.0.0.136:55433
I1106 10:48:59.522523 50 pika_partition.cc:617] db0 Success purge 1
I1109 19:17:42.218080 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.2:31382
I1109 19:19:25.149984 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.2:32611
I1110 02:16:41.661084 50 pika_partition.cc:617] db0 Success purge 1
I1113 07:35:43.929067 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:64987
I1113 07:53:21.577018 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:64327
I1114 05:12:28.130618 50 pika_partition.cc:617] db0 Success purge 1
I1114 07:04:57.056501 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:55328
I1114 07:04:59.079994 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 867, ip_port: 10.0.0.62:47482
I1114 07:05:02.971405 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 789, ip_port: 10.0.0.62:58176
W1118 01:18:48.027034 49 pika_rm.cc:530] (db0:0) Master del Recv Timeout slave success 10.0.1.106:9222
I1118 01:19:40.107879 20 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.1.106, Slave port:9222
W1118 01:19:40.108070 20 pika_server.cc:738] Slave Already Exist, ip_port: 10.0.1.106:9222
I1118 01:19:40.108561 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 868, ip_port: 10.0.1.150:47474
I1118 01:21:31.269938 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 70, ip_port: 10.0.1.150:55004
I1118 01:21:31.270294 19 pika_server.cc:649] Delete Slave Success, ip_port: 10.0.1.106:9222
I1118 02:42:02.538383 50 pika_partition.cc:617] db0 Success purge 1
I1119 10:07:39.898283 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 866, ip_port: 10.0.0.2:45025
I1119 10:07:40.271010 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 868, ip_port: 10.0.0.136:47559
I1119 15:42:27.444428 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 866, ip_port: 10.0.0.136:50918
I1119 17:08:45.986027 19 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 866, ip_port: 10.0.0.2:50624
I1119 22:17:58.105327 22 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.1.244, Slave port:9222
I1119 22:17:58.105684 22 pika_server.cc:745] Add New Slave, 10.0.1.244:9222
I1119 22:17:58.201710 21 pika_repl_server_conn.cc:109] Receive Trysync, Slave ip: 10.0.1.244, Slave port:9222, Partition: db0, filenum: 725, pro_offset: 87694191
I1119 22:17:58.204950 21 pika_rm.cc:163] Add Slave Node, partition: (db0:0), ip_port: 10.0.1.244:9222
I1119 22:17:58.204969 21 pika_repl_server_conn.cc:175] Partition: db0 TrySync Success, Session: 1
I1119 22:26:59.705724 1 pika.cc:98] Catch Signal 15, cleanup...
I1119 22:26:59.706132 1 pika_server.cc:273] Goodbye...

epubreader · Nov 19 '20 22:11

This problem happens occasionally, and restarting the slave server fixes it. Could it be caused by some cached information about the master server? The slave's configuration refers to the master as pika:11221, a hostname rather than an IP address. I have now upgraded to pikadb/pika:v3.3.6 to see whether that resolves the issue.

epubreader · Nov 19 '20 22:11

> Does the master have logs from around 06-02 18:29?

Could this bug be reopened? My guess at the cause: the master keeps a record of the slave's IP address, but in a Docker Swarm cluster the slave's IP can change, after which the slave can no longer sync with the master.
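One way to probe this theory (a hypothetical diagnostic, not something from this thread) is to resolve the master's service name from inside the slave container and compare the result against the slave addresses recorded in the master's log; under Docker Swarm, a service name can start resolving differently after a task is rescheduled.

```python
# Hypothetical diagnostic: list the IPs a hostname currently resolves to.
# "pika" and port 9221 are the names used in this thread; run this inside
# the slave container, where Swarm's internal DNS is visible.
import socket

def resolve(name: str, port: int = 9221) -> list:
    """Return the sorted set of IP addresses a hostname resolves to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(name, port)})

print(resolve("pika"))  # compare against the ip_port values in the master log
```

Running this before and after a network blip would show whether the name-to-IP mapping actually changed.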

epubreader · Nov 20 '20 00:11

Upgrade to v3.3.6 first. From the logs, under unstable network conditions the slave reconnects after a disconnect and finds that the master has not yet cleared its record of the slave, so the slave sets its replication state to error. v3.3.6 has fixed this problem.
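The behavioral difference described above can be sketched as a toy simulation (my own illustration, not Pika's actual code): in v3.2.9 a single "Slave AlreadyExist" rejection drives the slave into PIKA_REPL_ERROR, where it stays until restart, whereas v3.3.6 keeps resending MetaSync until the master's stale registration is purged.

```python
# Toy model (assumption: not Pika's real implementation) of the reconnect race
# visible in the logs: the slave reconnects while the master still holds the
# slave's old registration, so the MetaSync request is rejected.

class Master:
    def __init__(self):
        self.slaves = {}  # "ip:port" -> session object

    def meta_sync(self, addr):
        if addr in self.slaves:
            return "kSlaveAlreadyExist"   # seen as "Slave Already Exist" in logs
        self.slaves[addr] = object()
        return "kOk"

    def purge_timeout(self, addr):
        # models the "Master del Recv Timeout slave success" cleanup
        self.slaves.pop(addr, None)

def reconnect_v329(master, addr):
    # v3.2.9 behavior: one rejection -> PIKA_REPL_ERROR, stuck until restart
    return "CONNECTED" if master.meta_sync(addr) == "kOk" else "PIKA_REPL_ERROR"

def reconnect_v336(master, addr, max_retries=5):
    # v3.3.6 behavior: "will keep sending MetaSync msg" until the master has
    # purged the stale entry (modeled here as a purge during the retries)
    for attempt in range(max_retries):
        if master.meta_sync(addr) == "kOk":
            return "CONNECTED"
        if attempt == 1:  # stale entry eventually times out on the master side
            master.purge_timeout(addr)
    return "PIKA_REPL_ERROR"

m = Master()
m.meta_sync("10.0.1.106:9222")               # original registration still present
print(reconnect_v329(m, "10.0.1.106:9222"))  # PIKA_REPL_ERROR

m2 = Master()
m2.meta_sync("10.0.1.106:9222")
print(reconnect_v336(m2, "10.0.1.106:9222"))  # CONNECTED
```

The model matches the log evidence: v3.2.9 logs "set repl_state to PIKA_REPL_ERROR" after one rejection, while the v3.3.6 logs later in this thread show "Slave AlreadyExist will keep sending MetaSync msg".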

kernelai · Nov 20 '20 02:11

We are currently on v3.3.6, running 3 Pika master/slave pairs directly across two servers (meaning the network environment is identical for all of them). Last night there was network jitter; two of the pairs recovered quickly, but one pair hit this same problem of "the master not clearing its record of the slave", leaving the slave unable to connect until we repaired it manually.

Can anything be gleaned from the logs? @kernelai
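One possible manual recovery for a slave stuck in this state (my assumption; the thread does not spell out the repair steps) is to detach the slave and point it back at the master, which forces a fresh MetaSync handshake without restarting the process. SLAVEOF is Pika's documented command for changing replication targets; the helper and stand-in client below are hypothetical.

```python
# Sketch of a possible manual recovery (assumption, not the steps actually
# used in this thread): SLAVEOF NO ONE detaches the slave, and a second
# SLAVEOF re-attaches it, triggering a fresh MetaSync handshake.

def reattach(client, master_host: str, master_port: int) -> None:
    """Detach, then re-attach, via any RESP client (e.g. redis-py or redis-cli)."""
    client.execute_command("SLAVEOF", "NO", "ONE")
    client.execute_command("SLAVEOF", master_host, str(master_port))

class RecordingClient:
    """Stand-in for a real client that just records the commands issued."""
    def __init__(self):
        self.commands = []
    def execute_command(self, *args):
        self.commands.append(args)

c = RecordingClient()
reattach(c, "10.0.0.1", 9221)
print(c.commands)  # [('SLAVEOF', 'NO', 'ONE'), ('SLAVEOF', '10.0.0.1', '9221')]
```

With a real connection this would be the moral equivalent of `redis-cli -p 9221 SLAVEOF NO ONE` followed by `redis-cli -p 9221 SLAVEOF 10.0.0.1 9221` on the slave.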

Master node log:

path : conf/pika.conf
-----------Pika server----------
pika_version: 3.3.6
pika_git_sha:9e74c8cd0040a0a63c35e9d426c7d3b6464b378e
pika_build_compile_date: Dec  4 2020
...
...
...
W0624 22:21:58.540555    58 pika_rm.cc:407] (db0:0) Master del Recv Timeout slave success 10.0.0.2:9221
I0624 22:21:58.573012    23 pika_repl_server_conn.cc:108] Receive Trysync, Slave ip: 10.0.0.2, Slave port:9221, Partition: db0, filenum: 0, pro_offset: 82948733
I0624 22:21:58.573614    23 pika_rm.cc:79] Add Slave Node, partition: (db0:0), ip_port: 10.0.0.2:9221
I0624 22:21:58.573644    23 pika_repl_server_conn.cc:181] Partition: db0 TrySync Success, Session: 17
W0624 22:22:18.653208    58 pika_rm.cc:407] (db0:0) Master del Recv Timeout slave success 10.0.0.2:9221
I0624 22:23:00.986393    20 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 50, ip_port: 10.0.0.2:18217
I0624 22:23:00.986515    20 pika_server.cc:740] Delete Slave Success, ip_port: 10.0.0.2:9221
I0624 22:23:00.986547    20 pika_rm.cc:90] Remove Slave Node, Partition: (db0:0), ip_port: 10.0.0.2:9221
I0624 22:23:21.231633    21 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.0.2, Slave port:9221
I0624 22:23:21.231693    21 pika_server.cc:843] Add New Slave, 10.0.0.2:9221
I0624 22:23:42.002051    22 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.0.2, Slave port:9221
I0624 22:23:42.002051    23 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.0.2, Slave port:9221
W0624 22:23:42.002104    22 pika_server.cc:836] Slave Already Exist, ip_port: 10.0.0.2:9221
W0624 22:23:42.002159    23 pika_server.cc:836] Slave Already Exist, ip_port: 10.0.0.2:9221
I0624 22:24:03.504433    21 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.0.2, Slave port:9221
W0624 22:24:03.504493    21 pika_server.cc:836] Slave Already Exist, ip_port: 10.0.0.2:9221
I0624 22:24:32.689980    22 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.0.2, Slave port:9221
W0624 22:24:32.690040    22 pika_server.cc:836] Slave Already Exist, ip_port: 10.0.0.2:9221
I0624 22:24:53.553961    23 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.0.2, Slave port:9221
I0624 22:24:53.553970    21 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.0.2, Slave port:9221
W0624 22:24:53.554008    23 pika_server.cc:836] Slave Already Exist, ip_port: 10.0.0.2:9221
W0624 22:24:53.554059    21 pika_server.cc:836] Slave Already Exist, ip_port: 10.0.0.2:9221
I0624 22:24:56.139577    22 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.0.2, Slave port:9221
W0624 22:24:56.139647    22 pika_server.cc:836] Slave Already Exist, ip_port: 10.0.0.2:9221
I0624 22:25:06.146618    23 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.0.0.2, Slave port:9221
W0624 22:25:06.146677    23 pika_server.cc:836] Slave Already Exist, ip_port: 10.0.0.2:9221

Slave node log:

path : conf/pika.conf
-----------Pika server----------
pika_version: 3.3.6
pika_git_sha:9e74c8cd0040a0a63c35e9d426c7d3b6464b378e
pika_build_compile_date: Dec  4 2020
...
...
...
I0624 22:23:00.981081     7 pika_repl_client_thread.cc:38] ReplClient Timeout conn, fd=51, ip_port=10.0.0.1:11221
W0624 22:23:00.981707     7 pika_repl_client_thread.cc:49] Master conn timeout : 10.0.0.1:11221 try reconnect
I0624 22:23:10.014001    58 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.0.0.1:9221)
W0624 22:23:21.521927    58 pika_repl_client.cc:115] Failed to connect master, Master (10.0.0.1:9221), try reconnect
W0624 22:23:23.548414    14 pika_rm.cc:989] Failed to connect remote node(10.0.0.1:9221)
W0624 22:23:23.548641    14 pika_server.cc:612] Corruption: connect remote node error
I0624 22:23:23.548672    14 pika_server.cc:618] Mark try connect finish
I0624 22:23:23.548718    14 pika_repl_client_conn.cc:146] Finish to handle meta sync response
I0624 22:23:24.626231    58 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.0.0.1:9221)
I0624 22:23:35.037050    58 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.0.0.1:9221)
W0624 22:23:43.450937    15 pika_repl_client_conn.cc:101] Meta Sync Failed: Slave AlreadyExist will keep sending MetaSync msg
W0624 22:23:43.450942    16 pika_repl_client_conn.cc:101] Meta Sync Failed: Slave AlreadyExist will keep sending MetaSync msg
W0624 22:23:46.545130    58 pika_repl_client.cc:115] Failed to connect master, Master (10.0.0.1:9221), try reconnect
W0624 22:23:51.146888    58 pika_repl_client.cc:115] Failed to connect master, Master (10.0.0.1:9221), try reconnect
W0624 22:23:55.748637    58 pika_repl_client.cc:115] Failed to connect master, Master (10.0.0.1:9221), try reconnect
W0624 22:24:00.350386    58 pika_repl_client.cc:115] Failed to connect master, Master (10.0.0.1:9221), try reconnect
I0624 22:24:03.454694    58 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.0.0.1:9221)
W0624 22:24:03.502259    17 pika_repl_client_conn.cc:101] Meta Sync Failed: Slave AlreadyExist will keep sending MetaSync msg
W0624 22:24:14.562423    58 pika_repl_client.cc:115] Failed to connect master, Master (10.0.0.1:9221), try reconnect
I0624 22:24:17.666697    58 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.0.0.1:9221)
W0624 22:24:28.574239    58 pika_repl_client.cc:115] Failed to connect master, Master (10.0.0.1:9221), try reconnect
W0624 22:24:33.176010    58 pika_repl_client.cc:115] Failed to connect master, Master (10.0.0.1:9221), try reconnect
W0624 22:24:34.135169    18 pika_repl_client_conn.cc:101] Meta Sync Failed: Slave AlreadyExist will keep sending MetaSync msg
I0624 22:24:36.280272    58 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.0.0.1:9221)
I0624 22:24:46.086923    58 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.0.0.1:9221)
W0624 22:24:53.548004    19 pika_repl_client_conn.cc:101] Meta Sync Failed: Slave AlreadyExist will keep sending MetaSync msg
W0624 22:24:53.548195     8 pika_repl_client_conn.cc:101] Meta Sync Failed: Slave AlreadyExist will keep sending MetaSync msg
I0624 22:24:56.093751    58 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.0.0.1:9221)
W0624 22:24:56.133637     9 pika_repl_client_conn.cc:101] Meta Sync Failed: Slave AlreadyExist will keep sending MetaSync msg
I0624 22:25:06.100692    58 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.0.0.1:9221)
W0624 22:25:06.140516    10 pika_repl_client_conn.cc:101] Meta Sync Failed: Slave AlreadyExist will keep sending MetaSync msg

Thank you, and I hope work is going well!

c93614 · Jun 25 '21 02:06