
Master-slave sync fails (Rsync send file failed)

Open datavisorchengpeng opened this issue 3 years ago • 3 comments

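Both master and slave run pika 3.3.6 with rsync 3.1.3, and the db to be synced is about 500 GiB. Synchronization keeps failing. Checking the master node's pika log shows that rsync is reporting an error. The master node's logs follow: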

pika.WARNING

W0601 08:21:09.862491  4928 pika_server.cc:1124] Partition: db0 RSync send file failed! From: hashes, To: db0/hashes/, At: 10.190.64.249:10222, Error: -1

pika.INFO

I0601 08:01:40.830679 13954 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.190.64.249, Slave port:9222
I0601 08:01:40.830761 13954 pika_server.cc:843] Add New Slave, 10.190.64.249:9222
I0601 08:01:40.970268 13955 pika_repl_server_conn.cc:108] Receive Trysync, Slave ip: 10.190.64.249, Slave port:9222, Partition: db0, filenum: 0, pro_offset: 0
I0601 08:01:40.970388 13955 pika_repl_server_conn.cc:263] Partition: db0 binlog has been purged, may need full sync
I0601 08:01:41.070274 13956 pika_repl_server_conn.cc:324] Handle partition DBSync Request
I0601 08:01:41.070396 13956 pika_rm.cc:79] Add Slave Node, partition: (db0:0), ip_port: 10.190.64.249:9222
I0601 08:01:41.070412 13956 pika_repl_server_conn.cc:347] Partition: db0_0 Handle DBSync Request Success, Session: 2
I0601 08:01:43.120007  4928 pika_partition.cc:376] db0 after prepare bgsave
I0601 08:01:43.120851  4928 pika_partition.cc:379] db0 bgsave_info: path=/mnt/pika/dump/20210601/db0,  filenum=17152, offset=81558951
I0601 08:01:43.495364  4928 pika_partition.cc:385] db0 create new backup finished.
I0601 08:01:43.495504  4928 pika_server.cc:1085] Partition: db0 Start Send files in /mnt/pika/dump/20210601/db0 to 10.190.64.249
I0601 08:06:17.693099 14006 pika_stable_log.cc:151] /mnt/pika/log/log_db0/ Success purge 1 binlog file
I0601 08:19:41.820076 13953 pika_repl_server_thread.cc:29] ServerThread Close Slave Conn, fd: 4428, ip_port: 10.190.64.249:60740
I0601 08:19:41.820185 13953 pika_server.cc:740] Delete Slave Success, ip_port: 10.190.64.249:9222
I0601 08:19:41.820219 13953 pika_rm.cc:90] Remove Slave Node, Partition: (db0:0), ip_port: 10.190.64.249:9222
I0601 08:21:08.628018 13954 pika_repl_server_conn.cc:42] Receive MetaSync, Slave ip: 10.190.64.249, Slave port:9222
I0601 08:21:08.628120 13954 pika_server.cc:843] Add New Slave, 10.190.64.249:9222
I0601 08:21:08.767333 13955 pika_repl_server_conn.cc:108] Receive Trysync, Slave ip: 10.190.64.249, Slave port:9222, Partition: db0, filenum: 0, pro_offset: 0
I0601 08:21:08.767453 13955 pika_repl_server_conn.cc:263] Partition: db0 binlog has been purged, may need full sync
I0601 08:21:08.867314 13956 pika_repl_server_conn.cc:324] Handle partition DBSync Request
I0601 08:21:08.867431 13956 pika_rm.cc:79] Add Slave Node, partition: (db0:0), ip_port: 10.190.64.249:9222
I0601 08:21:08.867488 13956 pika_repl_server_conn.cc:347] Partition: db0_0 Handle DBSync Request Success, Session: 3
I0601 08:21:09.556449  4928 pika_partition.cc:376] db0 after prepare bgsave
I0601 08:21:09.563127  4928 pika_partition.cc:379] db0 bgsave_info: path=/mnt/pika/dump/20210601/db0,  filenum=17153, offset=76849026
I0601 08:21:09.849999  4928 pika_partition.cc:385] db0 create new backup finished.
I0601 08:21:09.854199  4928 pika_server.cc:1085] Partition: db0 Start Send files in /mnt/pika/dump/20210601/db0 to 10.190.64.249

Slave node log, pika.INFO

I0601 08:21:08.626871 703054 pika_server.cc:273] Pika Server going to start
I0601 08:21:08.627166 703121 pika_repl_client.cc:146] Try Send Meta Sync Request to Master (10.190.64.234:9222)
I0601 08:21:08.628015 703056 pika_server.cc:618] Mark try connect finish
I0601 08:21:08.628031 703056 pika_repl_client_conn.cc:146] Finish to handle meta sync response
I0601 08:21:08.767128 703057 pika_repl_client_conn.cc:261] Partition: db0 Need To Try DBSync
I0601 08:21:08.867218 703058 pika_repl_client_conn.cc:182] Partition: db0 Need Wait To Sync

The slave node printed no WARNING or FATAL logs.

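Following earlier issues, I checked the rsync version and the nproc limit (ulimit -u) on both sides: it is 63458 on the master node and 256522 on the slave node. How should I go about troubleshooting this?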
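
The nproc values quoted above are what `ulimit -u` reports in a shell. As a minimal sketch, the same limit can be read with Python's standard `resource` module on Linux. The helper name `nproc_limit` is made up for this example, and it reports the limit of the Python process itself, which may differ from the limit applied to an already running pika or rsync process.

```python
import resource


def nproc_limit():
    """Return the (soft, hard) RLIMIT_NPROC pair for the current process.

    A value equal to resource.RLIM_INFINITY means "unlimited".
    """
    return resource.getrlimit(resource.RLIMIT_NPROC)


if __name__ == "__main__":
    soft, hard = nproc_limit()
    print(f"nproc soft={soft} hard={hard}")
```

Running it as the same user that starts pika, on both nodes, gives numbers comparable to the `ulimit -u` output.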

datavisorchengpeng · Jun 01 '21 00:06

nproc

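Hi, did you ever solve the problem above? I am hitting the same issue and have tried many approaches. If you solved it, could you share the solution? Thanks!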

huzuxing · Sep 06 '21 06:09

The slave node has to connect to port 11221 on the master node, so the master node must be listening on that port.
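
One way to test this suggestion is to check from one node whether a TCP port on the other accepts connections. The following is a minimal sketch using only Python's standard library. The function name `port_is_open` is illustrative, and which address to probe is an assumption to adapt to your own cluster: 11221 is the port named in this comment, while 10.190.64.249:10222 is the rsync target shown in the master's WARNING log.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example calls (addresses come from this thread; replace with your own):
#   port_is_open("10.190.64.234", 11221)   # run on the slave, probing the master
#   port_is_open("10.190.64.249", 10222)   # run on the master, probing the slave
```

A `False` result only says the connection could not be made from that machine. It does not distinguish a firewall rule from a process that is not listening.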

dice2019 · Dec 04 '21 01:12

nproc

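Hi, did you ever solve the problem above? I am hitting the same issue and have tried many approaches. If you solved it, could you share the solution? Thanks!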

Shut down the master pika process, kill the rsync process, delete the dbsync directory, and restart pika. Doing this solved it for me.
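
The cleanup step can be sketched as follows. This is a minimal sketch under stated assumptions, not an official pika procedure: `clear_directory` is a hypothetical helper, the dbsync path is not given in this thread and must be supplied by the operator, and stopping pika, killing rsync, and restarting pika are left to whatever process management is in use. The sketch empties the directory rather than removing it, and defaults to a dry run so nothing is deleted by accident.

```python
import shutil
from pathlib import Path


def clear_directory(path, dry_run=True):
    """Delete everything inside `path` but keep the directory itself.

    Returns the entries that were (or, with dry_run=True, would be) removed.
    """
    root = Path(path)
    entries = sorted(root.iterdir())
    if not dry_run:
        for entry in entries:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)   # real subdirectory: remove recursively
            else:
                entry.unlink()         # regular file or symlink
    return entries
```

Call it once with the default `dry_run=True` to review what would be removed, then again with `dry_run=False` only after pika and rsync have been stopped.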

datavisorchengpeng · Sep 02 '22 08:09

I have the same problem, exactly the same. I hope the maintainers can provide a solution.

huadaonan · Nov 19 '22 17:11