Sync is always stuck at hand shaking
Issue Description
Syncing data from a cluster-mode Redis to AWS ElastiCache. The source is a self-hosted Redis deployment.
2024-04-12 09:55:37 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-1, hand shaking
2024-04-12 09:55:42 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-2, hand shaking
2024-04-12 09:55:47 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-0, hand shaking
2024-04-12 09:55:52 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-1, hand shaking
2024-04-12 09:55:57 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-2, hand shaking
2024-04-12 09:56:02 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-0, hand shaking
2024-04-12 09:56:07 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-1, hand shaking
2024-04-12 09:56:12 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-2, hand shaking
2024-04-12 09:56:17 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-0, hand shaking
2024-04-12 09:56:22 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-1, hand shaking
2024-04-12 09:56:27 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-2, hand shaking
2024-04-12 09:56:32 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-0, hand shaking
2024-04-12 09:56:37 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-1, hand shaking
2024-04-12 09:56:42 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-2, hand shaking
2024-04-12 09:56:47 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-0, hand shaking
2024-04-12 09:56:52 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-1, hand shaking
2024-04-12 09:56:57 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-2, hand shaking
2024-04-12 09:57:02 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-0, hand shaking
2024-04-12 09:57:07 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-1, hand shaking
2024-04-12 09:57:12 INF read_count=[0], read_ops=[0.00], write_count=[0], write_ops=[0.00], src-2, hand shaking
Environment
- RedisShake Version: 4.0.5
- Redis Source Version: 6.2.7
- Redis Destination Version: 6.2.6
- Redis Deployment (standalone/cluster/sentinel): cluster
- Deployed on Cloud Provider: No
Logs
Additional Information
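I ran into this problem as well. As far as I can tell, after PSYNC is sent no +<reply> is ever received, so the reader stays stuck in the loop even though the source has actually finished its BGSAVE. To reproduce: start a sync against a fairly large instance, then quickly restart the sync before the source has finished BGSAVE (in theory, manually triggering BGSAVE right before syncing should reproduce it too). My current workaround is to wait for rdb_bgsave_in_progress:0 before sending PSYNC, and to add a timer after PSYNC that panics if no reply arrives within 30 minutes, so that the sync is started over.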
https://github.com/tair-opensource/RedisShake/blob/v4/internal/reader/sync_standalone_reader.go#L136