RedisShake

Out-of-memory problem during SYNC synchronization

Open niduyi opened this issue 1 year ago • 6 comments

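Problem description: I am using sync together with swap_db.lua to synchronize a specific db, but an out-of-memory error occurs. Source instance used_memory_human: 7G; RDB file: 3G. The target is configured with a maximum of 9G of usable memory.

redis-shake log: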

2023-05-30 10:17:16 INF RDB resize db. db_size=[1033634], expire_size=[1031535]
2023-05-30 10:17:19 INF syncing rdb. percent=[0.14]%, allowOps=[0.00], disallowOps=[14383.40], entryId=[71916], InQueueEntriesCount=[0], unansweredBytesCount=[0]bytes, rdbFileSize=[3.367]G, rdbSendSize=[0.005]G
2023-05-30 10:17:24 INF syncing rdb. percent=[0.14]%, allowOps=[0.00], disallowOps=[0.00], entryId=[71916], InQueueEntriesCount=[0], unansweredBytesCount=[0]bytes, rdbFileSize=[3.367]G, rdbSendSize=[0.005]G
2023-05-30 10:17:29 INF syncing rdb. percent=[0.14]%, allowOps=[0.00], disallowOps=[0.00], entryId=[71916], InQueueEntriesCount=[0], unansweredBytesCount=[0]bytes, rdbFileSize=[3.367]G, rdbSendSize=[0.005]G
2023-05-30 10:17:34 INF syncing rdb. percent=[0.14]%, allowOps=[0.00], disallowOps=[0.00], entryId=[71916], InQueueEntriesCount=[0], unansweredBytesCount=[0]bytes, rdbFileSize=[3.367]G, rdbSendSize=[0.005]G
fatal error: runtime: out of memory

runtime stack:
runtime.throw({0x7fb067?, 0x0?})
    runtime/panic.go:1047 +0x5d fp=0x7f30efa7bcc8 sp=0x7f30efa7bc98 pc=0x43c1bd
runtime.sysMapOS(0xc0b1400000, 0x400000?)
    runtime/mem_linux.go:187 +0x11b fp=0x7f30efa7bd10 sp=0x7f30efa7bcc8 pc=0x41d47b
runtime.sysMap(0x4321da?, 0xcd4210?, 0x1?)
    runtime/mem.go:142 +0x35 fp=0x7f30efa7bd40 sp=0x7f30efa7bd10 pc=0x41ce55
runtime.(*mheap).grow(0xcd4200, 0x2000?)
    runtime/mheap.go:1468 +0x23d fp=0x7f30efa7bdb0 sp=0x7f30efa7bd40 pc=0x42da1d
runtime.(*mheap).allocSpan(0xcd4200, 0x2, 0x0, 0x4b)
    runtime/mheap.go:1199 +0x1be fp=0x7f30efa7be48 sp=0x7f30efa7bdb0 pc=0x42d15e
runtime.(*mheap).alloc.func1()
    runtime/mheap.go:918 +0x65 fp=0x7f30efa7be90 sp=0x7f30efa7be48 pc=0x42cbe5
runtime.systemstack()
    runtime/asm_amd64.s:492 +0x49 fp=0x7f30efa7be98 sp=0x7f30efa7be90 pc=0x46db29

goroutine 19 [running]:
runtime.systemstack_switch()
    runtime/asm_amd64.s:459 fp=0xc000058860 sp=0xc000058858 pc=0x46dac0
runtime.(*mheap).alloc(0x7f30d97bbf00?, 0x435e70?, 0x10?)
    runtime/mheap.go:912 +0x65 fp=0xc0000588a8 sp=0xc000058860 pc=0x42cb25
runtime.(*mcentral).grow(0x4000?)
    runtime/mcentral.go:244 +0x5b fp=0xc0000588f0 sp=0xc0000588a8 pc=0x41c73b
runtime.(*mcentral).cacheSpan(0xce7c80)
    runtime/mcentral.go:164 +0x306 fp=0xc000058948 sp=0xc0000588f0 pc=0x41c586
runtime.(*mcache).refill(0x7f31198f4f18, 0x4b?)
    runtime/mcache.go:181 +0x152 fp=0xc000058988 sp=0xc000058948 pc=0x41bc52
runtime.(*mcache).nextFree(0x7f31198f4f18, 0x4b)
    runtime/malloc.go:819 +0x85 fp=0xc0000589d0 sp=0xc000058988 pc=0x411b05
runtime.mallocgc(0x624, 0x78b1c0, 0x1)
    runtime/malloc.go:1018 +0x4c8 fp=0xc000058a48 sp=0xc0000589d0 pc=0x412168
runtime.makeslice(0x6aa601?, 0xc000058ac8?, 0x6aa075?)
    runtime/slice.go:103 +0x52 fp=0xc000058a70 sp=0xc000058a48 pc=0x453972
github.com/alibaba/RedisShake/internal/rdb/structure.lzfDecompress({0xc0b13f3680, 0x444, 0x20?}, 0x624)
    github.com/alibaba/RedisShake/internal/rdb/structure/string.go:46 +0x3c fp=0xc000058ad8 sp=0xc000058a70 pc=0x6ab6fc
github.com/alibaba/RedisShake/internal/rdb/structure.ReadString({0x895100, 0xc00052c880})
    github.com/alibaba/RedisShake/internal/rdb/structure/string.go:37 +0x17c fp=0xc000058b40 sp=0xc000058ad8 pc=0x6ab5fc
github.com/alibaba/RedisShake/internal/rdb/types.(*HashObject).readHash(0xc000012ff0, {0x895100, 0xc00052c880})
    github.com/alibaba/RedisShake/internal/rdb/types/hash.go:35 +0x7e fp=0xc000058ba0 sp=0xc000058b40 pc=0x6ac55e
github.com/alibaba/RedisShake/internal/rdb/types.(*HashObject).LoadFromBuffer(0xc000012ff0, {0x895100, 0xc00052c880}, {0xc000028168?, 0x20?}, 0x4)
    github.com/alibaba/RedisShake/internal/rdb/types/hash.go:19 +0xc5 fp=0xc000058be8 sp=0xc000058ba0 pc=0x6ac405
github.com/alibaba/RedisShake/internal/rdb/types.ParseObject({0x895100, 0xc00052c880}, 0x4, {0xc000028168, 0x15})
    github.com/alibaba/RedisShake/internal/rdb/types/interface.go:87 +0x1bc fp=0xc000058c68 sp=0xc000058be8 pc=0x6acc7c
github.com/alibaba/RedisShake/internal/rdb.(*Loader).parseRDBEntry(0xc000130a80, 0xc00008e1e0)
    github.com/alibaba/RedisShake/internal/rdb/rdb.go:156 +0x7ab fp=0xc000058e00 sp=0xc000058c68 pc=0x7146ab
github.com/alibaba/RedisShake/internal/rdb.(*Loader).ParseRDB(0xc000130a80)
    github.com/alibaba/RedisShake/internal/rdb/rdb.go:85 +0x3f7 fp=0xc000058f38 sp=0xc000058e00 pc=0x713cd7
github.com/alibaba/RedisShake/internal/reader.(*psyncReader).sendRDB(0xc00007e0a0)
    github.com/alibaba/RedisShake/internal/reader/psync.go:214 +0xbd fp=0xc000058fa8 sp=0xc000058f38 pc=0x716f9d
github.com/alibaba/RedisShake/internal/reader.(*psyncReader).StartRead.func1()
    github.com/alibaba/RedisShake/internal/reader/psync.go:49 +0xfa fp=0xc000058fe0 sp=0xc000058fa8 pc=0x715e7a
runtime.goexit()
    runtime/asm_amd64.s:1594 +0x1 fp=0xc000058fe8 sp=0xc000058fe0 pc=0x46fce1
created by github.com/alibaba/RedisShake/internal/reader.(*psyncReader).StartRead
    github.com/alibaba/RedisShake/internal/reader/psync.go:43 +0x8d


Source Redis version: master-replica with Sentinel, version 3.2.1

Target Redis version: master-replica with Sentinel, version 5.0

niduyi, May 30 '23 02:05

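If the source holds 7 GB, does the target need twice the source's memory? Also, I am synchronizing only one specific DB, which may hold only about 100 MB. Does the target still need twice the source's memory in that case?

niduyi, May 30 '23 02:05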

It is shake itself that ran out of memory (OOM); the target has nothing to do with it. How much memory is available to shake?

suxb201, May 31 '23 03:05
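One way to answer the question above on a Linux host is to read the MemAvailable field from /proc/meminfo. The following is a minimal sketch and not part of RedisShake; the function name memAvailableKiB and the sample text are made up for illustration. It assumes the process is not confined by a container or cgroup memory limit, in which case that limit matters more than this figure.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// memAvailableKiB extracts the MemAvailable value (in KiB) from text in
// /proc/meminfo format. Hypothetical helper, for illustration only.
func memAvailableKiB(meminfo string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(meminfo))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "MemAvailable:" {
			return strconv.ParseUint(fields[1], 10, 64)
		}
	}
	return 0, fmt.Errorf("MemAvailable not found")
}

func main() {
	// Sample text; on a real host, load it with os.ReadFile("/proc/meminfo").
	sample := "MemTotal:       16384000 kB\n" +
		"MemFree:         1024000 kB\n" +
		"MemAvailable:    8192000 kB\n"

	kib, err := memAvailableKiB(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("available: %.1f GiB\n", float64(kib)/(1024*1024))
}
```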

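> It is shake itself that ran out of memory (OOM); the target has nothing to do with it. How much memory is available to shake?

A question: when restoring data to a target cluster from an RDB backup file with shake v3, say the RDB file is 500 MB, how much memory does the shake process need, and how many CPU cores at minimum? I would like to allocate resources to shake sensibly without it running out of memory.

songdechao, Mar 01 '24 09:03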

@songdechao Build it yourself from the latest code. That resolves this problem, because memory usage has been optimized.

suxb201, Mar 01 '24 09:03

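> @songdechao Build it yourself from the latest code. That resolves this problem, because memory usage has been optimized.

Oh, so that is v4? Also, can I now estimate roughly how much memory shake needs from the RDB size? For example, with a 500 MB RDB, is 2 GB enough for the shake process?

songdechao, Mar 01 '24 09:03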

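@songdechao It is hard to estimate. If the data contains no very large lists, zsets, or similar structures, that should be about right. If it does, memory usage can inflate severely.

suxb201, Mar 01 '24 09:03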