DM 6.5: resource exhausted

Open xc1989xc opened this issue 1 year ago • 6 comments

What did you do?

Replicating MySQL to TiDB. The source instance has 400+ schemas and 25,000+ tables.

What did you expect to see?

No response

What did you see instead?

Starting the task fails with: ERROR] [subtask.go:218] ["fail to initialize subtask"] [subtask=aaa] [error="[code=42501:class=ha:scope=internal:level=high], Message: fail to initialize unit Sync of subtask aaa: fail to do etcd txn operation: txn commit failed, RawCause: rpc error: code = ResourceExhausted desc = trying to send message larger than max (2157479 vs. 2097152), Workaround: Please check dm-master's node status and the network between this node and dm-master"]

Versions of the cluster

Release Version: v6.5.0 Git Commit Hash: 9e91cff866d240ab6c1737680c17f5c5d0586911 Git Branch: heads/refs/tags/v6.5.0 UTC Build Time: 2022-12-23 08:44:26 Go Version: go version go1.19.3 linux/amd64 Failpoint Build: false

Release Version: v6.5.0 Git Commit Hash: 9e91cff866d240ab6c1737680c17f5c5d0586911 Git Branch: heads/refs/tags/v6.5.0 UTC Build Time: 2022-12-23 08:44:26 Go Version: go version go1.19.3 linux/amd64 Failpoint Build: false

current status of DM cluster (execute query-status <task-name> in dmctl)

"stage": "Paused",
"unit": "InvalidUnit",
"result": {
  "isCanceled": false,
  "errors": [
    {
      "ErrCode": 42501,
      "ErrClass": "ha",
      "ErrScope": "internal",
      "ErrLevel": "high",
      "Message": "fail to initialize unit Sync of subtask aaa: fail to do etcd txn operation: txn commit failed",
      "RawCause": "rpc error: code = ResourceExhausted desc = trying to send message larger than max (2157479 vs. 2097152)",
      "Workaround": "Please check dm-master's node status and the network between this node and dm-master"
    }
  ],
  "detail": null

xc1989xc avatar Jan 06 '24 21:01 xc1989xc

can you provide the task configuration?

GMHDBJD avatar Jan 08 '24 07:01 GMHDBJD

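The configuration is definitely fine. If it were wrong, the check would not report this error. My requirement is to migrate the whole instance. Because this is incremental replication, I only configured filters:

filters: # set of binlog event filter rules for matching tables of the upstream database instance
  filter-rule-1: # rule name
    schema-pattern: "*"
    events: ["all"] # which event types to match
    action: Do # whether to migrate (Do) or ignore (Ignore) binlog events that match the rule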

xc1989xc avatar Jan 08 '24 09:01 xc1989xc

Is anyone still maintaining this project?

xc1989xc avatar Jan 15 '24 02:01 xc1989xc

// Add the call option grpc.WithDefaultCallOptions(grpc.MaxCallRecvMsgSize(recvSize))
grpc.Dial(host, grpc.WithInsecure(), grpc.WithDefaultCallOptions(grpc.MaxCallRecvMsgSize(recvSize)))

When the server reports the error:

// Same idea: set the send and receive message sizes
var options = []grpc.ServerOption{
	grpc.MaxRecvMsgSize(recvSize),
	grpc.MaxSendMsgSize(sendSize),
}
s := grpc.NewServer(options...)

// Initialize the etcd client
func (ec *EtcdCliV3) Init(cfg *EtcdCliConf) (err error) {
	dialTimtout := cfg.DialTimeout
	if dialTimtout == 0 {
		dialTimtout = DEFAULT_DIAL_TIMOUT
	}
	etcdConfig := clientv3.Config{
		Endpoints:          cfg.Endpoints,
		DialTimeout:        dialTimtout,
		Username:           cfg.Username,
		Password:           cfg.Password,
		DialOptions:        []grpc.DialOption{grpc.WithBlock()},
		MaxCallSendMsgSize: 4 * 1024 * 1024, // raise the client send limit to 4 MiB
	}
	if ec.client, err = clientv3.New(etcdConfig); err != nil {
		err = fmt.Errorf("init etcd cli fail, err: %v", err)
		return
	}
	return
}
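To sanity-check the numbers, here is a minimal sketch using only the standard library. It pulls the two sizes out of the RawCause string quoted in this issue and compares the attempted size against the 4 * 1024 * 1024 limit from the snippet above. `parseSizes` and `proposedLimit` are hypothetical names introduced for illustration; they are not part of DM, etcd, or gRPC.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// sizePattern matches the "(<actual> vs. <max>)" part of the gRPC
// ResourceExhausted message quoted in this issue.
var sizePattern = regexp.MustCompile(`larger than max \((\d+) vs\. (\d+)\)`)

// parseSizes extracts the attempted message size and the limit, in bytes.
// Hypothetical helper for illustration only.
func parseSizes(msg string) (actual, limit int, err error) {
	m := sizePattern.FindStringSubmatch(msg)
	if m == nil {
		return 0, 0, fmt.Errorf("no size pair found in %q", msg)
	}
	if actual, err = strconv.Atoi(m[1]); err != nil {
		return 0, 0, err
	}
	if limit, err = strconv.Atoi(m[2]); err != nil {
		return 0, 0, err
	}
	return actual, limit, nil
}

func main() {
	// RawCause copied from the query-status output in this issue.
	const rawCause = "rpc error: code = ResourceExhausted desc = trying to send message larger than max (2157479 vs. 2097152)"
	// Assumed new limit, taken from the MaxCallSendMsgSize value above.
	const proposedLimit = 4 * 1024 * 1024

	actual, limit, err := parseSizes(rawCause)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("limit is %d MiB, exceeded by %d bytes\n", limit/(1024*1024), actual-limit)
	fmt.Printf("fits under proposed limit: %v\n", actual <= proposedLimit)
}
```

By this arithmetic the transaction is only 60327 bytes over the 2 MiB limit, so a 4 MiB send limit would cover this particular task. Whether that leaves enough headroom for larger instances is not something the numbers in this issue can show.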

xc1989xc avatar Jan 15 '24 02:01 xc1989xc

We will fix it in v8.0 and cherry-pick the fix to v6.5.8.

GMHDBJD avatar Jan 15 '24 05:01 GMHDBJD

/severity moderate

fubinzh avatar Jan 22 '24 01:01 fubinzh