tiflow icon indicating copy to clipboard operation
tiflow copied to clipboard

ticdc restart after network partition recover from one of ticdc

Open Lily2025 opened this issue 1 year ago • 6 comments

What did you do?

1、run workload storage:"s3://benchmark/sysbench_64_7000w" subType:"oltp_read_write" db:"sysbench_64_7000w" tableNum:64 tableSize:70000000 threads:32 ignoreErrors:"2013,1213,1105,1205,8022,8028,9004,9007,1062" 2、inject network partition between one of ticdc and other pods

What did you expect to see?

What did you see instead?

1、one of ticdc restart

[2024/02/28 19:08:37.410 +08:00] [ERROR] [tso_dispatcher.go:562] ["[tso] update connection contexts failed"] [dc=global] [error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp [10.200.49.116:2379](http://10.200.49.116:2379/): i/o timeout\""]
[2024/02/28 19:08:37.410 +08:00] [ERROR] [pd.go:228] ["updateTS error"] [txnScope=global] [error="context canceled"] [errorVerbose="context canceled\[ngithub.com/tikv/pd/client.(*tsoRequest).Wait\n\tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:118\ngithub.com/tikv/pd/client.(*client).GetTS\n\tgithub.com/tikv/pd/[email protected]/client.go:803\ngithub.com/tikv/client-go/v2/util.InterceptedPDClient.GetTS\n\tgithub.com/tikv/client-go/[email protected]/util/pd_interceptor.go:81\ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).getTimestamp\n\tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:147\ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).updateTS.func1\n\tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:226\nsync.(*Map).Range\n\tsync/map.go:476\ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).updateTS\n\tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:224\nruntime.goexit\n\truntime/asm_amd64.s:1650\ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).getTimestamp\n\tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:152\ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).updateTS.func1\n\tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:226\nsync.(*Map).Range\n\tsync/map.go:476\ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).updateTS\n\tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:224\nruntime.goexit\n\truntime/asm_amd64.s:1650](http://ngithub.com/tikv/pd/client.(*tsoRequest).Wait/n/tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:118/ngithub.com/tikv/pd/client.(*client).GetTS/n/tgithub.com/tikv/pd/[email protected]/client.go:803/ngithub.com/tikv/client-go/v2/util.InterceptedPDClient.GetTS/n/tgithub.com/tikv/client-go/[email protected]/util/pd_interceptor.go:81/ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).getTimestamp/n/tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:147/ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).updateTS.func1/n/tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:226/nsync.(*Map).Range/n/tsync/map.go:476/ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).updateTS/n/tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:224/nruntime.goexit/n/truntime/asm_amd64.s:1650/ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).getTimestamp/n/tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:152/ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).updateTS.func1/n/tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:226/nsync.(*Map).Range/n/tsync/map.go:476/ngithub.com/tikv/client-go/v2/oracle/oracles.(*pdOracle).updateTS/n/tgithub.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:224/nruntime.goexit/n/truntime/asm_amd64.s:1650)"]
[2024/02/28 19:08:37.410 +08:00] [INFO] [tso_dispatcher.go:344] ["[tso] exit tso dispatcher"] [dc-location=global]
[2024/02/28 19:08:37.410 +08:00] [INFO] [tso_client.go:139] ["close tso client"]
[2024/02/28 19:08:37.410 +08:00] [INFO] [tso_client.go:150] ["tso client is closed"]
[2024/02/28 19:08:37.410 +08:00] [INFO] [pd_service_discovery.go:664] ["[pd] close pd service discovery client"]
[2024/02/28 19:08:37.410 +08:00] [INFO] [client.go:319] ["[pd] http client closed"] [source=tikv-driver]
[2024/02/28 19:08:37.413 +08:00] [WARN] [upstream.go:299] ["etcd session close failed"] [error="etcdserver: requested lease not found"]
[2024/02/28 19:08:37.413 +08:00] [INFO] [upstream.go:305] ["upstream closed"] [upstreamID=7340490029962833542]
[2024/02/28 19:08:38.370 +08:00] [ERROR] [pd_service_discovery.go:613] ["[pd] failed to update service mode"] [urls="[http://tc-pd-0.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379]"] [error="[PD:client:ErrClientGetClusterInfo]error:rpc error: code = DeadlineExceeded desc = context deadline exceeded target:[tc-pd-2.tc](http://tc-pd-2.tc/)-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379 status:READY: error:rpc error: code = DeadlineExceeded desc = context deadline exceeded target:[tc-pd-2.tc](http://tc-pd-2.tc/)-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379 status:READY"]
[2024/02/28 19:08:38.379 +08:00] [WARN] [server.go:315] ["etcd health check: cannot collect all members"] [error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"] [errorVerbose="rpc error: code = DeadlineExceeded desc = context deadline exceeded\[ngithub.com/tikv/pd/client.(*client).respForErr\n\tgithub.com/tikv/pd/[email protected]/client.go:1550\ngithub.com/tikv/pd/client.(*client).GetAllMembers\n\tgithub.com/tikv/pd/[email protected]/client.go:735\ngithub.com/pingcap/tiflow/pkg/pdutil.(*pdAPIClient).CollectMemberEndpoints\n\tgithub.com/pingcap/tiflow/pkg/pdutil/api_client.go:346\ngithub.com/pingcap/tiflow/cdc/server.(*server).upstreamPDHealthChecker\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:313\ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func1\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:347\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\truntime/asm_amd64.s:1650](http://ngithub.com/tikv/pd/client.(*client).respForErr/n/tgithub.com/tikv/pd/[email protected]/client.go:1550/ngithub.com/tikv/pd/client.(*client).GetAllMembers/n/tgithub.com/tikv/pd/[email protected]/client.go:735/ngithub.com/pingcap/tiflow/pkg/pdutil.(*pdAPIClient).CollectMemberEndpoints/n/tgithub.com/pingcap/tiflow/pkg/pdutil/api_client.go:346/ngithub.com/pingcap/tiflow/cdc/server.(*server).upstreamPDHealthChecker/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:313/ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func1/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:347/ngolang.org/x/sync/errgroup.(*Group).Go.func1/n/tgolang.org/x/[email protected]/errgroup/errgroup.go:78/nruntime.goexit/n/truntime/asm_amd64.s:1650)"]
[2024/02/28 19:08:47.414 +08:00] [WARN] [check.go:88] ["check TiKV version failed"] [error="[CDC:ErrGetAllStoresFailed]get stores from pd failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded"] [errorVerbose="[CDC:ErrGetAllStoresFailed]get stores from pd failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded\[ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/[email protected]/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/[email protected]/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/errors.WrapError\n\tgithub.com/pingcap/tiflow/pkg/errors/helper.go:34\ngithub.com/pingcap/tiflow/pkg/version.CheckStoreVersion\n\tgithub.com/pingcap/tiflow/pkg/version/check.go:209\ngithub.com/pingcap/tiflow/pkg/version.CheckClusterVersion\n\tgithub.com/pingcap/tiflow/pkg/version/check.go:83\ngithub.com/pingcap/tiflow/pkg/upstream.initUpstream\n\tgithub.com/pingcap/tiflow/pkg/upstream/upstream.go:179\ngithub.com/pingcap/tiflow/pkg/upstream.(*Manager).AddDefaultUpstream\n\tgithub.com/pingcap/tiflow/pkg/upstream/manager.go:116\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).reset\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:250\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).run\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:333\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).Run\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:308\ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func6\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:372\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\truntime/asm_amd64.s:1650](http://ngithub.com/pingcap/errors.AddStack/n/tgithub.com/pingcap/[email protected]/errors.go:174/ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs/n/tgithub.com/pingcap/[email protected]/normalize.go:164/ngithub.com/pingcap/tiflow/pkg/errors.WrapError/n/tgithub.com/pingcap/tiflow/pkg/errors/helper.go:34/ngithub.com/pingcap/tiflow/pkg/version.CheckStoreVersion/n/tgithub.com/pingcap/tiflow/pkg/version/check.go:209/ngithub.com/pingcap/tiflow/pkg/version.CheckClusterVersion/n/tgithub.com/pingcap/tiflow/pkg/version/check.go:83/ngithub.com/pingcap/tiflow/pkg/upstream.initUpstream/n/tgithub.com/pingcap/tiflow/pkg/upstream/upstream.go:179/ngithub.com/pingcap/tiflow/pkg/upstream.(*Manager).AddDefaultUpstream/n/tgithub.com/pingcap/tiflow/pkg/upstream/manager.go:116/ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).reset/n/tgithub.com/pingcap/tiflow/cdc/capture/capture.go:250/ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).run/n/tgithub.com/pingcap/tiflow/cdc/capture/capture.go:333/ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).Run/n/tgithub.com/pingcap/tiflow/cdc/capture/capture.go:308/ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func6/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:372/ngolang.org/x/sync/errgroup.(*Group).Go.func1/n/tgolang.org/x/[email protected]/errgroup/errgroup.go:78/nruntime.goexit/n/truntime/asm_amd64.s:1650)"]
[2024/02/28 19:08:47.427 +08:00] [INFO] [pd_service_discovery.go:1016] ["[pd] update member urls"] [old-urls="[http://tc-pd-0.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd:2379]"] [new-urls="[http://tc-pd-0.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379]"]
[2024/02/28 19:08:47.427 +08:00] [INFO] [pd_service_discovery.go:1043] ["[pd] switch leader"] [new-leader=http://tc-pd-2.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379/] [old-leader=]
[2024/02/28 19:08:47.427 +08:00] [INFO] [pd_service_discovery.go:525] ["[pd] init cluster id"] [cluster-id=7340490029962833542]
[2024/02/28 19:08:47.427 +08:00] [INFO] [client.go:606] ["[pd] changing service mode"] [old-mode=UNKNOWN_SVC_MODE] [new-mode=PD_SVC_MODE]
[2024/02/28 19:08:47.427 +08:00] [INFO] [tso_client.go:231] ["[tso] switch dc tso global allocator serving address"] [dc-location=global] [new-address=http://tc-pd-2.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379/]
[2024/02/28 19:08:47.428 +08:00] [INFO] [tso_dispatcher.go:323] ["[tso] tso dispatcher created"] [dc-location=global]
[2024/02/28 19:08:47.428 +08:00] [INFO] [client.go:654] ["[pd] service mode changed"] [old-mode=UNKNOWN_SVC_MODE] [new-mode=PD_SVC_MODE]
[2024/02/28 19:08:47.429 +08:00] [INFO] [tikv_driver.go:200] ["using API V1."]
[2024/02/28 19:08:47.429 +08:00] [INFO] [tso_dispatcher.go:441] ["[tso] tso stream is not ready"] [dc=global]
[2024/02/28 19:08:48.371 +08:00] [ERROR] [pd_service_discovery.go:613] ["[pd] failed to update service mode"] [urls="[http://tc-pd-0.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379]"] [error="[PD:client:ErrClientGetClusterInfo]error:rpc error: code = DeadlineExceeded desc = context deadline exceeded target:[tc-pd-2.tc](http://tc-pd-2.tc/)-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379 status:READY: error:rpc error: code = DeadlineExceeded desc = context deadline exceeded target:[tc-pd-2.tc](http://tc-pd-2.tc/)-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379 status:READY"]
[2024/02/28 19:08:48.380 +08:00] [WARN] [server.go:315] ["etcd health check: cannot collect all members"] [error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"] [errorVerbose="rpc error: code = DeadlineExceeded desc = context deadline exceeded\[ngithub.com/tikv/pd/client.(*client).respForErr\n\tgithub.com/tikv/pd/[email protected]/client.go:1550\ngithub.com/tikv/pd/client.(*client).GetAllMembers\n\tgithub.com/tikv/pd/[email protected]/client.go:735\ngithub.com/pingcap/tiflow/pkg/pdutil.(*pdAPIClient).CollectMemberEndpoints\n\tgithub.com/pingcap/tiflow/pkg/pdutil/api_client.go:346\ngithub.com/pingcap/tiflow/cdc/server.(*server).upstreamPDHealthChecker\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:313\ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func1\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:347\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\truntime/asm_amd64.s:1650](http://ngithub.com/tikv/pd/client.(*client).respForErr/n/tgithub.com/tikv/pd/[email protected]/client.go:1550/ngithub.com/tikv/pd/client.(*client).GetAllMembers/n/tgithub.com/tikv/pd/[email protected]/client.go:735/ngithub.com/pingcap/tiflow/pkg/pdutil.(*pdAPIClient).CollectMemberEndpoints/n/tgithub.com/pingcap/tiflow/pkg/pdutil/api_client.go:346/ngithub.com/pingcap/tiflow/cdc/server.(*server).upstreamPDHealthChecker/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:313/ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func1/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:347/ngolang.org/x/sync/errgroup.(*Group).Go.func1/n/tgolang.org/x/[email protected]/errgroup/errgroup.go:78/nruntime.goexit/n/truntime/asm_amd64.s:1650)"]
[2024/02/28 19:08:57.429 +08:00] [ERROR] [tso_dispatcher.go:202] ["[tso] tso request is canceled due to timeout"] [dc-location=global] [error="[PD:client:ErrClientGetTSOTimeout]get TSO timeout"]
[2024/02/28 19:08:57.429 +08:00] [ERROR] [tso_dispatcher.go:498] ["[tso] getTS error after processing requests"] [dc-location=global] [stream-addr=http://tc-pd-2.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379/] [error="[PD:client:ErrClientGetTSO]get TSO failed, %v: rpc error: code = Canceled desc = context canceled"]
[2024/02/28 19:08:57.429 +08:00] [ERROR] [capture.go:335] ["reset capture failed"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\[ngithub.com/tikv/pd/client.(*pdTSOStream).processRequests\n\tgithub.com/tikv/pd/[email protected]/tso_stream.go:149\ngithub.com/tikv/pd/client.(*tsoClient).processRequests\n\tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:763\ngithub.com/tikv/pd/client.(*tsoClient).handleDispatcher\n\tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:488\nruntime.goexit\n\truntime/asm_amd64.s:1650\ngithub.com/tikv/pd/client.(*tsoRequest).Wait\n\tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:104\ngithub.com/tikv/pd/client.(*client).GetTS\n\tgithub.com/tikv/pd/[email protected]/client.go:803\ngithub.com/pingcap/tiflow/pkg/pdutil.NewClock\n\tgithub.com/pingcap/tiflow/pkg/pdutil/clock.go:62\ngithub.com/pingcap/tiflow/pkg/upstream.initUpstream\n\tgithub.com/pingcap/tiflow/pkg/upstream/upstream.go:197\ngithub.com/pingcap/tiflow/pkg/upstream.(*Manager).AddDefaultUpstream\n\tgithub.com/pingcap/tiflow/pkg/upstream/manager.go:116\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).reset\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:250\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).run\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:333\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).Run\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:308\ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func6\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:372\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\truntime/asm_amd64.s:1650](http://ngithub.com/tikv/pd/client.(*pdTSOStream).processRequests/n/tgithub.com/tikv/pd/[email protected]/tso_stream.go:149/ngithub.com/tikv/pd/client.(*tsoClient).processRequests/n/tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:763/ngithub.com/tikv/pd/client.(*tsoClient).handleDispatcher/n/tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:488/nruntime.goexit/n/truntime/asm_amd64.s:1650/ngithub.com/tikv/pd/client.(*tsoRequest).Wait/n/tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:104/ngithub.com/tikv/pd/client.(*client).GetTS/n/tgithub.com/tikv/pd/[email protected]/client.go:803/ngithub.com/pingcap/tiflow/pkg/pdutil.NewClock/n/tgithub.com/pingcap/tiflow/pkg/pdutil/clock.go:62/ngithub.com/pingcap/tiflow/pkg/upstream.initUpstream/n/tgithub.com/pingcap/tiflow/pkg/upstream/upstream.go:197/ngithub.com/pingcap/tiflow/pkg/upstream.(*Manager).AddDefaultUpstream/n/tgithub.com/pingcap/tiflow/pkg/upstream/manager.go:116/ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).reset/n/tgithub.com/pingcap/tiflow/cdc/capture/capture.go:250/ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).run/n/tgithub.com/pingcap/tiflow/cdc/capture/capture.go:333/ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).Run/n/tgithub.com/pingcap/tiflow/cdc/capture/capture.go:308/ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func6/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:372/ngolang.org/x/sync/errgroup.(*Group).Go.func1/n/tgolang.org/x/[email protected]/errgroup/errgroup.go:78/nruntime.goexit/n/truntime/asm_amd64.s:1650)"]
[2024/02/28 19:08:57.430 +08:00] [INFO] [capture.go:328] ["the capture routine has exited"]
[2024/02/28 19:08:57.430 +08:00] [WARN] [server.go:315] ["etcd health check: cannot collect all members"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\[ngithub.com/tikv/pd/client.(*client).respForErr\n\tgithub.com/tikv/pd/[email protected]/client.go:1550\ngithub.com/tikv/pd/client.(*client).GetAllMembers\n\tgithub.com/tikv/pd/[email protected]/client.go:735\ngithub.com/pingcap/tiflow/pkg/pdutil.(*pdAPIClient).CollectMemberEndpoints\n\tgithub.com/pingcap/tiflow/pkg/pdutil/api_client.go:346\ngithub.com/pingcap/tiflow/cdc/server.(*server).upstreamPDHealthChecker\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:313\ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func1\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:347\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\truntime/asm_amd64.s:1650](http://ngithub.com/tikv/pd/client.(*client).respForErr/n/tgithub.com/tikv/pd/[email protected]/client.go:1550/ngithub.com/tikv/pd/client.(*client).GetAllMembers/n/tgithub.com/tikv/pd/[email protected]/client.go:735/ngithub.com/pingcap/tiflow/pkg/pdutil.(*pdAPIClient).CollectMemberEndpoints/n/tgithub.com/pingcap/tiflow/pkg/pdutil/api_client.go:346/ngithub.com/pingcap/tiflow/cdc/server.(*server).upstreamPDHealthChecker/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:313/ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func1/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:347/ngolang.org/x/sync/errgroup.(*Group).Go.func1/n/tgolang.org/x/[email protected]/errgroup/errgroup.go:78/nruntime.goexit/n/truntime/asm_amd64.s:1650)"]
[2024/02/28 19:08:57.430 +08:00] [ERROR] [server.go:298] ["http server error"] [error="[CDC:ErrServeHTTP]serve http error: mux: server closed"] [errorVerbose="[CDC:ErrServeHTTP]serve http error: mux: server closed\[ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/[email protected]/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/[email protected]/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/errors.WrapError\n\tgithub.com/pingcap/tiflow/pkg/errors/helper.go:34\ngithub.com/pingcap/tiflow/cdc/server.(*server).startStatusHTTP.func1\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:298\nruntime.goexit\n\truntime/asm_amd64.s:1650](http://ngithub.com/pingcap/errors.AddStack/n/tgithub.com/pingcap/[email protected]/errors.go:174/ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs/n/tgithub.com/pingcap/[email protected]/normalize.go:164/ngithub.com/pingcap/tiflow/pkg/errors.WrapError/n/tgithub.com/pingcap/tiflow/pkg/errors/helper.go:34/ngithub.com/pingcap/tiflow/cdc/server.(*server).startStatusHTTP.func1/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:298/nruntime.goexit/n/truntime/asm_amd64.s:1650)"]
[2024/02/28 19:08:57.430 +08:00] [WARN] [server.go:139] ["cdc server exits with error"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\[ngithub.com/tikv/pd/client.(*pdTSOStream).processRequests\n\tgithub.com/tikv/pd/[email protected]/tso_stream.go:149\ngithub.com/tikv/pd/client.(*tsoClient).processRequests\n\tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:763\ngithub.com/tikv/pd/client.(*tsoClient).handleDispatcher\n\tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:488\nruntime.goexit\n\truntime/asm_amd64.s:1650\ngithub.com/tikv/pd/client.(*tsoRequest).Wait\n\tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:104\ngithub.com/tikv/pd/client.(*client).GetTS\n\tgithub.com/tikv/pd/[email protected]/client.go:803\ngithub.com/pingcap/tiflow/pkg/pdutil.NewClock\n\tgithub.com/pingcap/tiflow/pkg/pdutil/clock.go:62\ngithub.com/pingcap/tiflow/pkg/upstream.initUpstream\n\tgithub.com/pingcap/tiflow/pkg/upstream/upstream.go:197\ngithub.com/pingcap/tiflow/pkg/upstream.(*Manager).AddDefaultUpstream\n\tgithub.com/pingcap/tiflow/pkg/upstream/manager.go:116\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).reset\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:250\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).run\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:333\ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).Run\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:308\ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func6\n\tgithub.com/pingcap/tiflow/cdc/server/server.go:372\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/[email protected]/errgroup/errgroup.go:78\nruntime.goexit\n\truntime/asm_amd64.s:1650](http://ngithub.com/tikv/pd/client.(*pdTSOStream).processRequests/n/tgithub.com/tikv/pd/[email protected]/tso_stream.go:149/ngithub.com/tikv/pd/client.(*tsoClient).processRequests/n/tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:763/ngithub.com/tikv/pd/client.(*tsoClient).handleDispatcher/n/tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:488/nruntime.goexit/n/truntime/asm_amd64.s:1650/ngithub.com/tikv/pd/client.(*tsoRequest).Wait/n/tgithub.com/tikv/pd/[email protected]/tso_dispatcher.go:104/ngithub.com/tikv/pd/client.(*client).GetTS/n/tgithub.com/tikv/pd/[email protected]/client.go:803/ngithub.com/pingcap/tiflow/pkg/pdutil.NewClock/n/tgithub.com/pingcap/tiflow/pkg/pdutil/clock.go:62/ngithub.com/pingcap/tiflow/pkg/upstream.initUpstream/n/tgithub.com/pingcap/tiflow/pkg/upstream/upstream.go:197/ngithub.com/pingcap/tiflow/pkg/upstream.(*Manager).AddDefaultUpstream/n/tgithub.com/pingcap/tiflow/pkg/upstream/manager.go:116/ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).reset/n/tgithub.com/pingcap/tiflow/cdc/capture/capture.go:250/ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).run/n/tgithub.com/pingcap/tiflow/cdc/capture/capture.go:333/ngithub.com/pingcap/tiflow/cdc/capture.(*captureImpl).Run/n/tgithub.com/pingcap/tiflow/cdc/capture/capture.go:308/ngithub.com/pingcap/tiflow/cdc/server.(*server).run.func6/n/tgithub.com/pingcap/tiflow/cdc/server/server.go:372/ngolang.org/x/sync/errgroup.(*Group).Go.func1/n/tgolang.org/x/[email protected]/errgroup/errgroup.go:78/nruntime.goexit/n/truntime/asm_amd64.s:1650)"]
[2024/02/28 19:08:57.430 +08:00] [INFO] [capture.go:707] ["message router closed"] [captureID=a277c9b2-c0b6-4ef0-aa9d-3d51b50cd83f]
[2024/02/28 19:08:57.432 +08:00] [INFO] [server.go:424] ["sort engine manager closed"] [duration=2.032547ms]
[2024/02/28 19:08:57.432 +08:00] [INFO] [pd_service_discovery.go:577] ["[pd] exit member loop due to context canceled"]
[2024/02/28 19:08:57.432 +08:00] [INFO] [resource_manager_client.go:295] ["[resource manager] exit resource token dispatcher"]
[2024/02/28 19:08:57.432 +08:00] [INFO] [tso_dispatcher.go:240] ["exit tso dispatcher loop"]
[2024/02/28 19:08:57.432 +08:00] [INFO] [tso_dispatcher.go:410] ["[tso] stop fetching the pending tso requests due to context canceled"] [dc-location=global]
[2024/02/28 19:08:57.432 +08:00] [INFO] [tso_dispatcher.go:344] ["[tso] exit tso dispatcher"] [dc-location=global]
[2024/02/28 19:08:57.432 +08:00] [INFO] [tso_dispatcher.go:186] ["exit tso requests cancel loop"]
[2024/02/28 19:08:57.432 +08:00] [ERROR] [pd_service_discovery.go:613] ["[pd] failed to update service mode"] [urls="[http://tc-pd-0.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379]"] [error="[PD:client:ErrClientGetClusterInfo]error:rpc error: code = Canceled desc = context canceled target:[tc-pd-2.tc](http://tc-pd-2.tc/)-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379 status:READY: error:rpc error: code = Canceled desc = context canceled target:[tc-pd-2.tc](http://tc-pd-2.tc/)-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379 status:READY"]
[2024/02/28 19:08:57.432 +08:00] [ERROR] [pd_service_discovery.go:613] ["[pd] failed to update service mode"] [urls="[http://tc-pd-0.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-1.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379,http://tc-pd-2.tc-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379]"] [error="[PD:client:ErrClientGetClusterInfo]error:rpc error: code = Canceled desc = context canceled target:[tc-pd-2.tc](http://tc-pd-2.tc/)-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379 status:READY: error:rpc error: code = Canceled desc = context canceled target:[tc-pd-2.tc](http://tc-pd-2.tc/)-pd-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:2379 status:READY"]
[2024/02/28 19:08:57.432 +08:00] [INFO] [tso_client.go:134] ["closing tso client"]
[2024/02/28 19:08:57.432 +08:00] [INFO] [tso_client.go:139] ["close tso client"]
[2024/02/28 19:08:57.432 +08:00] [INFO] [tso_client.go:150] ["tso client is closed"]
[2024/02/28 19:08:57.432 +08:00] [INFO] [pd_service_discovery.go:664] ["[pd] close pd service discovery client"]
[2024/02/28 19:08:59.752 +08:00] [INFO] [helper.go:54] ["init log"] [file=/var/lib/ticdc/log/ticdc.log] [level=info]
[2024/02/28 19:08:59.752 +08:00] [INFO] [tz.go:34] ["Use the timezone of the TiCDC server machine"] [timezoneName=System] [timezone=Asia/Shanghai]
[2024/02/28 19:08:59.752 +08:00] [INFO] [version.go:47] ["Welcome to Change Data Capture (CDC)"] [release-version=v8.0.0-alpha] [git-hash=25ce29c2a1802bbb4cd26008f322728959a91f7a] [git-branch=heads/refs/tags/v8.0.0-alpha] [utc-build-time="2024-02-27 11:37:29"] [go-version="go version go1.21.6 linux/amd64"] [failpoint-build=false]
[2024/02/28 19:08:59.752 +08:00] [INFO] [server.go:125] ["CDC server created"] [pd="[http://tc-pd:2379/]"] [config="{\"addr\":\"[0.0.0.0:8301](http://0.0.0.0:8301/)\",\"advertise-addr\":\"[tc-ticdc-1.tc](http://tc-ticdc-1.tc/)-ticdc-peer.endless-ha-test-ticdc-tps-7080582-1-976.svc:8301\",\"log-file\":\"/var/lib/ticdc/log/ticdc.log\",\"log-level\":\"info\",\"log\":{\"file\":{\"max-size\":301,\"max-days\":0,\"max-backups\":0},\"error-output\":\"stderr\"},\"data-dir\":\"\",\"gc-ttl\":86400,\"tz\":\"System\",\"capture-session-ttl\":10,\"owner-flush-interval\":50000000,\"processor-flush-interval\":50000000,\"sorter\":{\"sort-dir\":\"/tmp/sorter\",\"cache-size-in-mb\":128},\"security\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"cert-allowed-cn\":null,\"mtls\":false,\"client-user-required\":false,\"client-allowed-user\":null},\"kv-client\":{\"enable-multiplexing\":true,\

Versions of the cluster

git hash:25ce29c2a1802bbb4cd26008f322728959a91f7a

Lily2025 avatar Oct 23 '23 02:10 Lily2025