tiflow

CDC lag rises to 7min when injecting ha-pdleader-io-delay-1s-last-for-5m, even though the PD leader transferred quickly

Open fubinzh opened this issue 5 months ago • 0 comments

What did you do?

  1. TiDB cluster with CDC changefeed running normally
  2. Inject ha-pdleader-io-delay-1s-last-for-5m (from 2024-09-04 12:37:58 to 2024-09-04 12:42:58)
  3. Check cluster status and CDC lag

What did you expect to see?

CDC lag should be <2min

What did you see instead?

The PD leader transferred shortly after the chaos injection, but CDC didn't have a leader for ~5min, and CDC lag rose to ~7min.

2024-09-04 12:38:01	
{"container":"pd","log":"[raft.go:771] [\"646d794e12a46726 became leader at term 4\"]","namespace":"uds-cdc-br-scenario-tps-7624385-1-510","level":"INFO","pod":"upstream-pd-0"}

2024-09-04 12:38:26	
{"container":"pd","log":"[server.go:1804] [\"PD leader is ready to serve\"] [leader-name=upstream-pd-0]","namespace":"uds-cdc-br-scenario-tps-7624385-1-510","level":"INFO","pod":"upstream-pd-0"}


2024-09-04 12:38:26	
{"container":"pd","log":"[server.go:1730] [\"campaign PD leader ok\"] [campaign-leader-name=upstream-pd-0]","namespace":"uds-cdc-br-scenario-tps-7624385-1-510","level":"INFO","pod":"upstream-pd-0"}


2024-09-04 12:38:26	
{"container":"pd","log":"[server.go:1704] [\"start to campaign PD leader\"] [campaign-leader-name=upstream-pd-0]","namespace":"uds-cdc-br-scenario-tps-7624385-1-510","level":"INFO","pod":"upstream-pd-0"}
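
To put a number on "the PD leader transferred quickly", the timestamps from this report can be compared directly. The following is only a small sketch: the constant names and the helper `seconds_between` are made up for illustration, and the values are copied by hand from the injection window and the PD log lines above.

```python
from datetime import datetime

# Timestamp layout used in the report and the log excerpts.
FMT = "%Y-%m-%d %H:%M:%S"


def seconds_between(start: str, end: str) -> float:
    """Return the number of seconds from start to end."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Values copied by hand from this issue (names are illustrative only).
INJECTION_START = "2024-09-04 12:37:58"      # chaos injection begins
RAFT_LEADER_ELECTED = "2024-09-04 12:38:01"  # "became leader at term 4"
PD_LEADER_READY = "2024-09-04 12:38:26"      # "PD leader is ready to serve"

election_delay = seconds_between(INJECTION_START, RAFT_LEADER_ELECTED)
ready_delay = seconds_between(INJECTION_START, PD_LEADER_READY)

print(f"raft leader elected after {election_delay:.0f}s")
print(f"PD leader ready to serve after {ready_delay:.0f}s")
```

By these timestamps the new PD leader was serving well under a minute after the injection started, while CDC was reported without a leader for ~5min.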


Versions of the cluster

/cdc version
Release Version: v8.2.0
Git Commit Hash: 498e3d3fd1cda4817e70ea50d27dcb157956349d
Git Branch: HEAD
UTC Build Time: 2024-07-03 02:52:36
Go Version: go version go1.21.10 linux/amd64
Failpoint Build: false

fubinzh, Sep 05 '24 09:09