Goroutine blocked when getting TSO
Bug Report
What did you do?
1. Enable tidb_enable_tso_follower_proxy.
2. Enable enable-forwarding:
"tidb": "enable-forwarding = true", "tikv": "[pd]\nenable-forwarding = true"
3. Run TPC-C.
4. Run some HA cases.
What did you expect to see?
The workload runs normally.
What did you see instead?
After running some HA cases, queries are blocked:
# 0x2b47e9c github.com/tikv/pd/client/clients/tso.(*Request).waitCtx+0x1bc /root/go/pkg/mod/github.com/tikv/pd/[email protected]/clients/tso/request.go:82
# 0x2b47cb9 github.com/tikv/pd/client/clients/tso.(*Request).Wait+0x19 /root/go/pkg/mod/github.com/tikv/pd/[email protected]/clients/tso/request.go:73
# 0x2b7494e github.com/tikv/client-go/v2/util.interceptedTsFuture.Wait+0x6e /root/go/pkg/mod/github.com/tikv/client-go/[email protected]/util/pd_interceptor.go:76
# 0x2ce2f61 github.com/tikv/client-go/v2/oracle/oracles.(*tsFuture).Wait+0x41 /root/go/pkg/mod/github.com/tikv/client-go/[email protected]/oracle/oracles/pd.go:238
# 0x5b39819 github.com/pingcap/tidb/pkg/session.(*txnFuture).wait+0xd9 /workspace/source/tidb/pkg/session/txn.go:684
# 0x5b36a09 github.com/pingcap/tidb/pkg/session.(*LazyTxn).changePendingToValid+0xe9 /workspace/source/tidb/pkg/session/txn.go:293
# 0x5b38ecf github.com/pingcap/tidb/pkg/session.(*LazyTxn).Wait+0x16f /workspace/source/tidb/pkg/session/txn.go:609
# 0x5ae86a6 github.com/pingcap/tidb/pkg/sessiontxn/isolation.(*baseTxnContextProvider).ActivateTxn+0xc6 /workspace/source/tidb/pkg/sessiontxn/isolation/base.go:299
# 0x5ae7d26 github.com/pingcap/tidb/pkg/sessiontxn/isolation.(*baseTxnContextProvider).OnInitialize+0x566 /workspace/source/tidb/pkg/sessiontxn/isolation/base.go:146
# 0x5b3aabb github.com/pingcap/tidb/pkg/session.(*txnManager).EnterNewTxn+0x5b /workspace/source/tidb/pkg/session/txnmanager.go:161
# 0x5a024e5 github.com/pingcap/tidb/pkg/executor.(*SimpleExec).executeBegin+0x1e5 /workspace/source/tidb/pkg/executor/simple.go:646
# 0x59fcf64 github.com/pingcap/tidb/pkg/executor.(*SimpleExec).Next+0x524 /workspace/source/tidb/pkg/executor/simple.go:161
# 0x4f075be github.com/pingcap/tidb/pkg/executor/internal/exec.Next+0x29e /workspace/source/tidb/pkg/executor/internal/exec/executor.go:460
# 0x5873ced github.com/pingcap/tidb/pkg/executor.(*ExecStmt).next+0x6d /workspace/source/tidb/pkg/executor/adapter.go:1269
# 0x58719d4 github.com/pingcap/tidb/pkg/executor.(*ExecStmt).handleNoDelayExecutor+0x3b4 /workspace/source/tidb/pkg/executor/adapter.go:1018
# 0x5870378 github.com/pingcap/tidb/pkg/executor.(*ExecStmt).handleNoDelay+0x238 /workspace/source/tidb/pkg/executor/adapter.go:851
# 0x586e477 github.com/pingcap/tidb/pkg/executor.(*ExecStmt).Exec+0xed7 /workspace/source/tidb/pkg/executor/adapter.go:614
....
What version of PD are you using (pd-server -V)?
./pd-server -V
Release Version: v9.0.0-beta.1
Edition: Community
Git Commit Hash: 110f73c7c28722c88539b6f7fc29248b3adf3010
Git Branch: HEAD
UTC Build Time: 2025-03-17 10:29:29
2025-03-20T09:06:17.779+0800 INFO k8s/client.go:135 it should be noted that a long-running command will not be interrupted even the use case has ended. For more information, please refer to https://github.com/pingcap/test-infra/discussions/129
./tidb-server -V
Release Version: v9.0.0-beta.1
Edition: Community
Git Commit Hash: dd701afad7b2781ea92265f4d5d68c3eb28bcfdb
Git Branch: HEAD
UTC Build Time: 2025-03-19 15:19:44
GoVersion: go1.23.7
Race Enabled: false
Check Table Before Drop: false
Store: unistore
2025-03-20T09:06:19.747+0800
/severity major /assign rleungx
/remove-severity major /severity critical
The trigger conditions:
- The cluster has enable-follower-tso-proxy enabled.
- The PD server count is more than one.
- The PD leader has changed.
The root cause: The connection context has been cancelled, but the stream context is different from the connection context, so the stream will not cancel the pending request.
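The claimed failure mode can be illustrated with a minimal sketch. It is not the PD client code: `waitPending` and `simulate` are hypothetical names, and the 50ms probe timeout exists only so the demo terminates instead of hanging. It assumes, as the root-cause statement does, that the stream context is created independently of the connection context.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitPending mimics a pending request: it returns when a result arrives or
// when the context it is bound to is cancelled. The probe timeout is only
// here so the demo can report "still blocked" instead of hanging forever.
func waitPending(ctx context.Context, done <-chan struct{}, probe time.Duration) string {
	select {
	case <-done:
		return "done"
	case <-ctx.Done():
		return "cancelled"
	case <-time.After(probe):
		return "still blocked"
	}
}

// simulate cancels the connection context and reports what a waiter bound to
// an unrelated stream context observes.
func simulate() (connErr error, waiter string) {
	connCtx, connCancel := context.WithCancel(context.Background())
	// Assumption for illustration: the stream context is NOT derived from connCtx.
	streamCtx, streamCancel := context.WithCancel(context.Background())
	defer streamCancel()

	connCancel() // e.g. the connection is dropped after a PD leader change

	// No result is ever delivered on this channel, like a request that never gets a reply.
	return connCtx.Err(), waitPending(streamCtx, make(chan struct{}), 50*time.Millisecond)
}

func main() {
	connErr, waiter := simulate()
	fmt.Println("connection ctx:", connErr)
	fmt.Println("pending request:", waiter)
}
```

In this sketch the connection context reports `context canceled` while the waiter is still blocked, which matches the shape of the goroutine stuck in `waitCtx` in the stack trace.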
@bufferflies In Go, if you cancel a parent context (the connection ctx in this case), all of its child contexts are cancelled automatically (the stream context in this case: cctx, cancel := context.WithCancel(ctx)).
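The propagation rule described in this comment can be checked with the standard library alone. The sketch below uses hypothetical helper names (`parentCancelReachesChild`, `childCancelReachesParent`) and only assumes that the stream context really is derived from the connection context via `context.WithCancel(ctx)`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// parentCancelReachesChild reports whether cancelling the parent context
// also cancels a child created with context.WithCancel(parent).
func parentCancelReachesChild() bool {
	ctx, cancelParent := context.WithCancel(context.Background())
	cctx, cancel := context.WithCancel(ctx)
	defer cancel()

	cancelParent()
	return errors.Is(cctx.Err(), context.Canceled)
}

// childCancelReachesParent reports whether cancelling the child context
// affects the parent. Cancellation only flows from parent to child.
func childCancelReachesParent() bool {
	ctx, cancelParent := context.WithCancel(context.Background())
	defer cancelParent()
	_, cancel := context.WithCancel(ctx)

	cancel()
	return ctx.Err() != nil
}

func main() {
	fmt.Println("parent cancel reaches child:", parentCancelReachesChild())
	fmt.Println("child cancel reaches parent:", childCancelReachesParent())
}
```

So if the stream context is a child of the connection context, a pending request that selects on the stream context's `Done()` channel would be released when the connection context is cancelled. Whether the blocked waiter in the trace actually selects on that context is something the sketch cannot confirm.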