
pd-client: share the connection when GetStoreMinResolvedTS from PD

Open AndreMouche opened this issue 2 years ago • 7 comments

Hi, currently we call GetStoreMinResolvedTS for every store from PD every 2 seconds:
https://github.com/tikv/client-go/blob/719e6456f7d5d59d45b70b77a36b450b6ce41c86/tikv/kv.go#L602-L607

These calls use plain HTTP requests and do not share connections:
https://github.com/tikv/client-go/blob/719e6456f7d5d59d45b70b77a36b450b6ce41c86/util/pd.go#L89-L94

This becomes a problem (too many connection handshakes on PD) when there are a large number of TiKV and TiDB instances. For example, with 100 TiKVs and 100 TiDBs, the requests to pd/api/v1/min-resolved-ts would open 100*100/2 = 5000 connections to PD per second, which puts great pressure on PD. In addition, per L92 of the code above, the first address may not be the PD leader, which also increases connections and CPU usage on the PD followers.

Suggestion: use gRPC (pd-client) instead of the HTTP client for GetStoreMinResolvedTS. It shares connections and checks the PD leader regularly.

AndreMouche, Aug 02 '23 22:08

@HuSharp Please take a look, thanks!

AndreMouche, Aug 02 '23 22:08

Related: https://github.com/pingcap/tidb/pull/43737

AndreMouche, Aug 02 '23 22:08

With TLS enabled, a lot of time is wasted on TLS handshakes. [image]

Connor1996, Aug 02 '23 22:08

@AndreMouche @Connor1996 This is the reason why we need the HTTP API: https://github.com/tikv/pd/issues/6386
I don't think removing it would be a good idea.

> For example when we have 100 tikvs and 100 tidbs, then the connections for pd/api/v1/min-resolved-ts to PD would be 100*100/2=5000 in 1 seconds.

To address this, I tried extending the API to return the min resolved TS of all stores in one request, which reduces the rate to 1 * 100 / 2 = 50 per second. Do you think this is a solution? PTAL, thx!
pd: https://github.com/tikv/pd/pull/6880
client-go: https://github.com/tikv/client-go/pull/921

HuSharp, Aug 03 '23 00:08

@HuSharp Why not provide a gRPC API?

Connor1996, Aug 03 '23 01:08

> @HuSharp Why not provide a gRPC API?

Oh, I misunderstood you. I'll provide a gRPC API that returns the TS of all stores, or add support for long-lived connections, later.

HuSharp, Aug 03 '23 01:08