servicecomb-service-center

Can service-center support highly concurrent service-registry?

Open surilli opened this issue 5 years ago • 5 comments

Firstly, service-center uses the Txn interface of etcd clientv3. etcd limits the maximum gap between the applied index and the committed index to 5000, and a large number of mutexes may block highly concurrent service-registry requests. With one stress-test client continuously sending service-registry HTTP requests to a cluster of three service-center nodes, throughput peaks at nearly 17000 TPS but then declines rapidly, failing to meet our expectation. Secondly, the maximum number of services is limited to 50000 and the maximum number of instances is set to 150000. This may limit the capacity for service and instance registry.

surilli avatar Jun 05 '19 10:06 surilli

The server has rate limiting. If you send a large number of registry requests, some of them will be rejected, so you should have a retry algorithm on the client side to register again.

tianxiaoliang avatar Jun 06 '19 01:06 tianxiaoliang

Can you describe your scenario in detail? It is easier to communicate based on the scenario. Why 50000 services?

wangqj avatar Jun 08 '19 10:06 wangqj

defaultServiceLimit = 50000 and defaultInstanceLimit = 150000 are set in servicecomb-service-center\server\plugin\pkg\quota\quota.go. We want to test the best performance of service-center, especially for service and instance registry, to prepare for abnormal situations such as almost all the instances restarting at the same time.

surilli avatar Jun 11 '19 08:06 surilli

But I think this is just for testing. Actually, it can't reach this limit in practice. 150,000 instances is a very large-scale application scenario.

You can modify the source code, just for testing.

wangqj avatar Jun 12 '19 03:06 wangqj

The reason we raise this question is that the TPS of service or instance registry only reaches approximately 3000 in steady state, although it may hit a high point at the start, and this may cause timeout problems in some transaction systems. Meanwhile, we benchmarked etcd and found that the performance of txn may be closely related to the number of etcd clients, regardless of the number of gRPC connections. So we looked into the source code of service-center and found only one etcd client and one gRPC connection. I want to figure out whether it is possible to optimize the performance of service and instance registry by raising the number of etcd clients. Could you help me with this question? We also used the pprof tool to analyze the critical path of service-center and found that the call chain of registry is quite long, which may leave some room for improvement. By the way, we have modified the limits to complete our stress test. Thanks for your help!

surilli avatar Jun 14 '19 03:06 surilli