clickhouse-go

Better load balancing for server connections

Open yusufozturk opened this issue 1 year ago • 3 comments

As far as I can see in the current V2 code, the library runs "ch.acquire" before every operation to get an active clickhouse-server. I believe we can reduce latency if we move this work out of the query path.

Basically, a dedicated goroutine would first add all servers to a pool and then monitor every endpoint. If there is an issue with a server, the goroutine removes it from the pool.

In that case, ch.acquire only picks one of the active servers from the pool and executes the operation against that machine. We do something similar in our load-balanced API gateway, and it handles thousands of requests per second.

yusufozturk avatar Sep 30 '22 13:09 yusufozturk
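
For illustration, the proposal above could be sketched roughly as follows. This is not clickhouse-go code: `HealthyPool`, `Refresh`, `Monitor`, and the `check` probe are hypothetical names, and the server addresses are placeholders. The point is only that `Acquire` reads a precomputed list and does no network I/O, while a background goroutine keeps that list current.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// ErrNoHealthyServer is returned when every server failed its last check.
var ErrNoHealthyServer = errors.New("no healthy server available")

// HealthyPool is a hypothetical type, not part of clickhouse-go.
type HealthyPool struct {
	next    uint64 // round-robin cursor, updated atomically
	mu      sync.RWMutex
	healthy []string // addresses that passed the most recent check
}

// Acquire returns the next healthy address without any network I/O.
func (p *HealthyPool) Acquire() (string, error) {
	p.mu.RLock()
	defer p.mu.RUnlock()
	if len(p.healthy) == 0 {
		return "", ErrNoHealthyServer
	}
	i := atomic.AddUint64(&p.next, 1) - 1
	return p.healthy[i%uint64(len(p.healthy))], nil
}

// Refresh probes every address once and replaces the healthy set.
func (p *HealthyPool) Refresh(addrs []string, check func(addr string) error) {
	alive := make([]string, 0, len(addrs))
	for _, a := range addrs {
		if check(a) == nil {
			alive = append(alive, a)
		}
	}
	p.mu.Lock()
	p.healthy = alive
	p.mu.Unlock()
}

// Monitor re-runs Refresh on every tick until stop is closed.
func (p *HealthyPool) Monitor(addrs []string, check func(string) error, every time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(every)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			p.Refresh(addrs, check)
		case <-stop:
			return
		}
	}
}

func main() {
	addrs := []string{"ch1:9000", "ch2:9000", "ch3:9000"} // placeholder addresses
	// Stand-in probe: pretend ch2 is down. A real probe might ping the server.
	check := func(addr string) error {
		if addr == "ch2:9000" {
			return errors.New("unreachable")
		}
		return nil
	}

	p := &HealthyPool{}
	p.Refresh(addrs, check) // populate once before serving requests
	stop := make(chan struct{})
	defer close(stop)
	go p.Monitor(addrs, check, time.Second, stop)

	for i := 0; i < 4; i++ {
		addr, err := p.Acquire()
		fmt.Println(addr, err)
	}
}
```

The trade-off in this sketch is staleness: a server that dies between two ticks stays in the healthy list until the next `Refresh`, so callers still need to handle connection errors.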

Are you proposing changing how the pool works or allowing a specific connection to be used? Selecting a connection from the idle pool should be cheap - you can always increase the size of this in heavy workloads. Connections are only created if the idle pool is empty.

What would you prefer? A fixed pool size?

gingerwizard avatar Sep 30 '22 15:09 gingerwizard

I'm actually proposing changing how the pool works. But it might also be a good idea to let people use a specific group of ClickHouse machines, perhaps through some kind of workgroup or priority-based model.

Since ClickHouse Cloud is a serverless solution at the moment, maybe this is not a big deal for CH customers. But on-prem users might want to dedicate some of their ClickHouse servers to reads only. Another scenario of ours would be having groups of ClickHouse servers in 2 different regions: try the servers in the first region, and if no server is available there, redirect the request to the other region.

I'm just brainstorming. Maybe this is too much work for this driver. But load balancing might work as an external plugin (like ch-go), and this plugin would only supply a server for each request. Users could implement their own load balancing or contribute to an existing one directly. Still just brainstorming. :)

Thanks.

yusufozturk avatar Sep 30 '22 18:09 yusufozturk
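
As a rough illustration of the region scenario and the plugin idea above, here is a minimal sketch. The `Selector` interface, `Group`, `PrioritySelector`, and the `isUp` callback are all hypothetical and do not exist in clickhouse-go; the host names are placeholders. It assumes the driver would ask a user-supplied selector for a server and that groups are listed in order of preference.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAllGroupsDown is returned when no group has an available server.
var ErrAllGroupsDown = errors.New("no available server in any group")

// Selector is a hypothetical plug-in point: it only supplies a server
// address for a request and leaves connection handling to the driver.
type Selector interface {
	Pick(isUp func(addr string) bool) (string, error)
}

// Group is a named set of servers, for example one region.
type Group struct {
	Name  string
	Addrs []string
}

// PrioritySelector tries groups in order and falls through to the next
// group only when every server in the current one is unavailable.
type PrioritySelector struct {
	Groups []Group // ordered from most to least preferred
}

// Compile-time check that PrioritySelector satisfies Selector.
var _ Selector = PrioritySelector{}

// Pick returns the first available address in the first usable group.
func (s PrioritySelector) Pick(isUp func(addr string) bool) (string, error) {
	for _, g := range s.Groups {
		for _, a := range g.Addrs {
			if isUp(a) {
				return a, nil
			}
		}
	}
	return "", ErrAllGroupsDown
}

func main() {
	sel := PrioritySelector{Groups: []Group{
		{Name: "region-1", Addrs: []string{"r1-a:9000", "r1-b:9000"}},
		{Name: "region-2", Addrs: []string{"r2-a:9000"}},
	}}
	// Stand-in availability check: pretend all of region-1 is down.
	isUp := func(addr string) bool { return addr == "r2-a:9000" }

	addr, err := sel.Pick(isUp)
	fmt.Println(addr, err)
}
```

A read-only workgroup could be modelled the same way by giving read queries a selector whose groups contain only the servers dedicated to reads.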

There was a request to expose how connections are selected from the pool. Maybe this would solve your requirement? i.e. allow the acquire function to be custom?

gingerwizard avatar Oct 03 '22 09:10 gingerwizard

@gingerwizard I would like the acquire function to be customizable. My project clickhouse_sinker maintains a shard-aware connection pool. If a write fails with a retryable error (such as an expired ZooKeeper session), the caller retries on another replica inside the same shard. There is also a per-replica connection limit. Retrying inside the same shard helps to leverage the ReplicatedMergeTree deduplication feature.

yuzhichang avatar Nov 01 '22 11:11 yuzhichang
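
The retry behaviour described above might look roughly like this. It is a sketch under assumptions, not clickhouse_sinker's or clickhouse-go's actual code: `Shard`, `WriteToShard`, `ErrRetryable`, and the `write` callback are hypothetical names, and the per-replica connection limit is left out to keep the example short. The assumption is that the same batch is sent on every attempt, so a replica in the same shard can recognise a block that was already inserted.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrRetryable marks failures that are worth retrying on another replica.
var ErrRetryable = errors.New("retryable error")

// ErrNoReplicas is returned when a shard has no replicas configured.
var ErrNoReplicas = errors.New("shard has no replicas")

// Shard lists the replicas that hold the same data.
type Shard struct {
	Replicas []string
}

// WriteToShard tries the replicas of one shard in order. It stops at the
// first success or the first non-retryable error and never leaves the shard.
// It returns the replica used by the final attempt, or "" if none succeeded
// after exhausting all replicas.
func WriteToShard(s Shard, write func(replica string) error) (string, error) {
	lastErr := ErrNoReplicas
	for _, r := range s.Replicas {
		err := write(r)
		if err == nil {
			return r, nil
		}
		if !errors.Is(err, ErrRetryable) {
			return r, err // retrying elsewhere would not help
		}
		lastErr = err
	}
	return "", lastErr
}

func main() {
	shard := Shard{Replicas: []string{"s1-r1:9000", "s1-r2:9000"}} // placeholders
	// Stand-in write: the first replica fails with a retryable error.
	write := func(replica string) error {
		if replica == "s1-r1:9000" {
			return fmt.Errorf("zookeeper session expired: %w", ErrRetryable)
		}
		return nil
	}

	replica, err := WriteToShard(shard, write)
	fmt.Println(replica, err)
}
```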

Agree we need to do this @yuzhichang

gingerwizard avatar Nov 01 '22 18:11 gingerwizard