Delay for per-shard reconnection

Open dkropachev opened this issue 8 months ago • 2 comments

When a node is restarted, drivers start opening connections to it. When there are lots of driver instances, this can significantly impact cluster performance. Example: https://github.com/scylladb/scylla-enterprise/issues/5409

We need to make drivers wait before they open shard connections.

dkropachev avatar Jun 03 '25 03:06 dkropachev

> We can do `select count(*) from system.clients` and calculate pause from that.

I am strongly against this option. It creates additional load on the cluster that we have no desire to add. There is no need to correlate anything with anyone: the driver itself, based on the number of shards and nodes it is already connected to, can calculate the desired delay (plus some random jitter). For example, add 10 ms for each already established connection, so if I'm already connected to 2 other nodes with 7 shards each, I'll get a 140 ms delay + jitter.
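A minimal sketch of the calculation described above, not an existing driver API. The function name `reconnect_delay_ms` and the `max_jitter_ms` parameter (with its 50 ms default) are assumptions made for illustration; only the 10 ms per connection figure comes from the comment.

```python
import random

# Figure taken from the comment above: 10 ms per already established connection.
PER_CONNECTION_DELAY_MS = 10


def reconnect_delay_ms(established_connections, max_jitter_ms=50.0, rng=random):
    """Return a reconnection delay in milliseconds.

    The base delay grows linearly with the number of connections the driver
    already holds; uniform random jitter is added so that many driver
    instances do not wake up at the same moment.
    `max_jitter_ms` is a hypothetical knob, not a value from the discussion.
    """
    if established_connections < 0:
        raise ValueError("established_connections must be non-negative")
    base = established_connections * PER_CONNECTION_DELAY_MS
    return base + rng.uniform(0.0, max_jitter_ms)


# Connected to 2 other nodes with 7 shards each: 14 connections, 140 ms base.
delay = reconnect_delay_ms(2 * 7)
```

The result would be passed to whatever sleep or timer mechanism schedules the next shard connection attempt.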

mykaul avatar Jun 03 '25 06:06 mykaul

> > We can do `select count(*) from system.clients` and calculate pause from that.
>
> I am strongly against this option. It creates additional load on the cluster that we have no desire to add. There is no need to correlate anything with anyone: the driver itself, based on the number of shards and nodes it is already connected to, can calculate the desired delay (plus some random jitter). For example, add 10 ms for each already established connection, so if I'm already connected to 2 other nodes with 7 shards each, I'll get a 140 ms delay + jitter.

I figured that it would be better to just control the concurrency of reconnections, instead of pulling cluster info and guessing the correct backoff from it. Details here
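As a rough illustration of what limiting reconnection concurrency could look like, here is a sketch that uses a semaphore to cap how many shard connections are opened at once. It is not the design from the linked details or any driver code; `ReconnectLimiter`, `reconnect_all`, and the `open_connection` callable are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ReconnectLimiter:
    """Caps the number of shard connections being opened at the same time."""

    def __init__(self, max_concurrent):
        if max_concurrent < 1:
            raise ValueError("max_concurrent must be at least 1")
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, open_connection, shard_id):
        # Blocks until a slot is free, so at most `max_concurrent`
        # connection attempts are in flight at any moment.
        with self._slots:
            return open_connection(shard_id)


def reconnect_all(limiter, shard_ids, open_connection, workers=8):
    """Reconnect to every shard, letting the limiter throttle the attempts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: limiter.run(open_connection, s), shard_ids))
```

With this shape no cluster state has to be queried: the cap is a local setting, and each driver instance opens at most that many connections in parallel regardless of how many shards the restarted node has.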

dkropachev avatar Jun 04 '25 17:06 dkropachev

We are not planning to implement this feature; the problem has been addressed on the server side.

dkropachev avatar Jan 19 '26 14:01 dkropachev