
Load balance based on running requests

klauspost opened this issue 4 years ago • 8 comments

Current load balancing is purely round-robin.

However, different requests create different loads, which means that servers processing complex requests may be slower while other servers sit mostly idle.

As an alternative, simply choose the server with the fewest running requests.

warp uses an alternative host selection scheme:

  • Select the host with the fewest running requests.
  • If tied, select the host that has the longest time since last request finished.

This both gives a good distribution and takes individual server load into consideration.
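For illustration, a minimal Go sketch of that selection rule; the Backend type, its fields, and the start/finish helpers are assumptions for this sketch, not warp's or sidekick's actual code:

```go
// Hypothetical sketch of a least-requests picker with an idle-time tie-break.
// Backend and its fields are illustrative, not warp's or sidekick's real types.
package balance

import (
	"sync/atomic"
	"time"
)

type Backend struct {
	Host         string
	inflight     int64 // currently running requests
	lastFinished int64 // unix nanos of the most recently completed request
}

// pick returns the backend with the fewest in-flight requests, breaking ties
// in favour of the one that has been idle the longest (oldest lastFinished).
func pick(backends []*Backend) *Backend {
	var best *Backend
	for _, b := range backends {
		if best == nil {
			best = b
			continue
		}
		bi := atomic.LoadInt64(&b.inflight)
		besti := atomic.LoadInt64(&best.inflight)
		if bi < besti ||
			(bi == besti && atomic.LoadInt64(&b.lastFinished) < atomic.LoadInt64(&best.lastFinished)) {
			best = b
		}
	}
	return best
}

// start and finish bracket a request sent to backend b.
func start(b *Backend) { atomic.AddInt64(&b.inflight, 1) }

func finish(b *Backend) {
	atomic.AddInt64(&b.inflight, -1)
	atomic.StoreInt64(&b.lastFinished, time.Now().UnixNano())
}
```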

klauspost avatar Apr 15 '20 10:04 klauspost

Current load balancing is purely round-robin.

The load balancing requirement for sidekick was always to be purely random; that was the original intention, with no heuristics-based requirement.

harshavardhana avatar May 06 '20 19:05 harshavardhana

It should be up to the target's health path to decide whether more load can be put on the target or whether it is too busy. Isn't this the reason why the minio/health/ready endpoint is a good choice for Sidekick?

aweisser avatar Feb 10 '21 08:02 aweisser

@harshavardhana That assumes that all requests are equal and that all servers behave the same. At least the first assumption is always false, and the second can be.

klauspost avatar Feb 10 '21 12:02 klauspost

@klauspost Your approach assumes that there's only a single Sidekick instance, which is also not always true. Look at the Splunk use case, for example. Imho only the Minio server can decide whether it can take on any more load, with respect to its internal heuristics.

aweisser avatar Feb 10 '21 14:02 aweisser

@aweisser I may be overlooking something, but how do multiple instances affect this?

If all sidekicks try to keep the number of running requests equal across all servers, that would be good load balancing in my book, and not just "load distribution".

klauspost avatar Feb 10 '21 15:02 klauspost

A sidekick instance can only work on heuristics that it can measure. Without getting heuristics from the S3 server, and without sharing heuristics with other Sidekick instances, a local Sidekick process can only count its own requests, without knowing what other requesting clients do.

As you said, not all requests are the same, and only the S3 server instance knows about the real load.

Imho it's all about the smartness of the health check. Maybe the minio/health/ready endpoint can be even smarter than counting its goroutines before responding with an HTTP 503 "I'm too busy, go away!". It may take the server's system load (RAM, CPU usage) or the saturation of its NICs into account.

This way a "naive" (let's better call it "simple and bulletproof") round robin over "ready" S3 targets should do the job quite well. Together with smart health checks it becomes qualitative load balancing and not just load distribution.
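For comparison, a minimal Go sketch of round-robin restricted to "ready" targets; the Target type and its healthy flag (assumed to be updated by a periodic /minio/health/ready probe) are illustrative assumptions, not sidekick's actual implementation:

```go
// Hypothetical sketch: plain round-robin over targets whose readiness probe
// (e.g. /minio/health/ready) last reported healthy. The Target type and the
// background loop that flips the flag are assumptions for illustration.
package balance

import "sync/atomic"

type Target struct {
	Host    string
	healthy atomic.Bool // set by a periodic readiness check
}

type RoundRobin struct {
	targets []*Target
	next    atomic.Uint64
}

// Pick returns the next ready target in round-robin order,
// or nil if no target is currently ready.
func (rr *RoundRobin) Pick() *Target {
	n := uint64(len(rr.targets))
	for i := uint64(0); i < n; i++ {
		t := rr.targets[rr.next.Add(1)%n]
		if t.healthy.Load() {
			return t
		}
	}
	return nil
}
```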

aweisser avatar Feb 10 '21 16:02 aweisser

@aweisser So you are saying that because it doesn't know anything more than up/down, we should stick to an algorithm that keeps piling requests onto an overloaded or subpar-performing server? That doesn't make sense to me.

The number of requests is a perfectly valid balancing function. Instead of relying on collecting metrics that may or may not indicate load (you mention some, but they are no real indication of load), keeping track of active requests is completely passive, doesn't have to rely on any metrics, and also takes sidekick->minio network issues into account.
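A minimal Go sketch of that passive counting around a proxied request; the backend type and the proxy callback are hypothetical stand-ins, not sidekick's actual API:

```go
// Hypothetical sketch: counting in-flight requests passively around a proxied
// call, so no server-side metrics are needed. backend and proxy are
// illustrative stand-ins, not sidekick's real code.
package balance

import (
	"net/http"
	"sync/atomic"
	"time"
)

type backend struct {
	host         string
	inflight     int64 // requests currently being proxied to this backend
	lastFinished int64 // unix nanos of the last completed request
}

func forward(w http.ResponseWriter, r *http.Request, b *backend,
	proxy func(http.ResponseWriter, *http.Request, string)) {
	atomic.AddInt64(&b.inflight, 1) // request starts: one more in flight
	defer func() {
		atomic.AddInt64(&b.inflight, -1) // request done: one fewer in flight
		atomic.StoreInt64(&b.lastFinished, time.Now().UnixNano())
	}()
	// A slow or overloaded backend keeps requests open longer, so its
	// in-flight count stays high and a least-requests picker naturally
	// sends it fewer new requests.
	proxy(w, r, b.host)
}
```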

klauspost avatar Feb 18 '21 17:02 klauspost

I'm sure you can find examples that speak for one or the other approach, because there is no single source of truth in a distributed system, and the number of requests is not the only metric that refers to "load".

I also just noticed that the /minio/health/ready probe currently does not count goroutines anyway.

I was confused by the following gist, https://gist.github.com/nitisht/0c11d8c670f565b58d930b526ba0f2ed, which states that the readiness probe returns HTTP 503 if more than 500 goroutines are open.

Maybe you already had reasons at Minio not to do it this way, or to change the readiness probe to be equivalent to the liveness probe ("always return HTTP 200 as long as the service is running").

My opinion is still that a server-side "readiness" check is relevant for qualitative load balancing (in contrast to dumb load distribution). Surely a smarter scheme than round-robin on the client side is also nice to have.

Imho the question should be: Is it worth it to break KISS?

aweisser avatar Feb 22 '21 12:02 aweisser

fixed in #98 and released.

harshavardhana avatar Dec 13 '23 00:12 harshavardhana