
Choosing a number of workers in a distributed system scenario

Open andrii-korotkov-verkada opened this issue 1 year ago • 1 comments

Hey. I hope your day is going well. I've seen the recommendation to set the number of workers to 2 * cores + 1. But in a distributed system with Kubernetes deployments, where the number of requested cores is customizable, this becomes trickier to choose. The choice is between fewer, larger pods with more workers each and more, smaller pods with fewer workers each. Some example configurations:

  • 1 requested cpu core per container and 2 or 3 workers.
  • 1 requested cpu core per container and 1 worker.
  • 0.5 requested cpu cores per container with 1 worker.
  • X requested cpu cores per container with 2 * X + 1 workers.
  • X workers and tuned requested cpu based on the actual load.
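
To make the options above easier to compare, here is a minimal sketch of the 2 * cores + 1 rule applied to a per-container CPU request. The helper name `workers_for_cpu` is hypothetical, and rounding down with a floor of one worker is an assumption for handling fractional requests, not something the rule itself specifies.

```python
import math


def workers_for_cpu(requested_cpu: float) -> int:
    """Apply the 2 * cores + 1 rule of thumb to a CPU request.

    Assumption: fractional results are rounded down, and at least
    one worker is always returned.
    """
    return max(1, math.floor(2 * requested_cpu + 1))


# Compare a few per-container CPU requests.
for cpu in (0.5, 1, 2, 4):
    print(f"{cpu} requested cores: {workers_for_cpu(cpu)} workers")
```

Under this rounding assumption, the rule gives 3 workers for 1 requested core, which matches the upper end of the first option in the list.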

A single worker per pod offers the most flexibility in rightsizing the number of pods, but may carry a bit more overhead, since every pod also runs a master process. Also, because the worker is re-created after max requests, there can be some downtime. Many workers per pod avoids some of these problems, but only allows scaling in bigger units and can lead to overprovisioning in regions with little traffic (for example, cpu utilization can be low even with min replicas set to 3 for availability reasons).
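
The overprovisioning concern can be illustrated with simple arithmetic: the CPU reserved at minimum scale is the minimum replica count times the per-pod request. The function name and the 4-core and 0.5-core pod sizes below are hypothetical, chosen only for illustration.

```python
def min_reserved_cpu(min_replicas: int, cpu_request_per_pod: float) -> float:
    """CPU cores reserved even when traffic is near zero."""
    return min_replicas * cpu_request_per_pod


# Hypothetical pod sizes, both with min replicas set to 3.
large_pods = min_reserved_cpu(3, 4.0)  # larger pods, many workers each
small_pods = min_reserved_cpu(3, 0.5)  # smaller pods, one worker each

print(f"large pods reserve {large_pods} cores at minimum scale")
print(f"small pods reserve {small_pods} cores at minimum scale")
```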

What's the best choice here? Thank you.

andrii-korotkov-verkada avatar Feb 21 '24 23:02 andrii-korotkov-verkada

I've ended up using 2 workers and tuning the cpu requests as appropriate.
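
As a rough sketch only, a `gunicorn.conf.py` for this kind of setup might look like the following. The `workers`, `max_requests`, and `max_requests_jitter` settings are real gunicorn options, but the values 1000 and 100 are assumptions for illustration and were not given in this thread.

```python
# gunicorn.conf.py
# Sketch of a two-worker setup; the numeric values are illustrative assumptions.

# Two workers, so one can keep serving while the other is being re-created.
workers = 2

# Restart a worker after it has handled this many requests.
max_requests = 1000

# Add a random offset per worker so both workers are unlikely
# to hit the restart limit at the same moment.
max_requests_jitter = 100
```

The CPU request itself lives in the Kubernetes deployment spec rather than in this file, so it can be tuned against observed load without changing the worker count.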

andrii-korotkov-verkada avatar Mar 12 '24 22:03 andrii-korotkov-verkada

I don't really see the point there. Consider your container or pod as a single web server instance. Then what matters is rather the location of this container, to ensure you are resilient across your system. One instance per web app. That's the easiest scheme.

benoitc avatar May 21 '24 20:05 benoitc