
Support preferred pod antiaffinity

Open QuantumBJump opened this issue 5 years ago • 8 comments

My organization is trying to harden our database against datacenter-scale outages, and would like to add zone-based anti-affinity to our database pods. However, it doesn't seem like there is any way to use a preferredDuringSchedulingIgnoredDuringExecution antiaffinity rule rather than a required one. This means that we would be unable to scale our number of pods past the number of availability zones in a region.

Would it be possible to add configuration options to the operator which would allow for preferred rather than required antiaffinity?
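For reference, a preferred (soft) zone anti-affinity rule in a plain Kubernetes pod spec could look like the sketch below. The label `app: my-postgres` and the weight are placeholders, not values the operator sets. `topology.kubernetes.io/zone` is the well-known zone label; older clusters may expose `failure-domain.beta.kubernetes.io/zone` instead.

```yaml
# Sketch of a pod spec fragment; label values are hypothetical.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100                  # 1-100; higher means a stronger preference
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: my-postgres       # placeholder label selecting the database pods
          topologyKey: topology.kubernetes.io/zone
```

With a rule like this, the scheduler spreads pods across zones where it can, but still places a pod when every zone already hosts one.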

QuantumBJump avatar Dec 13 '19 12:12 QuantumBJump

Hm, I'm not sure what the right strategy is here. In how many cases do you really need more instances than zones? If one zone goes down, you end up with two instances in the same zone, and AFAIK the operator does not rebalance instances once all zones are back. But anyway, I could imagine another configuration option to set the antiAffinity to required (default) or preferred.

FxKu avatar Dec 20 '19 14:12 FxKu

Yeah it's mostly a request that's looking to the future - we don't currently need more instances than zones, but we might in the future, and the extra level of configurability would be useful in that case.

I agree that any option added should definitely default to required, as this use case is quite niche.

QuantumBJump avatar Dec 20 '19 16:12 QuantumBJump

Another option would be to provide the ability in the CR to define the full podAntiAffinity block. That way consumers could specify whichever anti-affinity they want. https://www.verygoodsecurity.com/blog/posts/kubernetes-multi-az-deployments-using-pod-anti-affinity

Kubernetes 1.18 enabled topologySpreadConstraints by default (the feature reached beta in that release), which would also be good to surface as a configuration option. https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/
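As a rough illustration of the topologySpreadConstraints alternative, a soft zone spread in a pod spec might look like the following. The `app: my-postgres` label is a placeholder, and this is plain Kubernetes syntax, not something the operator is known to expose.

```yaml
# Sketch of a pod spec fragment; label values are hypothetical.
topologySpreadConstraints:
  - maxSkew: 1                             # allowed difference in pod count between zones
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway      # soft; DoNotSchedule makes it a hard constraint
    labelSelector:
      matchLabels:
        app: my-postgres                   # placeholder label selecting the database pods
```

Unlike anti-affinity, this keeps the pod count per zone balanced even when there are more pods than zones.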

jeffhubLR avatar Oct 01 '20 21:10 jeffhubLR

I am very opinionated about not providing all these options in the CRD, as that strays from having an easy-to-use Postgres CRD.

Many of these configurations are important, and I would see configuration and implementation coming via the operator ConfigMap or the operator configuration CRD. But ideally your Postgres user should not have to worry about this.

You also would not, IMHO, want to change all Postgres manifests (spread across Git repositories) just because you need to change e.g. your affinity rule.

Jan-M avatar Oct 02 '20 09:10 Jan-M

Whether at the administration/operator level or the user/cluster level, it would be great to be able to change the podAntiAffinity "softness" and, even better, to support the Pod's topologySpreadConstraints.

maxgio92 avatar Dec 21 '21 10:12 maxgio92

Yeah, this would be a great feature to have. It would be great to be able to set a soft-max of one pod per host as described here: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#never-co-located-in-the-same-node

This way, if a host goes down, it won't take more than one pod down with it. Especially because of this (https://github.com/zalando/postgres-operator/pull/1484), it is possible for a single host failure to take down both master and sync replica, which would be very bad luck but is possible.

Edit: https://github.com/zalando/postgres-operator/blob/v1.7.1/docs/administrator.md#enable-pod-anti-affinity Oops :sweat_smile:

jonathon2nd avatar Dec 29 '21 17:12 jonathon2nd

Hi, any updates on this issue?

apena-pmy avatar Mar 07 '22 14:03 apena-pmy

Chiming in to say I would like this solved as well; a simple toggle between Preferred and Required would be best. Currently, even something as basic as statefulset replacement is blocked if the number of nodes is equal to the number of postgres instances...

lc-guy avatar Sep 01 '22 13:09 lc-guy

It is now included in the latest release
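A sketch of what enabling this in the operator configuration CRD might look like is below. The option name `pod_antiaffinity_preferred_during_scheduling` and the resource name are given from memory and should be treated as assumptions; check them against the administrator docs of the release you run.

```yaml
# Sketch only; verify option names against the operator docs for your release.
apiVersion: "acid.zalan.do/v1"
kind: OperatorConfiguration
metadata:
  name: postgresql-operator-configuration   # hypothetical resource name
configuration:
  kubernetes:
    enable_pod_antiaffinity: true
    # Assumed option: switches the generated rule from required to preferred.
    pod_antiaffinity_preferred_during_scheduling: true
    pod_antiaffinity_topology_key: "topology.kubernetes.io/zone"
```

Setting this at the operator level keeps individual Postgres manifests unchanged, in line with the earlier comment about not pushing affinity choices into each cluster CR.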

FxKu avatar Jan 30 '23 15:01 FxKu