postgres-operator
Support preferred pod antiaffinity
My organization is trying to harden our database against datacenter-scale outages, and would like to add zone-based anti-affinity to our database pods. However, it doesn't seem like there is any way to use a preferredDuringSchedulingIgnoredDuringExecution anti-affinity rule rather than a required one. This means that we would be unable to scale our number of pods past the number of availability zones in a region.
Would it be possible to add configuration options to the operator which would allow for preferred rather than required anti-affinity?
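To make the request concrete, here is a minimal sketch of how the two rule types differ in a pod spec, built as plain Python dicts and printed as JSON. The helper name `anti_affinity`, the label `application: spilo`, and the weight of 100 are assumptions for illustration only; this is not how the operator generates its pod templates.

```python
import json

# Well-known Kubernetes node label for the availability zone.
ZONE_KEY = "topology.kubernetes.io/zone"


def anti_affinity(preferred: bool, labels: dict, topology_key: str = ZONE_KEY) -> dict:
    """Build an `affinity` fragment with either a soft or a hard anti-affinity rule.

    `labels` is a hypothetical label set that selects the database pods.
    """
    term = {
        "labelSelector": {"matchLabels": labels},
        "topologyKey": topology_key,
    }
    if preferred:
        # Soft rule: the scheduler tries to spread pods across zones,
        # but still places a pod if every zone is already occupied.
        return {
            "podAntiAffinity": {
                "preferredDuringSchedulingIgnoredDuringExecution": [
                    {"weight": 100, "podAffinityTerm": term}
                ]
            }
        }
    # Hard rule: a pod stays Pending if every zone already hosts a matching pod.
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [term]
        }
    }


if __name__ == "__main__":
    print(json.dumps(anti_affinity(True, {"application": "spilo"}), indent=2))
```

With the required variant, the number of schedulable pods is capped at the number of zones; with the preferred variant, extra pods are still scheduled and simply share a zone.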
Hm, not super sure what the right strategy is here. In how many cases do you really need more instances than zones? If one zone goes down, you end up with two instances in the same zone, and afaik the operator does not rebalance instances once all zones are back. But anyway, I could think of another configuration option to set the anti-affinity to required (default) or preferred.
Yeah it's mostly a request that's looking to the future - we don't currently need more instances than zones, but we might in the future, and the extra level of configurability would be useful in that case.
I agree that any option added should definitely default to required, as this use case is quite niche.
Another option would be to provide the ability in the CR to define the full podAntiAffinity block. That way consumers could specify whichever anti-affinity they want. https://www.verygoodsecurity.com/blog/posts/kubernetes-multi-az-deployments-using-pod-anti-affinity
Kubernetes 1.18 enabled topologySpreadConstraints by default (beta in 1.18, stable in 1.19), which would also be good to surface as a configuration option. https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/
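For comparison, a zone-level spread constraint could look roughly like the sketch below, again built as a plain Python dict. The helper name `zone_spread_constraint` and the label `application: spilo` are hypothetical; the field names (`maxSkew`, `topologyKey`, `whenUnsatisfiable`, `labelSelector`) are those of the Kubernetes pod spec.

```python
import json


def zone_spread_constraint(labels: dict, max_skew: int = 1, soft: bool = True) -> dict:
    """Build one entry for a pod's `topologySpreadConstraints` list.

    `labels` is a hypothetical label set that selects the database pods.
    """
    return {
        # Maximum allowed difference in matching pod count between any two zones.
        "maxSkew": max_skew,
        "topologyKey": "topology.kubernetes.io/zone",
        # ScheduleAnyway behaves like a preference; DoNotSchedule like a requirement.
        "whenUnsatisfiable": "ScheduleAnyway" if soft else "DoNotSchedule",
        "labelSelector": {"matchLabels": labels},
    }


if __name__ == "__main__":
    spec_fragment = {
        "topologySpreadConstraints": [zone_spread_constraint({"application": "spilo"})]
    }
    print(json.dumps(spec_fragment, indent=2))
```

Unlike anti-affinity, a spread constraint with `maxSkew: 1` still allows more pods than zones even in its hard form, as long as the pods stay evenly distributed.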
I am very opinionated about not providing all these options in the CRD, as that strays away from having an easy-to-use Postgres CRD.
Many of these configurations are important, and I would see configuration and implementation coming via the operator config map/operator config CRD. Ideally, though, your Postgres users should not have to worry about this.
You also would not imho want to change all Postgres manifests (spread across Git repositories) just because you need to change e.g. your affinity rule.
Whether at administration/operator level or at user/cluster level, it would be great to have the possibility to change the podAntiAffinity "softness" and, even better, to support the Pod's topologySpreadConstraints.
Yeah, this would be a great feature to have. It would be great to be able to set a soft-max of one pod per host as described here: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#never-co-located-in-the-same-node
This way, if a host goes down, it won't take more than one pod down with it. Especially because of this (https://github.com/zalando/postgres-operator/pull/1484), it is possible for one host to take down both the master and the sync replica, which would be very bad luck but is possible.
Edit: https://github.com/zalando/postgres-operator/blob/v1.7.1/docs/administrator.md#enable-pod-anti-affinity Oops :sweat_smile:
Hi, any updates on this issue?
Chiming in to say I would like this solved as well; a simple toggle between Preferred and Required would be best. Currently, even something as basic as statefulset replacement is blocked if the number of nodes is equal to the number of postgres instances...
It is now included in the latest release.