vcluster
pdb syncing is broken
What happened?
Short description: Some operators, such as the Percona one, use the maxUnavailable field of the PDB. Unfortunately, the PDB controller in the undercluster can't currently work with that field because it doesn't have enough information to evaluate it.
More details: We get a SyncFailed with error message “services does not implement the scale subresource” from the pdb controller in the undercluster.
Basically, pods are getting synced down to the undercluster with a service as the owner. If the pdb uses maxUnavailable with any value, or minAvailable with a percentage set, the pdb controller will look up the owner of the pods and figure out how many there should be so it can calculate an absolute maxUnavailable. It can't do this because that info isn't synced down from the overcluster.
Some of that logic is described here: https://kubernetes.io/docs/tasks/run-application/configure-pdb/#arbitrary-controllers-and-selectors
What did you expect to happen?
PDBs work correctly.
I think, since only the overcluster knows the real number of requested replicas, vcluster will need to convert PDBs that use maxUnavailable or a percentage-based minAvailable to an absolute minAvailable when syncing to the undercluster.
The undercluster should then have enough knowledge on how to properly protect the pods from evictions.
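The proposed conversion can be sketched as follows. This is only an illustration of the arithmetic, not vcluster's implementation: the function names `scaled_value` and `absolute_min_available` are hypothetical, and the sketch assumes the replica count is read from the owning workload in the overcluster. Percentages are rounded up, which matches the rounding the Kubernetes PDB documentation describes.

```python
from typing import Optional, Union

# A PDB value is either an absolute integer or a percentage string such as "50%".
IntOrPercent = Union[int, str]


def scaled_value(value: IntOrPercent, total: int) -> int:
    """Resolve an int-or-percent value against `total`, rounding percentages up."""
    if isinstance(value, int):
        return value
    if value.endswith("%"):
        percent = int(value[:-1])
        # Integer ceiling division avoids floating-point rounding surprises.
        return (percent * total + 99) // 100
    raise ValueError(f"not an integer or percentage: {value!r}")


def absolute_min_available(
    replicas: int,
    *,
    min_available: Optional[IntOrPercent] = None,
    max_unavailable: Optional[IntOrPercent] = None,
) -> int:
    """Hypothetical helper: turn either PDB field into an absolute minAvailable.

    `replicas` is assumed to be the desired replica count known to the
    overcluster (for example, a Deployment's spec.replicas).
    """
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if max_unavailable is not None:
        # minAvailable = replicas - maxUnavailable, never below zero.
        return max(replicas - scaled_value(max_unavailable, replicas), 0)
    return scaled_value(min_available, replicas)
```

For example, with 3 replicas and maxUnavailable set to 1, the synced PDB would carry an absolute minAvailable of 2. One trade-off of this approach is that the converted value would have to be recomputed whenever the replica count changes in the overcluster.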
How can we reproduce it (as minimally and precisely as possible)?
1. Create a deployment in the overcluster.
2. Create a PDB in the overcluster that sets maxUnavailable.
3. Check the status of the PDB in the undercluster.
Anything else we need to know?
No response
Host cluster Kubernetes version
$ kubectl version
# paste output here
Host cluster Kubernetes distribution
# Write here
vcluster version
$ vcluster --version
# paste output here
vcluster Kubernetes distribution (k3s (default), k8s, k0s)
# Write here
OS and Arch
OS:
Arch:
@kfox1111 thanks for creating this issue! We'll take a look at this pretty soon.
any updates?
Hi, this made it to the top of my work queue, but was replaced by a slightly more urgent task. Will be working on this in the coming weeks :)
Any progress on this issue?
Any progress on this issue?
Hi, we've talked about a couple of approaches, but I haven't managed to get to the implementation yet!
@rohantmp I've noticed that minAvailable also fails with the same error if we use a percentage instead of an integer value (e.g. 50% instead of 3).
I also found that if I have two or more similar PDBs in different namespaces inside the vcluster, then after syncing they have exactly the same selector and namespace when viewed from the main cluster (we don't use the multi-namespace mode). This leads to eviction errors like this one:
error when evicting pods/"test-v4-5449cf6559-rbf5m-x-test-service-x-test" -n "vc-test": This pod has more than one PodDisruptionBudget, which the eviction subresource does not support.