[SURE-9007] Pod Disruption Budgets
Fleet should support PriorityClasses (PCs) and PodDisruptionBudgets (PDBs). They should be taken from the existing AgentDeploymentCustomization struct.
See "RFC: Cattle Cluster Agent Priority Class And Pod Disruption Budgets".
@manno does this require UI changes?
Reference implementation in Rancher https://github.com/rancher/rancher/issues/48995
QA Template
- Set up Fleet on a cluster. Rancher is not required.
- Edit the Fleet cluster resource, add spec.agentSchedulingCustomization like so:

      agentSchedulingCustomization:
        priorityClass:
          value: 777
        podDisruptionBudget:
          minAvailable: "1"

  This should result in the creation of a PriorityClass named fleet-agent-priority-class and a PodDisruptionBudget with name fleet-agent-pod-disruption-budget.
- Make sure the Agent's Deployment was updated and that the pod is successfully running. If a spec.priorityClassName is configured and used in a Deployment before the PriorityClass is actually created, the Pod will hang. This must not happen.
- Check that the values of fleet-agent-priority-class and fleet-agent-pod-disruption-budget match the specification in the cluster resource.
- Check that the selectors of fleet-agent-pod-disruption-budget correctly point to the Fleet agent deployment.
- Check that the spec.priorityClassName field of the Fleet agent deployment correctly points to the PriorityClass. The value for spec.Priority in the Fleet agent deployment should reflect the configured value for priorityClass in the Fleet cluster resource.
- Delete the agentSchedulingCustomization field and ensure that the PriorityClass and PodDisruptionBudget resources have been removed and that the Deployment of the Fleet agent does not contain a reference to spec.priorityClassName.

Optionally extend testing for other configurable values of the spec.agentSchedulingCustomization field (podDisruptionBudget.maxUnavailable or priorityClass.preemptionPolicy). Setting both minAvailable and maxUnavailable on a PodDisruptionBudget is supposed to prevent the Fleet agent from being updated, as those values are mutually exclusive and cannot both be set on a PodDisruptionBudget.
Every change in the cluster resource is supposed to redeploy the downstream (or local) agent, with the previous PriorityClass and PodDisruptionBudget being deleted and recreated. If a value is configured for PriorityClass, a reference to the PriorityClass must exist in the Fleet agent Deployment.
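The checks in the template above could be scripted roughly as follows. This is only a sketch, not part of Fleet: the function name `check_agent_scheduling` is hypothetical, and it assumes the resources have already been fetched and parsed into plain dictionaries (for example from JSON output). The field paths used (`spec.template.spec.priorityClassName`, `spec.selector.matchLabels`, `value`) are the standard Kubernetes ones; the resource names are the ones given in the template.

```python
# Hypothetical helper, not part of Fleet. Resources are plain dicts,
# e.g. parsed from JSON. Pass None for a resource that does not exist.
PC_NAME = "fleet-agent-priority-class"
PDB_NAME = "fleet-agent-pod-disruption-budget"


def check_agent_scheduling(customization, priority_class, pdb, deployment):
    """Return a list of mismatches; an empty list means all checks passed."""
    problems = []
    template = deployment["spec"]["template"]
    pod_spec = template["spec"]
    pod_labels = template.get("metadata", {}).get("labels", {})

    if customization is None:
        # Field deleted: both resources and the reference must be gone.
        if priority_class is not None:
            problems.append("PriorityClass still exists")
        if pdb is not None:
            problems.append("PodDisruptionBudget still exists")
        if "priorityClassName" in pod_spec:
            problems.append("Deployment still references a PriorityClass")
        return problems

    wanted_pc = customization.get("priorityClass")
    if wanted_pc is not None:
        if priority_class is None:
            problems.append("PriorityClass missing")
        elif priority_class["metadata"]["name"] != PC_NAME:
            problems.append("unexpected PriorityClass name")
        elif priority_class["value"] != wanted_pc["value"]:
            problems.append("PriorityClass value differs from Cluster spec")
        if pod_spec.get("priorityClassName") != PC_NAME:
            problems.append("Deployment does not reference the PriorityClass")

    wanted_pdb = customization.get("podDisruptionBudget")
    if wanted_pdb is not None:
        if pdb is None:
            problems.append("PodDisruptionBudget missing")
        else:
            if pdb["metadata"]["name"] != PDB_NAME:
                problems.append("unexpected PodDisruptionBudget name")
            # The Cluster resource holds strings ("1"); the PDB may hold ints.
            for key in ("minAvailable", "maxUnavailable"):
                if key in wanted_pdb and str(pdb["spec"].get(key)) != str(wanted_pdb[key]):
                    problems.append(f"PodDisruptionBudget {key} differs from Cluster spec")
            # matchLabels select a pod only if every pair is present on it.
            match = pdb["spec"].get("selector", {}).get("matchLabels", {})
            if any(pod_labels.get(k) != v for k, v in match.items()):
                problems.append("PodDisruptionBudget selector does not match agent pods")
    return problems
```

Passing `None` as `customization` covers the deletion step: the function then only reports leftovers (a remaining PriorityClass, PodDisruptionBudget, or `priorityClassName` reference).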
Verified in Rancher 2.13.0-alpha2 with Fleet 0.14.0-alpha.3
Overall, this works as expected when testing under normal conditions with a single cluster + 1 downstream cluster.
Tested
2- Checked that the PriorityClass has value 777 and the PDB has minAvailable: "1" with the values below:
agentSchedulingCustomization:
  priorityClass:
    value: 777
  podDisruptionBudget:
    minAvailable: "1"
3- Verified the agent deployment is updated when the values are changed to priorityClass value 333 and PDB minAvailable: "4":
agentSchedulingCustomization:
  priorityClass:
    value: 333
  podDisruptionBudget:
    minAvailable: "4"
4- Verified selectors of fleet-agent-pod-disruption-budget correctly point to fleet agent deployment:
spec:
  minAvailable: 4
  selector:
    matchLabels:
      app: fleet-agent
5- Verified the priorityClassName field of the Fleet agent deployment correctly points to the PriorityClass.
6- Verified that after deletion of agentSchedulingCustomization the PC and PDB resources are removed and the Fleet agent deployment no longer has any reference to priorityClassName. Verified that the Fleet agent gets updated.
https://github.com/user-attachments/assets/82c97eb4-ae4d-4069-9877-9be068d256ce
7- Verified the above points also work in downstream clusters
8- Faulty values result in an error:
agentSchedulingCustomization:
  priorityClass:
    value: 21
  podDisruptionBudget:
    minAvailable: "b"
However, the UI failed to display an explanation of the error; the explanation can only be found in the status of the YAML.
I will open a separate issue for this.
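For reference, the faulty-value case in step 8 and the mutually exclusive minAvailable/maxUnavailable case from the template can be illustrated with a small validation sketch. This is not Fleet's or Kubernetes' actual code: `parse_int_or_percent` and `validate_pdb` are hypothetical names, and the parsing rule (a string of digits becomes an integer, a string like "50%" stays a percentage, anything else is rejected) is an assumption that happens to be consistent with the behaviour seen above, where "4" in the cluster resource shows up as the integer 4 in the PDB spec and "b" ends in an error.

```python
import re

_INTEGER = re.compile(r"[0-9]+")
_PERCENT = re.compile(r"[0-9]+%")


def parse_int_or_percent(raw):
    """Return an int, or a percentage string such as '50%'.

    Raises ValueError for anything else (assumed rule, see lead-in).
    """
    if _INTEGER.fullmatch(raw):
        return int(raw)
    if _PERCENT.fullmatch(raw):
        return raw
    raise ValueError(f"{raw!r} must be an integer or a percentage (e.g. '5%')")


def validate_pdb(pdb):
    """Validate a podDisruptionBudget mapping whose values are strings."""
    errors = []
    if "minAvailable" in pdb and "maxUnavailable" in pdb:
        errors.append("minAvailable and maxUnavailable are mutually exclusive")
    for key in ("minAvailable", "maxUnavailable"):
        if key in pdb:
            try:
                parse_int_or_percent(pdb[key])
            except ValueError as exc:
                errors.append(f"{key}: {exc}")
    return errors
```

With this sketch, `validate_pdb({"minAvailable": "b"})` reports one error for the `minAvailable` key, which is the kind of message one would expect the UI to surface rather than leaving it only in the resource status.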