Unstable gitrepo ready condition for multiple non-ready bundles
Is there an existing issue for this?
- [x] I have searched the existing issues
Current Behavior
When more than one bundle is in the "ready: false" state, GitRepo objects are updated frequently (each update creating a new resource version), with the Ready message switching to that of a different bundle each time. This causes unnecessary activity on the control plane, eventually slowing down the Rancher Continuous Delivery UI.
Expected Behavior
The GitRepo status should stay the same as long as the bundle states do not change.
Steps To Reproduce
- Install Rancher v2.10.3, running Fleet v0.11.4
- Create a GitRepo with more than one bundle in error state
    kind: GitRepo
    apiVersion: fleet.cattle.io/v1alpha1
    metadata:
      name: bundledependency
      namespace: fleet-default
    spec:
      repo: https://github.com/rbreddy/bundledependency
      branch: main
      targets:
      - clusterSelector:
          matchExpressions:
          - key: provider.cattle.io
            operator: NotIn
            values:
            - harvester
- To analyze it: leave it running for a while and then watch gitrepos, e.g.:
    for kind in gitrepos bundles bundledeployments; do
      {
        kubectl get -A --show-managed-fields --chunk-size=0 --watch-only --output-watch-events -o yaml $kind >$kind-watch-only-events.yaml &
        pid=$!
        sleep 180
        kill $pid
      } &
    done
    wait
- Parse the gitrepos-watch-only-events.yaml with: https://gist.github.com/aruiz14/b58fcc96fde894cbf85562e888d8e1bd
Environment
- Architecture: amd64
- Fleet Version: v0.11.4
Logs
Example of frequent updates on the message
@@ -100,8 +100,8 @@
status:
commit: 124109ec64e6c2ef5de39cd7a704bda6e2d4b49e
conditions:
- - lastUpdateTime: "2025-03-27T09:47:24Z"
- message: 'ErrApplied(1) [Cluster fleet-default/downstream-0-0: list bundledeployments: no bundles matching labels fleet.cattle.io/bundle-name=logging-logging-crd,fleet.cattle.io/bundle-namespace=fleet-default in namespace fleet-default]'
+ - lastUpdateTime: "2025-03-27T09:46:20Z"
+ message: 'ErrApplied(1) [Cluster fleet-default/downstream-0-0: cannot patch "nginx-deployment" with kind Deployment: Deployment.apps "nginx-deployment" is invalid: [spec.template.metadata.labels: Invalid value: map[string]string{"app":"nginx-rancher"}: `selector` does not match template `labels`, spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"nginx-rancher123"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable]]'
status: "False"
type: Ready
- lastUpdateTime: "2025-03-27T09:30:02Z"
--- /dev/fd/63 2025-03-27 12:57:55.178285163 +0100
+++ /dev/fd/62 2025-03-27 12:57:55.178285163 +0100
@@ -81,10 +81,10 @@
manager: fleetcontroller
operation: Update
subresource: status
- time: "2025-03-27T11:34:40Z"
+ time: "2025-03-27T11:34:56Z"
name: rafa
namespace: fleet-default
- resourceVersion: "47504"
+ resourceVersion: "47577"
uid: 1786f5c8-2233-48e7-b5fc-f00625245612
spec:
branch: main
@@ -123,7 +123,7 @@
readyBundleDeployments: 2/7
state: ErrApplied
gitJobStatus: Current
- lastPollingTriggered: "2025-03-27T11:34:40Z"
+ lastPollingTriggered: "2025-03-27T11:34:55Z"
observedGeneration: 3
readyClusters: 0
resourceCounts:
--- /dev/fd/63 2025-03-27 12:57:55.246285497 +0100
+++ /dev/fd/62 2025-03-27 12:57:55.246285497 +0100
@@ -84,7 +84,7 @@
time: "2025-03-27T11:34:56Z"
name: rafa
namespace: fleet-default
- resourceVersion: "47577"
+ resourceVersion: "47578"
uid: 1786f5c8-2233-48e7-b5fc-f00625245612
spec:
branch: main
@@ -100,8 +100,8 @@
status:
commit: 124109ec64e6c2ef5de39cd7a704bda6e2d4b49e
conditions:
- - lastUpdateTime: "2025-03-27T09:46:20Z"
- message: 'ErrApplied(1) [Cluster fleet-default/downstream-0-0: cannot patch "nginx-deployment" with kind Deployment: Deployment.apps "nginx-deployment" is invalid: [spec.template.metadata.labels: Invalid value: map[string]string{"app":"nginx-rancher"}: `selector` does not match template `labels`, spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"nginx-rancher123"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable]]'
+ - lastUpdateTime: "2025-03-27T09:47:40Z"
+ message: 'ErrApplied(1) [Cluster fleet-default/downstream-0-0: list bundledeployments: no bundles matching labels fleet.cattle.io/bundle-name=longhorn-crd,fleet.cattle.io/bundle-namespace=fleet-default in namespace fleet-default]'
status: "False"
type: Ready
- lastUpdateTime: "2025-03-27T09:30:02Z"
Anything else?
No response
/backport v2.11.1
/backport v2.10.5
QA Template
Solution
Sort Bundles before selection.
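As an illustration only, the idea of sorting before selection can be sketched as follows. This is not the actual Fleet code: the `bundleState` type and the `readyMessage` function are hypothetical, simplified stand-ins, and sorting by bundle name is an assumption about the ordering key. The point is that once the candidates are sorted, the same non-ready bundle is picked regardless of the order in which bundles were listed.

```go
package main

import (
	"fmt"
	"sort"
)

// bundleState is a hypothetical, simplified stand-in for a Bundle's status.
type bundleState struct {
	Name    string
	Ready   bool
	Message string
}

// readyMessage returns the message of the first non-ready bundle after
// sorting a copy of the input by name, so the result does not depend on
// the order of the input slice. It returns "" when all bundles are ready.
func readyMessage(bundles []bundleState) string {
	sorted := make([]bundleState, len(bundles))
	copy(sorted, bundles)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	for _, b := range sorted {
		if !b.Ready {
			return b.Message
		}
	}
	return ""
}

func main() {
	a := []bundleState{
		{Name: "nginx", Ready: false, Message: "cannot patch nginx-deployment"},
		{Name: "logging-crd", Ready: false, Message: "no bundles matching logging-crd"},
		{Name: "ok", Ready: true},
	}
	// Same bundles, listed in reverse order.
	b := []bundleState{a[2], a[1], a[0]}
	fmt.Println(readyMessage(a) == readyMessage(b))
}
```

Without the sort, the two calls in `main` would return different messages, which is the kind of flapping shown in the diffs above.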
Testing
(from the reproduction steps in the description)
- Create a GitRepo with more than one bundle in error state
    kind: GitRepo
    apiVersion: fleet.cattle.io/v1alpha1
    metadata:
      name: bundledependency
      namespace: fleet-default
    spec:
      repo: https://github.com/rbreddy/bundledependency
      branch: main
      targets:
      - clusterSelector:
          matchExpressions:
          - key: provider.cattle.io
            operator: NotIn
            values:
            - harvester
- Once deployed, disable polling to avoid possible noise.
- Watch the GitRepo's Ready condition (status.conditions): after these changes, it should be stable and always mention the same Bundle as the cause for not being ready.
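One possible way to make the stability check concrete is to collect the Ready condition message from each observed GitRepo update and count how often it changes. The sketch below assumes the messages have already been extracted into a slice (how they are extracted is left open), and `countMessageChanges` is a hypothetical helper name. A stable GitRepo should yield zero changes.

```go
package main

import "fmt"

// countMessageChanges reports how many times two consecutive observed
// messages differ. Zero means the message never changed.
func countMessageChanges(observed []string) int {
	changes := 0
	for i := 1; i < len(observed); i++ {
		if observed[i] != observed[i-1] {
			changes++
		}
	}
	return changes
}

func main() {
	// Illustrative, shortened messages; not real log output.
	flapping := []string{"nginx", "logging-crd", "nginx", "longhorn-crd"}
	stable := []string{"logging-crd", "logging-crd", "logging-crd"}
	fmt.Println(countMessageChanges(flapping), countMessageChanges(stable))
}
```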
After following steps from comment: https://github.com/rancher/fleet/issues/3484#issuecomment-2915442721