Drift correction not working

Open lindhe opened this issue 1 year ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

If an object is changed, Fleet detects the diff but does nothing to converge to a healthy state.

Expected Behavior

When spec.correctDrift.enabled=true, I expect Fleet to try to apply changes as soon as there is a diff.

Steps To Reproduce

  1. Have Rancher v2.8.1 installed.

  2. In Rancher, click "Continuous Delivery" and "Git Repos" and select the "fleet-local" workspace.

  3. Add a GitRepo that applies some resource. Make sure to check "Enable Self-Healing" to set spec.correctDrift.enabled=true in the bundle.

  4. Wait for the GitRepo to sync and become healthy, with the new resource created and in state "Ready".

  5. Edit the resource using kubectl edit, e.g. delete a label.

  6. Observe new state "Modified" for the resource:

    Screenshot 2024-05-16 171752
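
For anyone reproducing this without the Rancher UI, the following is a minimal sketch of a GitRepo manifest with drift correction enabled, generated as JSON (which kubectl accepts). It assumes the `fleet.cattle.io/v1alpha1` API version and that the "Enable Self-Healing" checkbox maps to `spec.correctDrift.enabled` on the GitRepo. The name `drift-demo`, the repository URL and the `manifests` path are hypothetical placeholders, not values from this report.

```python
import json


def gitrepo_manifest(name, repo_url, paths, namespace="fleet-local"):
    """Build a GitRepo manifest as a dict, with drift correction enabled.

    Assumption: correctDrift.enabled on the GitRepo spec is what the
    "Enable Self-Healing" checkbox sets.
    """
    return {
        "apiVersion": "fleet.cattle.io/v1alpha1",
        "kind": "GitRepo",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "repo": repo_url,
            "paths": paths,
            "correctDrift": {"enabled": True},
        },
    }


# Hypothetical values: replace with a real repository and path.
manifest = gitrepo_manifest(
    "drift-demo",
    "https://example.com/placeholder/repo.git",
    ["manifests"],
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, after which steps 4 to 6 apply unchanged.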

Environment

- Architecture: amd64
- Fleet Version: The one that's bundled with Rancher v2.8.1. 
- Cluster:
  - Provider: RKE2
  - Options: 3 nodes upstream cluster
  - Kubernetes Version: 1.27.9

Logs

No response

Anything else?

It looks like https://github.com/rancher/fleet/pull/1594 tried to implement drift correction, but it's clearly not working.

lindhe, May 16 '24 16:05

Probably related to https://github.com/rancher/fleet/issues/2551

manno, Jul 03 '24 13:07

I'm seeing the same behavior with Rancher 2.8.5 / Fleet 0.9.8.

image

jhoblitt, Sep 03 '24 22:09

I've reproduced the problem with Rancher 2.9.1 / Fleet 0.10.1 as well:

image

The fleet-agent logs on the cluster are the same messages repeated over and over again. E.g.:

{"level":"info","ts":"2024-09-04T22:15:40Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"c6f34409-8c39-4bfa-bb86-ff79d3028f46"}
{"level":"info","ts":"2024-09-04T22:15:41Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"ae91a7c9-9d4e-4ed4-988a-eeb61c5d015d","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1033","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:42Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"ae91a7c9-9d4e-4ed4-988a-eeb61c5d015d","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:42Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"ae91a7c9-9d4e-4ed4-988a-eeb61c5d015d"}
{"level":"info","ts":"2024-09-04T22:15:43Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","reconcileID":"c1678532-73f5-4929-9985-e50db5603133","deploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","release":"rook-ceph/rook-ceph-cluster:2","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055"}
{"level":"info","ts":"2024-09-04T22:15:43Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"1407fd06-6c0a-4e3a-98a8-baf4bff370da","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1034","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:44Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"1407fd06-6c0a-4e3a-98a8-baf4bff370da","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:44Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"1407fd06-6c0a-4e3a-98a8-baf4bff370da"}
{"level":"info","ts":"2024-09-04T22:15:45Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","reconcileID":"494aa1f3-8e09-4790-ac4a-7acc6bc34b7f","deploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","release":"rook-ceph/rook-ceph-cluster:2","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055"}
{"level":"info","ts":"2024-09-04T22:15:45Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"3f0b30ce-a868-48e0-95aa-1ce063f78d48","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1035","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:46Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"3f0b30ce-a868-48e0-95aa-1ce063f78d48","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:46Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"3f0b30ce-a868-48e0-95aa-1ce063f78d48"}
{"level":"info","ts":"2024-09-04T22:15:47Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"9cd5479d-4754-4c7b-909f-3acde128815b","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1036","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:48Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"9cd5479d-4754-4c7b-909f-3acde128815b","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:48Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"9cd5479d-4754-4c7b-909f-3acde128815b"}
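
The loop in these logs is easier to see when the messages are counted per BundleDeployment. The following is a minimal sketch that parses such JSON log lines and reports how often each message occurs, plus the number after the colon in the `release` field (which appears to be the Helm release revision). The `summarize` helper is hypothetical, and the sample lines are abbreviated copies of the entries above that keep only the `msg`, `name` and `release` fields.

```python
import json
from collections import Counter


def summarize(lines):
    """Count (bundle name, message) pairs and collect release revisions per bundle."""
    counts = Counter()
    revisions = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        name = entry.get("name", "<unknown>")
        counts[(name, entry.get("msg", ""))] += 1
        release = entry.get("release")
        if release and ":" in release:
            # "rook-ceph/rook-ceph-conf:1033" -> 1033
            revisions.setdefault(name, []).append(int(release.rsplit(":", 1)[1]))
    return counts, revisions


# Abbreviated sample taken from the log excerpt (most fields omitted).
BUNDLE = "ruka-fleet-s-dev-c-ruka-rook-ceph-conf"
SAMPLE = [
    json.dumps({"msg": "Drift correction: rollback", "name": BUNDLE}),
    json.dumps({"msg": "Deployed bundle", "name": BUNDLE,
                "release": "rook-ceph/rook-ceph-conf:1033"}),
    json.dumps({"msg": "Status not ready", "name": BUNDLE}),
    json.dumps({"msg": "Drift correction: rollback", "name": BUNDLE}),
    json.dumps({"msg": "Deployed bundle", "name": BUNDLE,
                "release": "rook-ceph/rook-ceph-conf:1034"}),
]

counts, revisions = summarize(SAMPLE)
for (name, msg), n in sorted(counts.items()):
    print(f"{name}: {msg!r} x{n}")
print(revisions)
```

Run against the full excerpt, the `rook-ceph-conf` release number climbs from 1033 to 1036 in about five seconds while the same "modified" error recurs, which suggests each rollback creates a new release without removing the reported diff.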

jhoblitt, Sep 04 '24 22:09

I have tried, and failed, to reproduce this against the current main by deleting a label on a config map. This needs further investigation. Could you share an example of a workload (GitRepo), or a known manifest or chart, which triggers this failure?

In any case, #2917 should reduce the noise compared to the logs shared above.

weyfonk, Oct 07 '24 11:10

Cleaning up the backlog.

manno, Oct 23 '24 13:10