kustomize-controller

Prevent two kustomizations from managing the same object

Open BeryJu opened this issue 1 year ago • 5 comments

While migrating some clusters to Flux, we accidentally had some deployments managed by two Kustomizations. Each one specified a different version in the deployment, so every 10 minutes, when Flux reconciled kustomizationA, it would do a rolling restart to versionA, then switch to versionB when kustomizationB reconciled.

The kustomize-controller could check the kustomize.toolkit.fluxcd.io/name label and, if it's set and not equal to the current Kustomization, throw an error.
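To make the proposal concrete, here is a minimal sketch of that label check in Python. It is an illustration only, not how kustomize-controller is implemented: `check_owner` and `OwnershipConflict` are hypothetical names, and the object is assumed to be a manifest already parsed into a plain dict.

```python
# Label named in the proposal above.
OWNER_LABEL = "kustomize.toolkit.fluxcd.io/name"


class OwnershipConflict(Exception):
    """Raised when an object is already labelled by a different Kustomization."""


def check_owner(obj: dict, current: str) -> None:
    """Raise if obj carries the owner label with a value other than `current`.

    Hypothetical helper: `obj` is a parsed manifest, `current` is the name of
    the Kustomization that is about to apply it.
    """
    metadata = obj.get("metadata") or {}
    labels = metadata.get("labels") or {}
    owner = labels.get(OWNER_LABEL)
    # An unlabelled object has no owner yet, so it is allowed through.
    if owner is not None and owner != current:
        kind = obj.get("kind", "Object")
        name = metadata.get("name", "<unnamed>")
        raise OwnershipConflict(
            f"{kind}/{name} is managed by {owner!r}, not {current!r}"
        )
```

The sketch compares the Kustomization name only, exactly as proposed; it does not try to tell a genuine conflict apart from an object that is deliberately being moved to another Kustomization.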

Running Flux v0.28.4, kustomize-controller v0.22.2 (but I don't think this behaviour changes in later versions), Kubernetes v1.21.14 (kubeadm).

P.S. I'm also torn on whether this is something Flux should check or care about; maybe it should be solved with a CI check or a CLI tool instead.
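A CI check along those lines could be sketched as follows. This is a hedged illustration, not an existing Flux tool: it assumes each Kustomization has already been rendered (for example with `kustomize build`) and parsed into a list of dicts, and `object_id` and `find_shared_objects` are hypothetical names.

```python
from collections import defaultdict


def object_id(obj: dict) -> tuple:
    """Identify an object by API group, kind, namespace and name."""
    api_version = obj.get("apiVersion", "")
    # "apps/v1" -> group "apps"; core objects such as "v1" have an empty group.
    group = api_version.split("/")[0] if "/" in api_version else ""
    metadata = obj.get("metadata") or {}
    return (group, obj.get("kind", ""), metadata.get("namespace", ""), metadata.get("name", ""))


def find_shared_objects(rendered: dict) -> dict:
    """Return the objects rendered by more than one Kustomization.

    `rendered` maps a Kustomization name to its list of parsed manifests.
    The result maps each shared object identity to the sorted names of the
    Kustomizations that render it.
    """
    owners = defaultdict(set)
    for kustomization, objects in rendered.items():
        for obj in objects:
            owners[object_id(obj)].add(kustomization)
    return {oid: sorted(names) for oid, names in owners.items() if len(names) > 1}
```

A CI job could fail the build whenever `find_shared_objects` returns a non-empty dict, which catches the overlap before it reaches the cluster and does not depend on labels set at runtime.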

BeryJu avatar Sep 27 '22 12:09 BeryJu

> The kustomize-controller could check the kustomize.toolkit.fluxcd.io/name label and, if it's set and not equal to the current Kustomization, throw an error.

IMO such a thing would make Flux unusable. How would you restructure repos? Would you delete the workloads from production just to be able to move them around in Git?

stefanprodan avatar Sep 27 '22 12:09 stefanprodan

Yeah, that's a good point; it's kinda required that Flux adopts that label when things are moved around.

Maybe a better approach would be to have a Prometheus metric with the number of objects per Kustomization? Although I don't think that would fully fix this either, as the metric would only be updated when a Kustomization reconciles and so would not always be accurate.

Or maybe a metric that shows how many objects have been updated by a Kustomization. With that, alerting would be a lot easier: if the value is continuously high instead of spiking only when updates are pushed, then something is not right.
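The "continuously high instead of spiking" rule can be sketched as a simple check over recent reconciliations. The metric itself is hypothetical (Flux does not expose it in this thread), and so are the function name `is_reconcile_loop` and the default window of three reconciliations.

```python
def is_reconcile_loop(changes_per_reconcile: list, window: int = 3) -> bool:
    """Return True if each of the last `window` reconciliations changed an object.

    `changes_per_reconcile` holds the number of objects a Kustomization
    created, updated or deleted in each reconciliation, oldest first.
    A normal rollout shows a single spike followed by zeros; a fight between
    two Kustomizations shows changes on every run.
    """
    if window <= 0 or len(changes_per_reconcile) < window:
        # Not enough history to decide.
        return False
    return all(count > 0 for count in changes_per_reconcile[-window:])
```

With a 10-minute reconcile interval, a window of three would flag the problem after roughly half an hour, and a one-off push that changes objects once would not trigger it.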

BeryJu avatar Sep 27 '22 12:09 BeryJu

If you set up Flux notifications, it would be obvious what’s going on, as you’ll receive two notifications for the same resources from two different Kustomizations.

stefanprodan avatar Sep 27 '22 12:09 stefanprodan

Good idea. Despite having contributed to the notification controller, I didn't think to use it like this.

For the metric, how would you feel about a PR that adds such a metric for Kustomizations (i.e. a count of objects updated/created/deleted, refreshed after each reconciliation)?

BeryJu avatar Sep 27 '22 13:09 BeryJu

All Flux CRDs expose a common set of metrics. We'd need an RFC to introduce custom metrics for each controller, and it would have to show what value this adds relative to the maintenance burden.

stefanprodan avatar Sep 27 '22 14:09 stefanprodan

We solved this issue by using the notification controller, which shows us if any objects are in a reconciliation loop. I'll close this issue.

P.S. Thanks for your great work on Flux, @stefanprodan.

BeryJu avatar Nov 07 '22 13:11 BeryJu