Migration safety: Allow deletion of kustomization without removing managed objects (one level less than spec.prune)

Open cep21 opened this issue 2 years ago • 2 comments

Describe the bug

Our Flux v1 setup is a monorepo with this layout:

/releases/<cluster>/<namespace>/<helm-charts>

For example, we have one file in the path

/releases/prod/nginx-ingress/helmrelease-nginx.yaml

In that path we put a v1 HelmRelease, as well as a namespace.yaml file that creates the namespace itself.

When I installed Flux v2, I followed the directions in the migration guide.

One of those steps is this:

$ flux create kustomization app \
  --source=GitRepository/app \
  --path="./deploy" \
  --prune=true \
  --interval=10m

I ran this command instead, in order to sync the staging cluster.

$ flux create kustomization app \
  --source=GitRepository/app \
  --path="./releases/staging" \
  --prune=true \
  --interval=10m

Prune makes a lot of sense for me, since people have legitimate reasons to add or remove kubernetes objects and helm charts inside the main git repository.

Eventually I wanted to rename this "app" object to something else that makes sense for us (the name of the source git repository).

To do this I had to delete and then recreate the Kustomization object. I had assumed that this would be similar to turning off Flux v1. Unfortunately, the prune documentation contains this sentence: "Garbage collection is also performed when a Kustomization object is deleted, triggering a removal of all Kubernetes objects previously applied on the cluster."

That's a very unfortunate behavior, as it removed everything in our entire cluster.

How do we safely use flux2 in our setup? I would like to manage everything inside git, including the --export output of flux create kustomization app. But it is very dangerous that removing a single object can delete everything in the cluster.
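For reference, a minimal sketch of what keeping the --export output in git could look like. The function name export_kustomization and the output file name in the usage note are hypothetical; the flags are the ones from the commands above, plus --export, which prints the manifest instead of applying it to the cluster.

```shell
# Hypothetical helper: print the Kustomization manifest to stdout so it
# can be committed to git instead of being applied to the cluster.
# $1 = Kustomization name, $2 = path inside the repository.
export_kustomization() {
  flux create kustomization "$1" \
    --source=GitRepository/app \
    --path="$2" \
    --prune=true \
    --interval=10m \
    --export
}
```

It would be used as `export_kustomization app ./releases/staging > app-kustomization.yaml`, with the resulting file then committed alongside the rest of the repository.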

Is it possible to keep the prune behavior of deleting objects when they are removed from git, but not have the removal of the Kustomization object itself delete any objects?

Steps to reproduce

Make a kustomization object then remove it.

Expected behavior

Would prefer not to delete my cluster.

Screenshots and recordings

No response

OS / Distro

EKS

Flux version

flux: v0.26.2

Flux check

$ flux check
► checking prerequisites
✗ flux 0.26.2 <0.26.3 (new version is available, please upgrade)
✔ Kubernetes 1.21.5-eks-bc4871b >=1.20.6-0
► checking controllers
✔ all checks passed

Git provider

No response

Container Registry provider

No response

Additional context

No response

Code of Conduct

  • [x] I agree to follow this project's Code of Conduct

cep21, Feb 10 '22 23:02

If you wish to move objects from one Kustomization to another, create the new Kustomization and suspend the old one so that it no longer reconciles. You should then be able to move the Kubernetes objects from the suspended Kustomization to the new one without any downtime, as the new Kustomization will take ownership of those resources.

(I asked the same question in a discussion here)
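The suspend-and-recreate approach can be sketched as a small shell helper. This is only an illustration: the function name move_to_new_kustomization is hypothetical, the source and flags are copied from the commands in the original report, and suspending before creating the replacement is my own ordering choice so that only one Kustomization reconciles the path at a time.

```shell
# Hypothetical helper: hand a path over from an old Kustomization to a
# new one without deleting anything.
# $1 = old name, $2 = new name, $3 = path inside the repository.
move_to_new_kustomization() {
  # Stop the old Kustomization from reconciling; abort if this fails.
  flux suspend kustomization "$1" || return 1

  # Create the replacement, which then takes over the same objects.
  flux create kustomization "$2" \
    --source=GitRepository/app \
    --path="$3" \
    --prune=true \
    --interval=10m
}
```

For the rename in the original report this would be called as `move_to_new_kustomization app <new-name> ./releases/staging`. The helper deliberately leaves the old object in place. Whether deleting the suspended Kustomization afterwards skips garbage collection is an assumption worth verifying on a non-production cluster first.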

jack-evans, Feb 20 '22 09:02

This would be very helpful for us, simply because we think that even with proper procedures, human error can still occur. For example, there are many ways in which someone could fail to properly migrate, if they misread or fail to see errors in logs, in commits, etc. As a support tech, it's not uncommon for "easy-to-avoid" mistakes to happen when engineers are called at unexpected times but expected to act quickly. It's also a big risk for us to hope that all engineers with access to the Kustomization are aware of the risk (and always remember the risk) of deleting it without a proper migration first.

We've already had one accident where a Kustomization was deleted and all of our stuff, similar to OP, was also deleted. 😬 This was especially unexpected, considering that the --prune flag seemed to suggest behavior similar to Flux 1, which we have used for a long time prior to Flux 2. No one on our team was expecting it to mean that everything is deleted when the Kustomization is deleted. 😰

We could mess with RBAC rules to prevent the deletion of a Kustomization, but this will require an extensive re-work of our RBAC setup. It would be amazing if we could have 2 separate flags, one for garbage collection when resources are deleted from the repo, and one for "garbage collection" when the Kustomization itself is deleted. 😁

I would love to be able to perform this change myself, but I'm not sure that I can, so I wanted to share my thoughts on the topic. 😄

TheKLARKEN, Oct 12 '23 15:10