helm-controller icon indicating copy to clipboard operation
helm-controller copied to clipboard

Changing targetNamespace causes orphaned HelmRelease

Open mbrancato opened this issue 3 years ago • 5 comments

Describe the bug

If after deploying a HelmRelease object the spec.targetNamespace is later changed, the flux controller will create a new deployment, etc in the newly specified namespace. However, it fails to delete the old components. If the HelmRelease is later deleted, Flux orphans the original deployment and does not delete it.

To Reproduce

Steps to reproduce the behaviour:

Install a HelmRelease:

---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: myapp
  namespace: flux-system
spec:
  chart:
    spec:
      chart: myapp
      sourceRef:
        kind: HelmRepository
        name: myrepo
      version: 0.1.0
  interval: 1m0s
  targetNamespace: mynamespace

Commit the config to git and let Flux deploy.

Later, change the spec.targetNamespace value:

---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: myapp
  namespace: flux-system
spec:
  chart:
    spec:
      chart: myapp
      sourceRef:
        kind: HelmRepository
        name: myrepo
      version: 0.1.0
  interval: 1m0s
  targetNamespace: mynewnamespace

Commit to git or apply ad-hoc to the cluster.

Expected behavior

Flux should create the deployment components in the new namespace, but it should also delete the components from the old namespace.

Additional context

I first hit this with v0.6.1, but I upgraded to v0.9.0 and the issue still exists.

  • Kubernetes version: 1.17.15
  • Git provider: Github
  • Container registry provider: GCR

Below please provide the output of the following commands:

flux --version
flux check
kubectl -n <namespace> get all
kubectl -n <namespace> logs deploy/source-controller
kubectl -n <namespace> logs deploy/kustomize-controller
% flux --version
flux version 0.9.0
% flux check
► checking prerequisites
✔ kubectl 1.19.3 >=1.18.0-0
✔ Kubernetes 1.17.15-gke.800 >=1.16.0-0
► checking controllers
✔ helm-controller: healthy
► ghcr.io/fluxcd/helm-controller:v0.8.0
✔ kustomize-controller: healthy
► ghcr.io/fluxcd/kustomize-controller:v0.9.1
✔ notification-controller: healthy
► ghcr.io/fluxcd/notification-controller:v0.9.0
✔ source-controller: healthy
► ghcr.io/fluxcd/source-controller:v0.9.0
✔ all checks passed

mbrancato avatar Feb 26 '21 15:02 mbrancato

Do you expect the garbage collection to happen before or after the install in the other namespace? I can imagine the effect can be quite different for some charts.

hiddeco avatar Mar 18 '21 22:03 hiddeco

Ideally it would create the new resources in the new namespace then delete the ok'd resources. That is only because we presume that the services need to stay available. But I agree that could cause unintended consequences, and presumably if you're changing namespaces you understand that might cause a service disruption. I think the safe way might be to delete the old chart first, then install the new one.

mbrancato avatar Mar 18 '21 23:03 mbrancato

The way this would work is that the <namespace>/<releaseName> would be recorded, and then compared to the config in the HelmRelease every reconciliation. This means that the feature does not just work for namespace changes, but also e.g. releaseName changes.

Given this, I am concerned about the impact it may have when people accidentally edit the wrong fields. It would therefore likely require some flag to be set before the behavior is triggered.

Another concern I have is with replacing a release in the same namespace, if e.g. there is some secret name configured in the spec.values it would likely not be possible to install the release before truncating the other one, as this would result in a rendered manifests contain a resource that already exists error.

hiddeco avatar Mar 19 '21 11:03 hiddeco

This issue was raised in today's Flux 🐛 scrub. Since there's not an obvious answer to how to handle this, I created a discussion from this issue: https://github.com/fluxcd/flux2/discussions/1691

@mbrancato @hiddeco would you be willing to summarize your thoughts there, and we will also ask other end users for additional feedback?

scottrigby avatar Aug 05 '21 19:08 scottrigby

To tackle the change in the targetNamespace field, my opinion is that the best option would be to recognize this change, create the new resources, and then delete the old resources once successfully changed. This only works for all namespaced resources. When it comes to global resources like CRDs, PV, Ingress, etc. - if these are present, they could be left in place. But if there is an unresolved conflict, Flux/helm-controller could always fail and explain to the user they must first delete the existing deployment. Delete before create would also work as an initial support for this prior to doing much inspection of the deployment to identify namespaced vs non-namespaced resources.

It may also be worthwhile to add a field that lets the deployment be configured to delete before create when required. If this field is set, Flux/helm-controller could go ahead and delete the old version and install the new version.

mbrancato avatar Aug 16 '21 11:08 mbrancato