helm-operator
Constantly creating new releases for charts even when no changes
Describe the bug
Hello.
I am starting to experiment with Flux and the Helm Operator on a new cluster, and everything went fine until I deployed the cert-manager Helm chart.
Each time the sync runs, the Helm Operator performs an upgrade and creates a new release, even though nothing in the chart has changed.
This is causing some instability in my cluster's network (possibly due to the excessive load on the API server caused by the constant updates).
What is strange is that if I run kubectl -n cert-manager get helmreleases.helm.fluxcd.io, the latest update date is still the initial deploy. Nevertheless, a new secret with the Helm release information is created every time, and the pods are restarted.
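For anyone who wants to confirm the same behaviour, these are roughly the commands I used; they assume Helm v3 with its default Secret storage backend, where every release revision is stored as a Secret labelled owner=helm:
$ kubectl -n cert-manager get secrets -l owner=helm   # one Secret per release revision; the list keeps growing
$ helm -n cert-manager history cert-manager           # the revision counter increases on every sync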
To Reproduce
Just a basic helm release manifest:
---
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  releaseName: cert-manager
  chart:
    repository: https://charts.jetstack.io
    name: cert-manager
    version: 0.15.1
  values:
    installCRDs: true
    global:
      leaderElection:
        namespace: cert-manager
    ingressShim:
      defaultIssuerName: letsencrypt-prod
      defaultIssuerKind: ClusterIssuer
    prometheus:
      enabled: false
Expected behavior
cert-manager should only be upgraded when there is an actual change.
Logs
Here is the log output:
ts=2020-06-09T11:27:43.485024459Z caller=helm.go:69 component=helm version=v3 info="checking 31 resources for changes" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.511728151Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ServiceAccount \"cert-manager\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.52932945Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ServiceAccount \"cert-manager-webhook\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.593238042Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"certificaterequests.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.707520153Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"certificates.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.787600969Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"challenges.acme.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.840798706Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"clusterissuers.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.914200336Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"issuers.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.976102766Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"orders.acme.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.006756511Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-issuers\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.170684941Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-clusterissuers\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.186528407Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-certificates\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.206401365Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-orders\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.226948001Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-challenges\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.242952126Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-ingress-shim\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.270264045Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-view\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.288023181Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-edit\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.323437971Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-issuers\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.334162638Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-clusterissuers\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.349806345Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-certificates\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.364670113Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-orders\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.42316236Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-challenges\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.442380377Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-ingress-shim\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.457693993Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Role \"cert-manager:leaderelection\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.471976952Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Role \"cert-manager-webhook:dynamic-serving\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.509698246Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Service \"cert-manager-webhook\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.534244597Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Deployment \"cert-manager\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.550004375Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Deployment \"cert-manager-webhook\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:45.300454292Z caller=helm.go:69 component=helm version=v3 info="updating status for upgraded release for cert-manager" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:46.000638151Z caller=release.go:309 component=release release=cert-manager targetNamespace=cert-manager resource=cert-manager:helmrelease/cert-manager helmVersion=v3 info="upgrade succeeded" revision=0.15.1 phase=upgrade
Sometimes I also see warnings like this one:
warning="failed to annotate release resources: serviceaccount/cert-manager annotated" phase=annotate
I am not exactly sure what this is, but it seems to take some time to run.
Additional context
- Helm Operator version: 1.1.0 (installed with the Helm chart)
- Kubernetes version: 1.16.8 (Digital Ocean)
It seems version 1.1.0 constantly creates new releases for all HelmRelease definitions. Downgrading to 1.0.2 (i.e. the Helm version of helm-operator) resolved the issue for me.
I'm facing a similar issue with the Strimzi Kafka operator. With 1.0.2 it works well, but with 1.1.0 it keeps syncing the chart even though there is no change.
I'm seeing the same issue with version 1.1.0 and cert-manager 0.15.0
We're seeing the same issue with v1.1.0 and cert-manager v0.15.0.
NAME          NAMESPACE     REVISION  UPDATED                                  STATUS    CHART                 APP VERSION
cert-manager  cert-manager  244       2020-06-19 08:20:30.927272479 +0000 UTC  deployed  cert-manager-v0.15.0  v0.15.0
We had the same issue, but in our case it was caused by memory limits that were too low for the helm-operator, which led to Kubernetes restarting it. After a restart, the helm-operator has to download all the charts again, which sets the chart.changed value to true for all charts and causes the Upgrade action to be called instead of the dry-run.
I would prefer if the dry-run action were called here.
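A possible mitigation for this specific cause is giving the operator more headroom; a sketch assuming the helm-operator chart exposes the usual resources values (the numbers are only illustrative, tune them for your cluster):
$ helm repo add fluxcd https://charts.fluxcd.io
$ helm upgrade -i helm-operator fluxcd/helm-operator \
    --namespace fluxcd \
    --reuse-values \
    --set resources.requests.memory=128Mi \
    --set resources.limits.memory=512Mi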
I'm also seeing this problem with some custom charts. Each reconcile iteration seems to create a new release upgrade even though nothing changes.
EDIT: Figured it out after doing some debugging. It looks like if we use a semver range as the chart version, e.g. ~> 2.0, in the release, it breaks the comparison here, since the value of hr.Status.LastAttemptedRevision is not the resolved version of the chart but the raw ~> 2.0 string; the comparison then obviously fails, causing the operator to assume there is always a new version of the chart.
My issue doesn't seem to be related to the cert-manager issue described here, I think it would deserve its own ticket.
EDIT2: Relevant issue #490 (resolved in v1.2.0, my comment can be ignored).
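For anyone on a pre-1.2.0 operator hitting this variant, pinning the chart version in the HelmRelease avoids the broken comparison; a sketch with a hypothetical chart:
chart:
  repository: https://charts.example.com   # hypothetical repository
  name: my-chart
  # version: "~> 2.0"   # semver range: lastAttemptedRevision stores the raw range, so the comparison always fails
  version: 2.0.3        # pinned version: the comparison works as expected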
I am also having this issue with a lot of charts; more info can be found in Slack. Can we have a maintainer chime in, or at least rename this issue? It can happen on any random chart from what I've seen.
I am not seeing the helm-operator pod being restarted as @twendt did.
https://cloud-native.slack.com/archives/CLAJ40HV3/p1597320334119100
@onedr0p I have renamed the issue to be more generic.
Still, we need an official response. This issue has been open for 2 months without any feedback and I think it's quite critical. The constant releases killed my cluster.
I want to fully dive into GitOps, but this issue being open for so long without any feedback doesn't give me much confidence.
@stefanprodan, can some maintainer look at this, please?
I too observed this today, up to revision 2291(!) of a HelmRelease controlled by the Helm Operator (v1.2.0).
Same here. I have just completed the 'get-started' tutorial and the demo apps are upgraded every 5 minutes:
ts=2020-08-17T02:37:31.935043189Z caller=helm.go:69 component=helm version=v3 info="checking 6 resources for changes" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.94590984Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Secret \"redis\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.955831337Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ConfigMap \"redis\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.97589111Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ConfigMap \"redis-health\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.985966338Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Service \"redis-headless\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.994606993Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Service \"redis-master\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:32.056348166Z caller=helm.go:69 component=helm version=v3 info="updating status for upgraded release for redis" targetNamespace=demo release=redis
ts=2020-08-17T02:37:32.213089037Z caller=release.go:364 component=release release=redis targetNamespace=demo resource=demo:helmrelease/redis helmVersion=v3 info="upgrade succeeded" revision=10.3.1 phase=upgrade
Helm list:
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
mongodb demo 22 2020-08-17 02:40:31.151206363 +0000 UTC deployed mongodb-7.6.3 4.0.14
redis demo 22 2020-08-17 02:40:31.296983141 +0000 UTC deployed redis-10.3.1 5.0.7
Sorry for the late response all, the last few months have been hectic in terms of workload and GitOps Toolkit developments, and I was enjoying time off for the last two weeks.
I tried to reproduce the issue with version 1.2.0 of the Helm Operator with both the Redis HelmRelease example @eklee reported and the cert-manager HelmRelease in the issue, and was unable to observe any spurious upgrades.
@davidholsgrove might it be possible that the revision drift up to 2291 was due to the misbehaving 1.1.0 version, and was thus fixed by #490? If not, can you all please share the Status object of the misbehaving HelmRelease? This would give me better insight into why it may happen.
@hiddeco I've been using helm-operator v1.2.0 since 10th August, and only noticed the runaway upgrades this week. It's occurring in 3 separate k8s clusters (all using fluxcd 1.4.0 / helm-operator 1.2.0, with separate backing Git repos of HelmReleases).
I'm using fixed chart versions, so not the same as https://github.com/fluxcd/helm-operator/issues/469
I've killed the fluxcd and helm-operator pods in each cluster to stop the Helm history from being trashed.
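In case it helps others, scaling the deployments to zero keeps them down until a fix is available (names and namespace are the ones my flux and helm-operator charts installed; adjust for your setup):
$ kubectl -n fluxcd scale deployment/flux --replicas=0
$ kubectl -n fluxcd scale deployment/helm-operator --replicas=0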
Cluster 1
prometheus-operator and helm-operator continually upgrading:
$ helm ls -A
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
flux fluxcd 29 2020-08-09 22:20:49.084079856 +0000 UTC deployed flux-1.4.0 1.20.0
helm-operator fluxcd 1032 2020-08-12 04:25:04.433300225 +0000 UTC deployed helm-operator-1.2.0 1.2.0
prometheus-operator monitoring 497 2020-08-13 07:54:39.642859 +1000 AEST deployed prometheus-operator-9.3.1 0.38.1
Upgrades occurring every 3 minutes (the last one triggered manually, after helm-operator had been stopped for a day):
$ helm -n monitoring history prometheus-operator
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
488 Wed Aug 12 04:01:19 2020 superseded prometheus-operator-9.3.1 0.38.1 Upgrade complete
489 Wed Aug 12 04:04:09 2020 superseded prometheus-operator-9.3.1 0.38.1 Upgrade complete
490 Wed Aug 12 04:07:10 2020 superseded prometheus-operator-9.3.1 0.38.1 Upgrade complete
491 Wed Aug 12 04:10:11 2020 superseded prometheus-operator-9.3.1 0.38.1 Upgrade complete
492 Wed Aug 12 04:13:18 2020 superseded prometheus-operator-9.3.1 0.38.1 Upgrade complete
493 Wed Aug 12 04:16:15 2020 superseded prometheus-operator-9.3.1 0.38.1 Upgrade complete
494 Wed Aug 12 04:19:14 2020 superseded prometheus-operator-9.3.1 0.38.1 Upgrade complete
495 Wed Aug 12 04:22:09 2020 superseded prometheus-operator-9.3.1 0.38.1 Upgrade complete
496 Wed Aug 12 04:25:18 2020 superseded prometheus-operator-9.3.1 0.38.1 Upgrade complete
497 Thu Aug 13 07:54:39 2020 deployed prometheus-operator-9.3.1 0.38.1 Upgrade complete
$ k -n monitoring describe hr prometheus-operator
Name:         prometheus-operator
Namespace:    monitoring
Labels:       fluxcd.io/sync-gc-mark=sha256.OfeOgFbdT4R06Z7OMl2uT9Wjy5tc0SdI9v8JqxaoqeQ
Annotations:  fluxcd.io/sync-checksum: 69091ad8e7fe97e3926e9d2256a4a42f1d87d459
              kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"helm.fluxcd.io/v1","kind":"HelmRelease","metadata":{"annotations":{"fluxcd.io/sync-checksum":"69091ad8e7fe97e3926e9d2256a4a...
API Version:  helm.fluxcd.io/v1
Kind:         HelmRelease
Metadata:
  Creation Timestamp:  2020-07-22T07:04:18Z
  Generation:          3
  Resource Version:    28654611
Spec:
  Chart:
    Name:        prometheus-operator
    Repository:  https://kubernetes-charts.storage.googleapis.com
    Version:     9.3.1
  Helm Version:  v3
  Release Name:  prometheus-operator
  [--snip--]
Status:
  Conditions:
    Last Transition Time:  2020-07-22T07:04:49Z
    Last Update Time:      2020-08-12T04:25:10Z
    Message:               Chart fetch was successful for Helm release 'prometheus-operator' in 'monitoring'.
    Reason:                ChartFetched
    Status:                True
    Type:                  ChartFetched
    Last Transition Time:  2020-08-09T22:21:57Z
    Last Update Time:      2020-08-12T04:25:50Z
    Message:               Release was successful for Helm release 'prometheus-operator' in 'monitoring'.
    Reason:                Succeeded
    Status:                True
    Type:                  Released
  Last Attempted Revision:  9.1.1
  Observed Generation:      3
  Phase:                    Succeeded
  Release Name:             prometheus-operator
  Release Status:           deployed
  Revision:                 9.3.1
Events:  <none>
Cluster 2
gitlab continually upgrading:
$ helm ls -A
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
flux fluxcd 29 2020-07-31 11:11:32.259345348 +0000 UTC deployed flux-1.4.0 1.20.0
gitlab gitlab 2291 2020-08-17 01:24:42.888677627 +0000 UTC deployed gitlab-4.2.4 13.2.4
helm-operator fluxcd 29 2020-08-10 11:23:56.7837415 +1000 AEST deployed helm-operator-1.2.0 1.2.0
$ helm -n gitlab history gitlab
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
2283 Mon Aug 17 01:00:58 2020 superseded gitlab-4.2.4 13.2.4 Upgrade complete
2284 Mon Aug 17 01:03:42 2020 superseded gitlab-4.2.4 13.2.4 Upgrade complete
2285 Mon Aug 17 01:06:42 2020 superseded gitlab-4.2.4 13.2.4 Upgrade complete
2286 Mon Aug 17 01:09:43 2020 superseded gitlab-4.2.4 13.2.4 Upgrade complete
2287 Mon Aug 17 01:12:33 2020 superseded gitlab-4.2.4 13.2.4 Upgrade complete
2288 Mon Aug 17 01:15:38 2020 superseded gitlab-4.2.4 13.2.4 Upgrade complete
2289 Mon Aug 17 01:18:36 2020 superseded gitlab-4.2.4 13.2.4 Upgrade complete
2290 Mon Aug 17 01:21:41 2020 superseded gitlab-4.2.4 13.2.4 Upgrade complete
2291 Mon Aug 17 01:24:42 2020 deployed gitlab-4.2.4 13.2.4 Upgrade complete
2292 Mon Aug 17 01:27:37 2020 pending-upgrade gitlab-4.2.4 13.2.4 Preparing upgrade
$ k -n gitlab describe hr gitlab
Name:         gitlab
Namespace:    gitlab
Labels:       fluxcd.io/sync-gc-mark=sha256.tUcmp_UAo3ET0QtAeI2CG0-lgb2ZYTRDRYWgENMW2xI
Annotations:  fluxcd.io/sync-checksum: 627e765ba04176a6940be115cb3d686eb4b965f3
              kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"helm.fluxcd.io/v1","kind":"HelmRelease","metadata":{"annotations":{"fluxcd.io/sync-checksum":"627e765ba04176a6940be115cb3d6...
API Version:  helm.fluxcd.io/v1
Kind:         HelmRelease
Metadata:
  Creation Timestamp:  2020-07-07T07:57:21Z
  Generation:          15
  Resource Version:    36550979
Spec:
  Chart:
    Name:        gitlab
    Repository:  https://charts.gitlab.io
    Version:     4.2.4
  Helm Version:  v3
  Release Name:  gitlab
  [--snip--]
Status:
  Conditions:
    Last Transition Time:  2020-07-31T11:13:15Z
    Last Update Time:      2020-08-17T01:25:22Z
    Message:               Release was successful for Helm release 'gitlab' in 'gitlab'.
    Reason:                Succeeded
    Status:                True
    Type:                  Released
    Last Transition Time:  2020-07-10T00:41:46Z
    Last Update Time:      2020-08-17T01:27:23Z
    Message:               Chart fetch was successful for Helm release 'gitlab' in 'gitlab'.
    Reason:                ChartFetched
    Status:                True
    Type:                  ChartFetched
  Last Attempted Revision:  4.1.4
  Observed Generation:      15
  Phase:                    ChartFetched
  Release Name:             gitlab
  Release Status:           pending-upgrade
  Revision:                 4.2.4
Events:  <none>
Hi there. I reported https://github.com/fluxcd/helm-operator/issues/469 in 1.1.0, but we are now running into the same issue for all releases in 1.2.0, just like most of the people here reported. Rolling back to 1.0.1 :(
The problem seems to be that LastAttemptedRevision is not set to the right version (it is stuck on an older version), and this value is later used to determine whether the release needs to be upgraded. This gives me sufficient information to work on a fix, but KubeCon is in the way today.
I will try to have a prerelease ready for you by tomorrow.
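If others want to check whether they are hit by the same status drift before the prerelease lands, comparing the desired chart version with the recorded one should show it; the field paths below assume the helm.fluxcd.io/v1 schema and use the prometheus-operator release from the report above:
$ kubectl -n monitoring get hr prometheus-operator \
    -o jsonpath='{.spec.chart.version}{"\n"}{.status.lastAttemptedRevision}{"\n"}'
# A healthy release prints the same version twice; an affected one prints e.g. 9.3.1 followed by 9.1.1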
Still unsuccessful in replicating the issue where the LastAttemptedRevision is not updated, even when I try to trigger it by jumping from 1.0.1 -> 1.1.0 -> 1.2.0 (while performing version upgrades for the HelmRelease in the meantime).
Given that you all seem to have installed the helm-operator using Helm itself, can you please provide me with the output of kubectl get crd helmreleases.helm.fluxcd.io -o yaml? I have a suspicion this may be due to Helm not performing upgrades of CRDs, while a field has been added to the status schema (since >=1.1.0).
(Another option would be to kubectl apply -f https://raw.githubusercontent.com/fluxcd/helm-operator/v1.2.0/deploy/crds.yaml, and see if the problem goes away).
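A rough way to check for that drift is comparing the installed CRD against the v1.2.0 manifest linked above (the diff will contain some harmless server-managed metadata/status noise, so treat it as a sketch):
$ kubectl get crd helmreleases.helm.fluxcd.io -o yaml > installed-crds.yaml
$ curl -sL https://raw.githubusercontent.com/fluxcd/helm-operator/v1.2.0/deploy/crds.yaml -o upstream-crds.yaml
$ diff installed-crds.yaml upstream-crds.yaml
# If the schemas differ, re-apply the upstream definition:
$ kubectl apply -f https://raw.githubusercontent.com/fluxcd/helm-operator/v1.2.0/deploy/crds.yaml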
Thanks @hiddeco - looks like the version of the CRD wasn't updated, which caused the runaway helm-operator upgrades.
The HelmOperator chart option createCRD=true - is that the "right" way to go when we have HelmOperator managing the HelmRelease for HelmOperator?
Previously I had a (stale) version of the CRD in the Git repo my FluxCD enforces. Maybe it would be good if HelmOperator had an init container or some other check, and refused to start if its CRD was the wrong version?
The HelmOperator chart option createCRD=true - is that the "right" way to go when we have HelmOperator managing the HelmRelease for HelmOperator?
The right way is to not install the CRDs using the Helm chart, but apply them manually / synchronize them using Flux (as written out in the installation instructions).
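A minimal sketch of that approach, assuming Flux syncs a manifests/ directory in your config repository (the path is just an example):
$ curl -sL https://raw.githubusercontent.com/fluxcd/helm-operator/v1.2.0/deploy/crds.yaml \
    -o manifests/helm-operator-crds.yaml
$ git add manifests/helm-operator-crds.yaml
$ git commit -m "Update helm-operator CRDs to v1.2.0"
$ git push   # Flux applies the CRD as plain YAML, before reconciling the HelmReleases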
Previously I had a (stale) version of the CRD in the git repo my FluxCD enforces. Would be good if HelmOperator had an initcontainer or other check and refused to start if its CRD was the wrong version maybe?
If possible, that would likely be an improvement, but I do not think it will be implemented at this time (or in the near future) as we are working on a next-gen helm-controller that will eventually replace the Helm operator.
@hiddeco we've been having this issue on v1.1.0 with the correct CRD managed by Flux. Are you saying that this issue affects v1.1.0 and v1.2.0 if the CRD isn't updated?
After applying the CRD for v1.2.0, I am happy to report helm operator v1.2.0 is no longer doing unwarranted releases. I think this issue could be closed.
Thanks @hiddeco !
Just a friendly reminder that Helm is not suitable for managing the lifecycle of Kubernetes CRD controllers. CRDs have to be extracted from charts and applied on the cluster with Flux as plain YAMLs, otherwise the controller version will diverge from its API and that can break production in various ways.
@stefanprodan can you elaborate on how CRDs should be handled? I use cert-manager (the official Helm chart) and the prerequisite is to install its CRDs first, so basically the CRDs are outside the chart and not part of the HelmRelease. Thanks
@talmarco it is described here, although these notes should apply to upgrades too, not only installation:
https://github.com/fluxcd/helm-operator/tree/master/chart/helm-operator#installation
@onedr0p I already have the helm-operator CRD installed. My question was how to handle other CRDs (like cert-manager's), as I understand from @stefanprodan's answer this was the root cause of the charts being constantly upgraded.
When upgrading cert-manager you also need to manually apply the CRDs, or commit them to your repo for Flux to apply. The same goes here.
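For cert-manager that means applying the CRD manifest published for your chart version before (or alongside) the upgrade; the URL below is what I'd expect for the 0.15.1 release discussed here, but please double-check the asset name against the cert-manager docs for your version:
$ kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v0.15.1/cert-manager.crds.yaml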
@stevehipwell 1.1.0 has a bug which was fixed in 1.2.0, which requires an update of the CRD.
Thanks @hiddeco, that was what I thought and what we've seen when testing the v1.2.0 release.
Just a friendly reminder that Helm is not suitable for managing the lifecycle of Kubernetes CRD controllers. CRDs have to be extracted from charts and applied on the cluster with Flux as plain YAMLs, otherwise the controller version will diverge from its API and that can break production in various ways.
@stefanprodan related to the above statement could you confirm that skipCRDs is true by default for the HelmRelease custom resources (the docs don't give the default values)? If not I'd be interested to know why? We've manually turned off all in chart CRD values (looking at you cert-manager) and set skipCRDs to true so that Flux can manage our CRDs; but that was based on our own understanding and not any actual documentation when we made this decision (about the time of the v1.1.0 release of the helm-operator).
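For reference, our releases now look roughly like this, using the cert-manager chart from earlier in this thread as the example (the point is the combination of skipCRDs and the chart's own CRD switch):
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  skipCRDs: true            # never let Helm install CRDs from the chart
  releaseName: cert-manager
  chart:
    repository: https://charts.jetstack.io
    name: cert-manager
    version: 0.15.1
  values:
    installCRDs: false      # chart-specific switch; the CRDs live in Git and are applied by Flux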
@stevehipwell skipCRDs has no effect on upgrades as Helm ignores the crds dir, it only works at install time but only if the CRDs are not already applied on the cluster. If you keep the CRDs in Git, Flux will apply them before the HelmReleases, so skipCRDs is not relevant.
@stefanprodan I get the install only path, but it would be safest if you had to manually opt in to installing CRDs rather than that being the default behaviour.
Changing the skipCRDs default to true is not an option as HelmRelease API is at v1 and that would be a breaking change that requires a major version bump.
We are working on HelmRelease v2 as part of the GitOps Toolkit, such a change could make it in v2. Please start a discussion in the toolkit repo and we can discuss it.
Thanks @stefanprodan, I will do that. Although, out of interest, and not wanting to sound like I'm accusing anyone of anything as I'm honestly just curious: weren't the v1.2.0 CRD changes breaking?