helm-controller
Helm controller has incorrect behavior when dealing with duplicate environment variables
helm-controller allows creating a HelmRelease with a duplicate environment variable; it deploys with just a warning about the duplication.
However, when one of the duplicate values is removed, helm-controller removes all copies of that environment variable.
So if the deployment has:

```yaml
env:
  - name: ihave
    value: "apple"
  - name: ihave
    value: "pen"
```
and values are set to:

```yaml
values:
  env:
    - name: ihave
      value: "apple"
```
the resulting deployment env becomes:

```yaml
env: []
```
The only way to restore the deployment is to delete the manifest and force reconciliation, or to add an additional environment variable.
However, when adding duplicate values to an existing HelmRelease, the new duplicates are ignored. So if the deployment has:

```yaml
env:
  - name: ihave
    value: "apple"
```
and values are set to:

```yaml
values:
  env:
    - name: ihave
      value: "apple"
    - name: ihave
      value: "pen"
```
the resulting deployment env becomes:

```yaml
env:
  - name: ihave
    value: "apple"
```
The behavior should be consistent: helm-controller should either never allow duplicate values, or handle them correctly on removal.
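Both symptoms are consistent with some step in the pipeline treating `name` as a unique key for the env list. The following Python sketch is purely illustrative (it is not the actual Helm or helm-controller code); it only models that suspected failure mode:

```python
# Toy model of the suspected failure mode: treating "name" as a unique key
# for the env list. Illustrative only; NOT actual Helm/helm-controller code.

def dedupe_by_name(env):
    """Keying entries by 'name' silently drops later duplicates."""
    seen = {}
    for entry in env:
        seen.setdefault(entry["name"], entry)  # first entry with a name wins
    return list(seen.values())

def delete_by_name(env, removed):
    """Deleting one duplicate by 'name' takes every copy with it."""
    gone = {entry["name"] for entry in removed}
    return [entry for entry in env if entry["name"] not in gone]

apple = {"name": "ihave", "value": "apple"}
pen = {"name": "ihave", "value": "pen"}

# Symptom 2: a duplicate added to an existing release is ignored.
print(dedupe_by_name([apple, pen]))         # → [{'name': 'ihave', 'value': 'apple'}]

# Symptom 1: removing one duplicate removes all copies.
print(delete_by_name([apple, pen], [pen]))  # → []
```

If something like this is happening, the two behaviors are two faces of the same key-by-name assumption rather than two separate bugs.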
Test case Helm Release
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: bad-deployment
spec:
  releaseName: bad-deployment
  chart:
    spec:
      chart: bad-deployment
      sourceRef:
        kind: GitRepository
        name: bad-deployment
  interval: 5m
  install:
    remediation:
      retries: 3
  values:
    name: bad-deployment-test
    env:
      - name: ihave
        value: "apple"
      - name: ihave
        value: "pen"
```
Relevant part of Helm Chart template
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.name }}
  labels:
    app: {{ .Values.name }}
    release: {{ .Release.Name }}
spec:
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: {{ .Values.name }}
  template:
    metadata:
      labels:
        app: {{ .Values.name }}
        release: {{ .Release.Name }}
    spec:
      containers:
      - name: {{ .Values.name }}
        image: "busybox:latest"
        command: ["/bin/sleep", "10d"]
        env:
        {{- if .Values.env }}
        {{- toYaml .Values.env | nindent 10 }}
        {{- end }}
```
Could you let us know what version of Flux is installed?

I'm looking at this issue and it reminds me of an issue that affected Flux's "new" Server Side Apply when it was still very new, but it's been resolved for some time. It might not be the same issue, but in the interest of ruling it out quickly, what is the output of `flux check`? (It includes the version information.)
Also: was there any error when the HelmRelease change was not successfully applied? (It seems the changes were ignored; maybe an error actually prevented the reconciliation from completing.)
E.g. at this point:

> and values are set to:
>
> ```yaml
> values:
>   env:
>     - name: ihave
>       value: "apple"
> ```
>
> the resulting deployment env becomes:
>
> ```yaml
> env: []
> ```
it seems like the change perhaps hasn't actually been applied. Can you check whether there is any error reconciling the HelmRelease or its parent Kustomization that might explain why this happened?
This could be a bug in Helm's three-way merge, or something else. Thank you for reporting this! I think you've given us enough information here to reproduce it, though it may take some time as we are working on issues for GA as well as KubeCon (but if this is a bug then obviously we'd like to squash it before GA).
> Could you let us know what version of Flux is installed? [...] what is the output of `flux check` (it includes the version information)
Currently on v0.24 of the Helm Controller.
> it seems like the change perhaps hasn't actually been applied, can you check if there is any error reconciling the HelmRelease or parent Kustomization that might explain why this happened?
The test manifests were applied directly with `kubectl`, so kustomize-controller was not involved.

As for the helm-controller log, it shows nothing relevant, simply that reconciliation finished:
| time | message |
| --- | --- |
| 2022-10-06 09:17:23 | `msg="reconcilation finished in 57.883487ms, next run in 5m0s"` |
| 2022-10-06 09:15:30 | `msg="reconcilation finished in 45.435263ms, next run in 5m0s"` |
| 2022-10-06 09:12:11 | `msg="reconcilation finished in 9.175023225s, next run in 5m0s"` |
| 2022-10-06 09:10:43 | `msg="reconcilation finished in 12.716648708s, next run in 5m0s"` |
| 2022-10-06 09:10:30 | `msg="reconcilation finished in 83.638857ms, next run in 5m0s"` |
I wonder if this is a bug in Flux at all, or if it's even a bug in Helm.
We need to see if this behavior reproduces in the Helm CLI if you try a similar thing. I'm wondering if it's somehow part of the three-way merge behavior. You mentioned that you did `kubectl apply` and aren't using kustomize-controller – just to clarify, that could be part of the issue as well. If you did `kubectl apply`, then it's using a three-way merge, and you need to be quite sure the HelmRelease that's represented on the cluster says what you think. It might have merged the values in the array.

If you were using kustomize-controller here: it does not use `kubectl apply` with three-way merge in quite the same way; Flux will actually wipe out any drift (unless you needed three-way merge, and there's a way to opt into it). But in the `kubectl` CLI for Kubernetes, `kubectl apply -f` by default is a three-way merge, and you can get strange behavior from it. Broadly speaking, the server-side apply feature in kustomize-controller fixes this.
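To make that contrast concrete, here is a small Python sketch (purely illustrative, not kubectl's actual patch implementation) of how "merge keyed by name" and "replace" semantics diverge on a list with duplicate names:

```python
# Toy contrast between merge-by-key and replace semantics for an env list.
# Illustrative only; this is NOT kubectl's actual patch code.

def apply_merge_by_name(live, desired):
    """Merge semantics: upsert desired entries into live, keyed by 'name'.
    Duplicate names in `desired` collapse into a single entry."""
    merged = {entry["name"]: entry for entry in live}
    for entry in desired:
        merged[entry["name"]] = entry  # later duplicates overwrite earlier ones
    return list(merged.values())

def replace_list(live, desired):
    """Replace semantics: the desired list is taken verbatim."""
    return list(desired)

live = [{"name": "ihave", "value": "apple"}]
desired = [
    {"name": "ihave", "value": "apple"},
    {"name": "ihave", "value": "pen"},
]

print(apply_merge_by_name(live, desired))  # one entry: duplicates collapsed
print(replace_list(live, desired))         # both entries survive
```

If `kubectl replace` preserved both entries where `kubectl apply` collapsed them, that would point at merge behavior rather than helm-controller itself.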
Can you confirm whether the YAML on the cluster has really been affected? One way to see is to use `kubectl edit` in between and verify the YAML looks how you expect; another way to mitigate this problem would be to see if it behaves the same when you do `kubectl replace` instead of `kubectl apply`.

Can you check on these avenues so we can verify this is a bug before we spend more time investigating further?
The HelmRelease was correct with both `kubectl` and kustomize-controller. I just retested the test case to be 100% sure: when removing the duplicate value, the `HelmRelease` shows the remaining env variable as expected, and the same behavior is seen when using the controller.

The Helm-rendered `Deployment` gets all env variable entries removed, and further reconciliation does not add the variable still present in the `HelmRelease` back to the deployment.

I hope that clarifies things.
Thanks @daniel-anova – I hope someone can follow this up before too long after KubeCon. I won't be able to look at it again before then myself.