helm-operator
Cannot unset default values
Describe the bug
In a chart, you can define default values for templates in the chart's values.yaml file. Sometimes it is desired to override them, but other times, you simply don't want the default to be used at all. To support this, if you set the value to null (or ~ for yaml) then helm will unset the default. https://github.com/helm/helm/pull/2648 is where that was implemented.
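As a minimal illustration of the mechanism (a hypothetical chart, not the Loki chart discussed below; the probe path and port are made up):

```yaml
# chart's values.yaml -- the default shipped by the chart author
livenessProbe:
  httpGet:
    path: /ready
    port: http-metrics
---
# user-supplied values -- `null` (or `~`) removes the default entirely,
# so a template guard like `{{- if .Values.livenessProbe }}` renders nothing
livenessProbe: null
```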
However, using helm-operator (with helm 2.16.1, not sure about v3), this does not work. If you set a value to null, the default value in the chart is still used. Using helm template and helm upgrade --install with the same set of values produces the correct behavior.
To Reproduce

Steps to reproduce the behavior:
I'm experiencing this with the Loki chart from Grafana Labs when trying to unset the liveness probe. See https://github.com/grafana/loki/blob/master/production/helm/loki/values.yaml#L77-L81 for the default value of livenessProbe (an httpGet probe) and https://github.com/grafana/loki/blob/master/production/helm/loki/templates/statefulset.yaml#L68-L69 for where it is used.
- Install loki:

```
helm2 upgrade --install --namespace logging loki loki/loki
```

- Check the liveness probe is set on the statefulset:

```
kubectl get statefulset loki -n logging -o json | jq '.spec.template.spec.containers[0].livenessProbe'
```

- Re-run helm upgrade with `livenessProbe` set to `null`:

```
helm2 upgrade --install --namespace logging loki loki/loki --set livenessProbe=null
```

- Verify the livenessProbe is removed:

```
kubectl get statefulset loki -n logging -o json | jq '.spec.template.spec.containers[0].livenessProbe'
```
This verifies the behavior of helm without helm-operator.
Next do the same process with helm-operator:
- Create the HelmRelease to install Loki:

```yaml
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: loki
  namespace: logging
spec:
  releaseName: loki
  chart:
    repository: https://grafana.github.io/loki/charts
    name: loki
    version: 0.22.0
```
- Verify the liveness probe is set:

```
kubectl get statefulset loki -n logging -o json | jq '.spec.template.spec.containers[0].livenessProbe'
```

- Update the HelmRelease to unset the livenessProbe:
```yaml
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: loki
  namespace: logging
spec:
  releaseName: loki
  chart:
    repository: https://grafana.github.io/loki/charts
    name: loki
    version: 0.22.0
  values:
    livenessProbe: null
```
- Notice that after helm-operator finishes reconciling, `livenessProbe` is still set on the statefulset:

```
kubectl get statefulset loki -n logging -o json | jq '.spec.template.spec.containers[0].livenessProbe'
```
Expected behavior
helm-operator should properly handle unsetting values when they're set to null.
Logs

These are all the logs related to loki over 10 minutes (after trying multiple times):
kubectl logs helm-operator-67f4f7d9-dcqc6 -c flux-helm-operator --namespace flux --since 10m
ts=2020-01-13T18:50:18.641083245Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:50:19.245609477Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:50:19.642975212Z caller=chartsync.go:640 component=chartsync info="release loki: values have diverged" resource=logging:helmrelease/loki diff=" &chart.Config{\n \tRaw: strings.Join({\n \t\t... // 64 identical lines\n \t\t\"extraArgs:\",\n \t\t\" consul.hostname: $(HOST_IP):8500\",\n- \t\t\"livenessProbe: null\",\n+ \t\t\"livenessProbe: {}\",\n \t\t\"persistence:\",\n \t\t\" enabled: true\",\n \t\t... // 4 identical lines\n \t}, \"\\n\"),\n \tValues: nil,\n \tXXX_NoUnkeyedLiteral: struct{}{},\n \t... // 2 identical fields\n }\n"
ts=2020-01-13T18:50:19.657141577Z caller=release.go:184 component=release info="processing release loki (as loki)" action=UPDATE options="{DryRun:false ReuseName:false}" timeout=300s
ts=2020-01-13T18:51:18.641278153Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:51:19.026561679Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:52:18.641023263Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:52:19.153667127Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:53:18.641549522Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:53:18.973288149Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:54:18.640975695Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:54:18.973275328Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:54:19.183328651Z caller=chartsync.go:640 component=chartsync info="release loki: values have diverged" resource=logging:helmrelease/loki diff=" &chart.Config{\n \tRaw: strings.Join({\n \t\t... // 64 identical lines\n \t\t\"extraArgs:\",\n \t\t\" consul.hostname: $(HOST_IP):8500\",\n- \t\t\"livenessProbe: null\",\n+ \t\t\"livenessProbe: {}\",\n \t\t\"persistence:\",\n \t\t\" enabled: true\",\n \t\t... // 4 identical lines\n \t}, \"\\n\"),\n \tValues: nil,\n \tXXX_NoUnkeyedLiteral: struct{}{},\n \t... // 2 identical fields\n }\n"
ts=2020-01-13T18:54:19.188866569Z caller=release.go:184 component=release info="processing release loki (as loki)" action=UPDATE options="{DryRun:false ReuseName:false}" timeout=300s
ts=2020-01-13T18:55:18.64139792Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:55:19.426737374Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:56:18.642208496Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:56:19.798843225Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:57:18.642694543Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:57:19.966942031Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:57:59.99398505Z caller=operator.go:309 component=operator info="enqueuing release" diff=" v1.HelmReleaseSpec{\n \t... // 2 identical fields\n \tValueFileSecrets: nil,\n \tValuesFrom: nil,\n \tHelmValues: v1.HelmValues{\n \t\tValues: chartutil.Values{\n \t\t\t\"config\": map[string]interface{}{\"auth_enabled\": bool(false), \"chunk_store_config\": map[string]interface{}{\"max_look_back_period\": string(\"0\")}, \"ingester\": map[string]interface{}{\"chunk_block_size\": int64(262144), \"chunk_idle_period\": string(\"15m\"), \"lifecycler\": map[string]interface{}{\"heartbeat_period\": string(\"5s\"), \"ring\": map[string]interface{}{\"heartbeart_timeout\": string(\"1m\"), \"kvstore\": map[string]interface{}{\"consul\": map[string]interface{}{\"consistentreads\": bool(true), \"httpclienttimeout\": string(\"20s\")}, \"prefix\": string(\"loki-collectors/\"), \"store\": string(\"consul\")}, \"replication_factor\": int64(2)}}}, \"ingester_client\": map[string]interface{}{\"grpc_client_config\": map[string]interface{}{\"max_recv_msg_size\": int64(67108864), \"max_send_msg_size\": int64(67108864)}}, \"limits_config\": map[string]interface{}{\"enforce_metric_name\": bool(false), \"reject_old_samples\": bool(true), \"reject_old_samples_max_age\": string(\"168h\")}, \"schema_config\": map[string]interface{}{\"configs\": []interface{}{map[string]interface{}{\"from\": string(\"2019-12-01\"), \"index\": map[string]interface{}{\"period\": string(\"168h\"), \"prefix\": string(\"loki_logs_infra_sandbox_\")}, \"object_store\": string(\"s3\"), \"schema\": string(\"v10\"), \"store\": string(\"aws\")}}}, \"server\": map[string]interface{}{\"graceful_shutdown_timeout\": string(\"30s\"), \"grpc_server_max_recv_msg_size\": int64(67108864), \"grpc_server_max_send_msg_size\": int64(67108864), \"http_listen_port\": int64(3100), \"http_server_idle_timeout\": string(\"30s\")}, \"storage_config\": map[string]interface{}{\"aws\": map[string]interface{}{\"dynamodbconfig\": 
map[string]interface{}{\"dynamodb\": string(\"dynamodb://us-west-2\")}, \"s3\": string(\"s3://us-west-2/rigetti-loki-infra-sandbox\")}}, \"table_manager\": map[string]interface{}{\"retention_deletes_enabled\": bool(true), \"retention_period\": string(\"8736h\")}},\n \t\t\t\"env\": []interface{}{map[string]interface{}{\"name\": string(\"AWS_ACCESS_KEY_ID\"), \"valueFrom\": map[string]interface{}{\"secretKeyRef\": map[string]interface{}{\"key\": string(\"AWS_ACCESS_KEY_ID\"), \"name\": string(\"loki-aws-credentials\")}}}, map[string]interface{}{\"name\": string(\"AWS_SECRET_ACCESS_KEY\"), \"valueFrom\": map[string]interface{}{\"secretKeyRef\": map[string]interface{}{\"key\": string(\"AWS_SECRET_ACCESS_KEY\"), \"name\": string(\"loki-aws-credentials\")}}}, map[string]interface{}{\"name\": string(\"HOST_IP\"), \"valueFrom\": map[string]interface{}{\"fieldRef\": map[string]interface{}{\"fieldPath\": string(\"status.hostIP\")}}}},\n \t\t\t\"extraArgs\": map[string]interface{}{\"consul.hostname\": string(\"$(HOST_IP):8500\")},\n- \t\t\t\"livenessProbe\": map[string]interface{}{},\n \t\t\t\"persistence\": map[string]interface{}{\"enabled\": bool(true)},\n \t\t\t\"replicas\": int64(3),\n \t\t\t\"serviceMonitor\": map[string]interface{}{\"enabled\": bool(true)},\n \t\t},\n \t},\n \tTargetNamespace: \"\",\n \tTimeout: nil,\n \t... // 3 identical fields\n }\n" resource=logging:helmrelease/loki
ts=2020-01-13T18:58:00.176394243Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:58:00.272287984Z caller=chartsync.go:640 component=chartsync info="release loki: values have diverged" resource=logging:helmrelease/loki diff=" &chart.Config{\n \tRaw: strings.Join({\n \t\t... // 64 identical lines\n \t\t\"extraArgs:\",\n \t\t\" consul.hostname: $(HOST_IP):8500\",\n- \t\t\"livenessProbe: {}\",\n \t\t\"persistence:\",\n \t\t\" enabled: true\",\n \t\t... // 4 identical lines\n \t}, \"\\n\"),\n \tValues: nil,\n \tXXX_NoUnkeyedLiteral: struct{}{},\n \t... // 2 identical fields\n }\n"
ts=2020-01-13T18:58:00.276730131Z caller=release.go:184 component=release info="processing release loki (as loki)" action=UPDATE options="{DryRun:false ReuseName:false}" timeout=300s
ts=2020-01-13T18:58:18.642683225Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:58:19.205106514Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
ts=2020-01-13T18:59:18.642003196Z caller=operator.go:309 component=operator info="enqueuing release" resource=logging:helmrelease/loki
ts=2020-01-13T18:59:19.085401766Z caller=release.go:184 component=release info="processing release loki (as c6658203-2386-11ea-894d-026e80628430)" action=CREATE options="{DryRun:true ReuseName:false}" timeout=300s
Additional context
- Helm operator version: 1.0.0-rc4
- Kubernetes version: 1.13
Okay, so I managed to actually get this to work somehow. I think a fresh install of the HelmRelease was what made it work. So it may be that this is only broken on upgrades, but not install. I'm still trying to debug this.
@chancez it looks like the same happens with Helm 3. I had a quick poke over lunch and found something interesting. Let's say you apply:
```yaml
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: loki
spec:
  releaseName: loki
  chart:
    repository: https://grafana.github.io/loki/charts
    name: loki
    version: 0.22.0
```
Then `kubectl get hr -o yaml loki`:
```yaml
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"helm.fluxcd.io/v1","kind":"HelmRelease","metadata":{"annotations":{},"name":"loki","namespace":"default"},"spec":{"chart":{"name":"loki","repository":"https://grafana.github.io/loki/charts","version":"0.22.0"},"releaseName":"loki"}}
  creationTimestamp: "2020-01-13T20:42:48Z"
  generation: 3
  name: loki
  namespace: default
  resourceVersion: "384577"
  selfLink: /apis/helm.fluxcd.io/v1/namespaces/default/helmreleases/loki
  uid: 4e97de1b-345f-489c-a178-344d5254445a
spec:
  chart:
    name: loki
    repository: https://grafana.github.io/loki/charts
    version: 0.22.0
  releaseName: loki
status:
  conditions:
  - lastTransitionTime: "2020-01-13T20:43:49Z"
    lastUpdateTime: "2020-01-13T20:50:20Z"
    message: Helm release sync succeeded
    reason: HelmSuccess
    status: "True"
    type: Released
  observedGeneration: 3
  releaseName: loki
  releaseStatus: pending-upgrade
  revision: 0.22.0
  valuesChecksum: ca3d163bab055381827226140568f3bef7eaac187cebd76878e0b63e9e442356
```
Then apply the following:
```yaml
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: loki
spec:
  releaseName: loki
  chart:
    repository: https://grafana.github.io/loki/charts
    name: loki
    version: 0.22.0
  values:
    livenessProbe: null
```
And check the HelmRelease again: still no values. So it looks to be something with the CRD; if you add the null value on the first apply, it appears. I'm guessing it has something to do with how the merge is done on apply, but I'm not 100% sure.
OK, I had to read the docs on merging during apply again, but I think it is the normal behaviour. From the docs on clearing fields:
https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/#how-apply-calculates-differences-and-merges-changes
> Calculate the fields to delete by reading values from last-applied-configuration and comparing them to values in the configuration file. Clear fields explicitly set to null in the local object configuration file regardless of whether they appear in the last-applied-configuration. In this example, minReadySeconds appears in the last-applied-configuration annotation, but does not appear in the configuration file. Action: Clear minReadySeconds from the live configuration.
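To make the quoted rule concrete, here is a toy Python sketch of just the null-clearing step. This is not kubectl's actual merge code (which performs a full three-way strategic merge with the last-applied annotation); the function and data are illustrative only:

```python
def apply_merge(live: dict, local: dict) -> dict:
    """Merge `local` onto `live`, deleting keys whose local value is None.

    Mimics the one rule that matters here: a field explicitly set to
    null in the local configuration is cleared, not stored.
    """
    result = dict(live)
    for key, value in local.items():
        if value is None:
            # `null` in the local file means "clear this field", so it
            # never survives into the stored object -- which is why
            # `livenessProbe: null` never reaches the HelmRelease spec.
            result.pop(key, None)
        elif isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = apply_merge(result[key], value)
        else:
            result[key] = value
    return result


live = {"values": {"livenessProbe": {"httpGet": {"path": "/ready"}}}}
local = {"values": {"livenessProbe": None}}
print(apply_merge(live, local))  # {'values': {}}
```

Under this rule the stored HelmRelease ends up with no `livenessProbe` key at all, so the operator has nothing to pass through to helm.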
Were you able to check the statefulset? I was finding that setting `livenessProbe: null` works fine, but the statefulset's livenessProbe isn't changing. Also, the logs I provided show that helm-operator is even seeing the `livenessProbe: null`, but it's not propagating to the actual statefulset after the operator runs helm upgrade.
Or are you suggesting the livenessProbe: null isn't making it into the HelmRelease because apply is omitting it?
Correct @chancez, I was seeing the latter: it never even made it in there during an update, and I suspect it is due to the merge on apply, but hopefully someone with more knowledge here can clear things up.
Ahhh. I'll test this out. That could definitely be it.
@chancez from more digging, it appears you can either use kubectl replace, which will not use this merging behavior, or change the chart in question to contain a `livenessProbe.enabled` field so things do not rely on the nulling-out behavior.
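For the chart-side workaround, the gate could look something like this (a hypothetical sketch, not the actual Loki chart; `omit` and `nindent` are standard Helm template functions):

```yaml
# values.yaml (hypothetical defaults)
livenessProbe:
  enabled: true
  httpGet:
    path: /ready
    port: http-metrics
```

```yaml
# templates/statefulset.yaml (fragment): the probe block is only
# rendered when enabled, so users can turn it off with a boolean
# instead of relying on null
{{- if .Values.livenessProbe.enabled }}
livenessProbe:
  {{- omit .Values.livenessProbe "enabled" | toYaml | nindent 2 }}
{{- end }}
```

A user would then set `livenessProbe.enabled: false` in the HelmRelease values, which survives `kubectl apply` because the value is never null.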
@chancez has this been resolved by Stefan's answers?
I haven't had a chance to verify the behaviors, but based on the discussion everything Stefan has said makes sense to me.
Closing this then, feel free to re-open if it does not turn out to be true.
I have a question. When I run the helm command with these set values:

```
helm install loki/loki-stack -n loki --set loki.serviceName=loki,fluent-bit.enabled=true,promtail.enabled=true,loki.persistence.enabled=true,loki.persistence.size=40Gi,config.table_manager.retention_deletes_enabled=true,config.table_manager.retention_period=720h --namespace=monitoring
```

and then exec into the pod and look at /etc/loki/loki.yaml, the retention values are unchanged from the defaults.
@amribrahim what is your precise question and how is it related to the Helm operator?
It's related to setting values with the helm command: I need to override values for loki using helm, but after it's deployed the values did not change, so this is related to the topic.
This issue is not for helm but for the Helm operator, an operator to perform releases using a custom resource instead of manually using the helm binary.
I'd advise you to look for help in one of the available Slack channels for the problem you are having with Helm: https://github.com/helm/helm#community-discussion-contribution-and-support
Hi all, sorry if I'm mistaken in replying here, but it doesn't look like there was a way to unset default values through updating the HelmRelease resource. Is this intended?
Same problem here.
We are in a catch-22: helm's method for unsetting default values is setting them to null, but since this has to first be set in the HelmRelease CRD, the first apply command will clear these entries before the resource is stored in the k8s API server, so they never even get to the helm-operator itself.
yes, I think it should be reopened
I am not sure if there is a breaking change or something else that would justify reopening this issue on Helm Operator, but since the issues in the queue have been somewhat neglected, I am looking into doing similar queue hygiene on this as what has been done before on the fluxcd/flux legacy repository.
To be clear, Helm Operator is fully deprecated. Per the Flux 1 / Helm Operator code freeze, in effect since June 2021, no further updates except CVE fixes will be made, so there is little chance this issue gets addressed in Helm Operator unless a fix is provided.
https://fluxcd.io/docs/migration/timetable/
If you've moved on to Helm Controller, in Flux v2, and your issue is not with Helm Operator, apologies for adding to the noise. This is not the Flux v2 project, and if you are genuinely still using Helm Operator, we strongly advise moving on to Flux v2 with the migration guide: https://fluxcd.io/docs/migration/helm-operator-migration/
Sorry if your issue remains unresolved. The Helm Operator is in maintenance mode, we recommend everybody upgrades to Flux v2 and Helm Controller.
A new release of Helm Operator is out this week, 1.4.4.
We will continue to support Helm Operator in maintenance mode for an indefinite period of time, and eventually archive this repository.
Please be aware that Flux v2 has a vibrant and active developer community who are actively working through minor releases and delivering new features on the way to General Availability for Flux v2.
In the meantime, this repo will still be monitored, but support is basically limited to migration issues only. I will have to close many issues today without reading them all in detail because of time constraints. If your issue is very important, you are welcome to reopen it, but given the staleness of all issues at this point, a new report is more likely to be in order. Please open another issue in the appropriate Flux v2 repo if you have unresolved problems that prevent your migration.
Helm Operator releases will continue as possible for a limited time, as a courtesy for those who still cannot migrate yet, but these are strongly not recommended for ongoing production use as our strict adherence to semver backward compatibility guarantees limit many dependencies and we can only upgrade them so far without breaking compatibility. So there are likely known CVEs that cannot be resolved.
We recommend upgrading to Flux v2 which is actively maintained ASAP.
I am going to go ahead and close every issue at once today. Thanks for participating in Helm Operator and Flux! 💚 💙