helm-controller
Error notifications despite the resource being successfully reconciled
This issue was first opened at https://github.com/fluxcd/flux/issues/3480
Describe the bug
Flux sends out error-level notifications despite the resource being successfully reconciled. These are the Discord notifications I received, in this order:
[info] helmrelease/jellyfin.media
Helm upgrade has started
revision
7.3.2
[info] helmrelease/jellyfin.media
Helm upgrade succeeded
revision
7.3.2
[error] helmrelease/jellyfin.media
reconciliation failed: Operation cannot be fulfilled on helmreleases.helm.toolkit.fluxcd.io "jellyfin": the object has been modified; please apply your changes to the latest version and try again
revision
7.3.2
And when I checked later:
$ flux get helmrelease -n media jellyfin
NAME READY MESSAGE REVISION SUSPENDED
jellyfin True Release reconciliation succeeded 7.3.2 False
All of this happened within a two-minute window between the start of the reconciliation and the error notification.
To Reproduce
Hard to tell. No manual intervention was made besides updating the Docker image in the chart values in the GitOps repository; all those resources are managed by Flux. The last time jellyfin was reconciled it worked fine. A week ago a grafana reconciliation hit the same error, but not afterwards, so it does not seem to be related to one Helm chart in particular. My guess is that there is a conflict because Flux tries to run two reconciliations of the same resource at the same time.
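To make the guess concrete, here is a toy simulation of the kind of conflict I suspect. It is not helm-controller code: `Store`, `Conflict` and `update_with_retry` are hypothetical names, and the in-memory store only mimics a version check on write. It assumes two reconcilers read the same version of the object and both try to write it back.

```python
import copy


class Conflict(Exception):
    """Raised when a write carries a stale resource version."""


class Store:
    """Toy stand-in for a server that checks the version on every write."""

    def __init__(self, obj):
        self._obj = dict(obj, resourceVersion=1)

    def get(self):
        return copy.deepcopy(self._obj)

    def update(self, obj):
        if obj["resourceVersion"] != self._obj["resourceVersion"]:
            raise Conflict(
                f'Operation cannot be fulfilled on "{obj["name"]}": '
                "the object has been modified"
            )
        # Accept the write and bump the version.
        self._obj = dict(copy.deepcopy(obj), resourceVersion=obj["resourceVersion"] + 1)


def update_with_retry(store, mutate, attempts=3):
    """Re-read the latest object and reapply the change after a conflict."""
    for _ in range(attempts):
        obj = store.get()
        mutate(obj)
        try:
            store.update(obj)
            return obj
        except Conflict:
            continue
    raise Conflict("gave up after repeated conflicts")


store = Store({"name": "jellyfin", "status": "Unknown"})

# Two reconcilers read the same version of the object.
first = store.get()
second = store.get()

first["status"] = "Helm upgrade succeeded"
store.update(first)  # accepted, version is bumped

second["status"] = "Release reconciliation succeeded"
try:
    store.update(second)  # stale version, rejected
    outcome = "no conflict"
except Conflict as err:
    outcome = str(err)

# Retrying against the latest version goes through.
update_with_retry(store, lambda o: o.update(status="Release reconciliation succeeded"))
final = store.get()
```

If something like this is what happens, the release itself is fine and only the second, stale write is rejected, which would match the final `flux get helmrelease` output above.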
Expected behavior
Error notifications should be sent only when reconciliation actually fails, maybe only if the failure persists for a longer period of time. At the very least, make this a warning on first occurrence. I am not sure what should be done, but throwing an error seems wrong.
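As a rough sketch of the behavior I have in mind (not an existing Flux feature as far as I know): treat the first "object has been modified" failure for a resource as a warning, and escalate to an error only if it repeats within some window. `SeverityPolicy`, the 300-second window and the second sample message are all made up for illustration.

```python
import re
import time

# Assumption: this message fragment identifies the transient conflict.
CONFLICT = re.compile(r"the object has been modified")


class SeverityPolicy:
    """Hypothetical policy: warn on a first conflict, error if it repeats."""

    def __init__(self, window_seconds=300.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self._last_conflict = {}  # resource name -> time of last conflict

    def classify(self, resource, message):
        # Anything that is not a conflict stays an error.
        if not CONFLICT.search(message):
            return "error"
        now = self.clock()
        last = self._last_conflict.get(resource)
        self._last_conflict[resource] = now
        # A conflict seen again inside the window is escalated.
        if last is not None and now - last <= self.window:
            return "error"
        return "warning"


# Usage with a fake clock so the example is deterministic.
fake_now = [0.0]
policy = SeverityPolicy(window_seconds=300.0, clock=lambda: fake_now[0])
resource = "helmrelease/jellyfin.media"
conflict_msg = (
    "reconciliation failed: Operation cannot be fulfilled on "
    'helmreleases.helm.toolkit.fluxcd.io "jellyfin": the object has been '
    "modified; please apply your changes to the latest version and try again"
)

first = policy.classify(resource, conflict_msg)
fake_now[0] = 60.0
repeat = policy.classify(resource, conflict_msg)
fake_now[0] = 1000.0
later = policy.classify(resource, conflict_msg)
# Made-up non-conflict message, only to show the default path.
other = policy.classify(resource, "reconciliation failed: some other problem")
```

With this policy the notification from my report would have arrived as a warning, while a conflict that keeps coming back within five minutes would still be reported as an error.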
Please post here the output of flux check
Sure
► checking prerequisites
✔ kubectl 1.21.0 >=1.18.0-0
✔ Kubernetes 1.20.6+k3s1 >=1.16.0-0
► checking controllers
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.12.0
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.10.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.13.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.12.1
✔ all checks passed
Any update on this? The same happens here. In chronological order:
prerequisites:
- Developer pushes a piece of code
- CI system tests/builds and pushes a new tagged image to the ECR
- According to the configured ImageRepository and ImagePolicy a new tag is detected
- Image Update Automation controller commits a new tag to the Git repo
- Source Controller updates repo ..somewhere inside Flux Pod
- 3.2. and 3.1. happened at the same moment, thus probably the Kustomization controller was the first to update a tag value in HelmRelease and sent the event to the Slack channel, then Helm Controller started the upgrade process
- A new image was successfully deployed.
- Failed ...why? what went wrong?
kubectl get -o yaml helmrelease ...
I'd appreciate any help or ideas on how to debug this issue.
I suspect it might be caused by two reconciliations happening at the same time: one set by the HelmRelease's interval, the second triggered by a source update. I've seen the same issue, but only from time to time. Usually for me a bunch of releases are updated properly and one or two show first success, then the "object has been modified" error. I haven't seen such errors from Kustomization though, so maybe helm-controller treats an already running reconciliation somewhat differently than kustomize-controller?
Something similar is described in https://github.com/fluxcd/flux2/issues/1882. Could they be connected?
It seems to still be happening. At least, it happened in 1/2 of our clusters. The upgrade went through fine, as described in this issue:
helmrelease/sde.sde
Helm upgrade has started
revision
2022.2.0-external2
helmrelease/sde.sde
Helm upgrade succeeded
revision
2022.2.0-external2
helmrelease/sde.sde
reconciliation failed: Operation cannot be fulfilled on helmreleases.helm.toolkit.fluxcd.io "sde": the object has been modified; please apply your changes to the latest version and try again
revision
2022.2.0-external2
helmrelease/sde.sde
reconciliation failed: Operation cannot be fulfilled on helmreleases.helm.toolkit.fluxcd.io "sde": the object has been modified; please apply your changes to the latest version and try again
revision
2022.2.0-external2
kustomization/helmcharts.flux-system
Health check passed in 20.121041632s
revision
main/7701d7768535a34ca4b53df88d822f65beecb4ed