helm-controller icon indicating copy to clipboard operation
helm-controller copied to clipboard

Helm upgrade failed suddenly. Start showing it has not deployed release and hr reconcile failed

Open abhi93104 opened this issue 3 years ago • 4 comments

When doing reconciliation of helm release, sometimes it starts showing "Helm upgrade failed: project has no deployed releases." It's happening once in 2-3 days.

Q: Is helm controller restarted? A: No

kubectl -n flux-system get pods helm-controller-6b456768d5-mcwcc
NAME                               READY   STATUS    RESTARTS   AGE
helm-controller-6b456768d5-mcwcc   1/1     Running   0          2d5h

project = fe-stack Error:

flux get hr fe-stack -n qa-team
NAME    	READY	MESSAGE                                                 	REVISION	SUSPENDED 
fe-stack	False	Helm upgrade failed: "fe-stack" has no deployed releases	        	False   
kubectl describe helmreleases.helm.toolkit.fluxcd.io fe-stack -n qa-team
Last Helm logs:

preparing upgrade for fe-stack
resetting values to the chart's original version
performing update for fe-stack
creating upgraded release for fe-stack
    Reason:                        UpgradeFailed
    Status:                        False
    Type:                          Released
  Failures:                        1
  Helm Chart:                      qa-team/qa-team-fe-stack
  Last Attempted Revision:         v5.0.27
  Last Attempted Values Checksum:  6a7c1ebdbb475c995c093799e1f3824926229e71
  Last Handled Reconcile At:       2021-05-07T17:54:54.795085788+05:30
  Observed Generation:             22
  Upgrade Failures:                1
Events:
  Type    Reason  Age                    From             Message
  ----    ------  ----                   ----             -------
  Normal  error   19m (x19 over 120m)    helm-controller  reconciliation failed: Helm upgrade failed: "fe-stack" has no deployed releases
  Normal  info    9m53s (x30 over 29h)   helm-controller  Helm upgrade has started
  Normal  error   9m47s (x21 over 120m)  helm-controller  Helm upgrade failed: "fe-stack" has no deployed releases

Logs with debug level: hr-controller.log

abhi93104 avatar May 07 '21 13:05 abhi93104

Hi @stefanprodan this is a production blocker for us to go live with fluxV2. I can also contribute if that helps.

Thank You

abhi93104 avatar May 12 '21 17:05 abhi93104

I suspect this to be related to / another version of #149, with a rich history of helm users themselves running into it:

https://github.com/helm/helm/issues/5595 https://github.com/helm/helm/issues/7160

Judging on the shared logs, there seems to be a correlation with the --wait behavior from Helm, that seems to corrupt the storage in some edge cases.

hiddeco avatar May 12 '21 17:05 hiddeco

It's happening a lot of time during development. Is there are any workaround without deleting the helmrelease?

abhi93104 avatar Jun 02 '21 13:06 abhi93104

Its a manual activity but this helped the stuck release: https://github.com/helm/helm/issues/5595#issuecomment-717024123

abhi93104 avatar Jun 02 '21 13:06 abhi93104