
flux reconcile hr: double upgrade/redeployment

Open patrikbeno opened this issue 2 years ago • 8 comments

Describe the bug

flux reconcile hr --with-source causes 2 helm upgrades.

Proof is in the helm history.

For simple deployments, this may be just an annoyance because the 2nd upgrade will be a no-op. However, if you have some post-upgrade hooks configured, they will be executed twice, once for every upgrade. This may or may not be an issue but it is certainly undesired.
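One way to make the "proof in the helm history" concrete is to compare the latest revision number before and after a single reconcile. The following is a minimal sketch, not part of the original report: the helper names `latest_revision` and `upgrades_between` are hypothetical, the release name and namespace are borrowed from the workaround in this issue, and it assumes `helm history -o json` emits objects with a numeric `revision` field.

```shell
#!/bin/sh
# Sketch only: count how many Helm revisions one reconcile produced.
# Assumed names: qdh/default come from this issue; the helpers are hypothetical.
name=qdh
namespace=default

# Reads `helm history ... -o json` output on stdin and prints the
# highest revision number found in it.
latest_revision() {
  grep -o '"revision": *[0-9][0-9]*' | sed 's/.*: *//' | sort -n | tail -n 1
}

# Prints the number of new revisions between two revision numbers.
upgrades_between() {
  echo $(( $2 - $1 ))
}

# Usage against a real cluster (commented out: needs helm, flux and a cluster):
# before=$(helm history "$name" -n "$namespace" -o json | latest_revision)
# flux reconcile hr "$name" -n "$namespace" --with-source
# after=$(helm history "$name" -n "$namespace" -o json | latest_revision)
# upgrades_between "$before" "$after"   # 1 is expected; 2 would match this report
```

If the second upgrade is still in progress when the reconcile command returns, the count may lag, so waiting briefly before taking the second reading is advisable.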

Steps to reproduce

helm create qdh
cd qdh
git commit
git push

...

flux create source git --interval=1h  ... 
flux create helmrelease --interval=1h  ...

...

update chart version, commit, push

flux reconcile hr --with-source ...

helm history qdh

Expected behavior

single helm upgrade, of course

Screenshots and recordings

No response

OS / Distro

Windows 10 + Alpine Linux in K8s

Flux version

0.17.2

Flux check

$ flux check
► checking prerequisites
✔ kubectl 1.20.2 >=1.18.0-0
✔ Kubernetes 1.21.5 >=1.16.0-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.11.2
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.14.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.16.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.15.4
✔ all checks passed

Git provider

No response

Container Registry provider

No response

Additional context

Workaround:

name=qdh
namespace=default

# reconcile git source instead of HelmRelease
flux reconcile source git gitops -n $namespace

# this causes HelmChart update
kubectl wait --for=condition=ready HelmChart $namespace-$name -n $namespace

# this causes HelmRelease reconcile
kubectl wait --for=condition=ready HelmRelease $name -n $namespace --timeout=5m

# now it's done & ok
flux get hr $name -n $namespace
helm history $name -n $namespace --max=2

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

patrikbeno avatar Oct 01 '21 11:10 patrikbeno

This happens because the source is annotated first (which triggers the first upgrade, as the helm-controller watches it) and then the HelmRelease is annotated as well (second upgrade). Not sure if annotating just the source object might be sufficient to fix this @hiddeco
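The sequence described here can be sketched as two manual annotation writes. This is an illustration under assumptions, not the CLI's actual code: it assumes the reconcile request is the `reconcile.fluxcd.io/requestedAt` annotation, reuses the resource names from the workaround in this issue, and the `run` and `request_reconcile` helpers are hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: the two annotation writes described in the comment above.
# Assumed names: gitops, qdh and default come from the workaround in this
# issue; run() and request_reconcile() are hypothetical helpers.
name=qdh
namespace=default
source_name=gitops

# Prints the command when DRY_RUN is 1 (the default), otherwise executes it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Requests a reconcile of kind/name ($1) by updating its requestedAt annotation.
request_reconcile() {
  run kubectl annotate --overwrite "$1" -n "$namespace" \
    "reconcile.fluxcd.io/requestedAt=$(date +%s)"
}

# First write: the source. Per the comment above, the helm-controller
# watches the source, so this alone can lead to an upgrade.
request_reconcile "gitrepository/$source_name"

# Second write: the HelmRelease itself, which requests another reconcile.
request_reconcile "helmrelease/$name"
```

Running only the first call corresponds to the idea of annotating just the source object; whether that is sufficient is the open question in this comment.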

somtochiama avatar Oct 01 '21 11:10 somtochiama

If the GitRepository has interval=1m and HelmRelease has interval=1h, then chart update in GitRepository triggers HelmRelease reconciliation after 1m even if HelmRelease should be in quiet mode for next hour.

Feature/Bug? Dunno. Just unexpected.

Seems GitRepository updates should not trigger HelmRelease reconciliation as side effect.

HelmChart is updated from GitRepository but since HelmChart belongs to HelmRelease, its updates should not be driven by GitRepository events, but HelmRelease events instead.

patrikbeno avatar Oct 01 '21 12:10 patrikbeno

then chart update in GitRepository triggers HelmRelease reconciliation

Yes, all Flux reconcilers are triggered by source changes; that's by design.

stefanprodan avatar Oct 01 '21 12:10 stefanprodan

Yes, all Flux reconcilers are triggered by source changes; that's by design.

I get that, to a point.

HelmRelease has a HelmChart source which has a GitRepository source

interval parameter or explicit reconcile commands indicate that it's a pull model.

What you say indicates push model. If so, interval on a HelmRelease does not make much sense, since it's driven by its source anyway.

patrikbeno avatar Oct 01 '21 12:10 patrikbeno

If so, interval on a HelmRelease does not make much sense, since it's driven by its source anyway.

The interval is there because Kubernetes events are not guaranteed: due to API server congestion and other network-related issues, it is possible that the controller doesn't receive the update event. Whether or not the event is received, Flux uses the interval to reconcile the cluster to the latest revision. Another function of the interval is that Flux can correct in-cluster drift, such as a manual helm upgrade, which is rolled back at the specified interval.

stefanprodan avatar Oct 01 '21 13:10 stefanprodan

Shouldn't the second (interval-triggered) reconciliation of the HelmRelease leave everything unchanged, since the one started by the source update already made all necessary changes?

And might there be some issue with how helm-controller treats simultaneously running reconciliations, as in the issues linked above?

tbondarchuk avatar Oct 08 '21 14:10 tbondarchuk

A resource is never worked on by two processes at the same time as the queue won't allow this. What you are observing in that issue is likely a race condition that occurs between an apply and a condition related update.

These errors will no longer happen in an upcoming (to be announced) release, as we will be introducing a more intelligent patcher in the near future: https://pkg.go.dev/github.com/fluxcd/pkg/[email protected]/patch

hiddeco avatar Oct 08 '21 14:10 hiddeco

@hiddeco

These errors will no longer happen in an upcoming (to be announced) release, as we will be introducing a more intelligent patcher in the near future: https://pkg.go.dev/github.com/fluxcd/pkg/[email protected]/patch

Any updates on this one?

philipsabri avatar Sep 12 '22 12:09 philipsabri

Fixed in https://github.com/fluxcd/helm-controller/pull/738

stefanprodan avatar Nov 24 '23 08:11 stefanprodan