source-controller
S3 bucket reconciliation stuck after upgrading to latest source-controller
Hello,
After upgrading Flux to the latest version, with ghcr.io/fluxcd/source-controller:v0.20.1, S3 bucket reconciliation is stuck with this error:
source-controller Head "https://cprc-att-gitops.s3.dualstack.us-east-1.amazonaws.com/": context deadline exceeded
After downgrading source-controller to 0.18.0, it works again:
source-controller Fetched revision: 7671db410d603f4b16aec56ebc1369893c42e4fb
Hello, sorry you're having trouble and thanks for making a report
I'm currently on source-controller v0.21.1 and can't downgrade at the moment, but I've just tested with my minio https bucket as a source and I didn't have any difficulty with Bucket sources there.
Can you provide your Bucket configuration so we have more details to add to the report? Please omit or redact any secrets, of course, so that you do not risk any compromise.
It is possible there is something wrong in v0.20.1 and it might be fixed in v0.21.1, but there isn't yet a Flux release that includes this later source-controller release, so if that's the case you can either try upgrading on your own, or sit tight until Flux v0.26.0 (which should be released before too long).
I'm not sure what has changed from 0.18 to 0.20.1, but if you're able to narrow down the specific source-controller release within that range that triggers the issue for you, it will make the job of comparing releases to understand what went wrong that much easier! Please let us know.
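One way to narrow the range, sketched here under some assumptions, is to pin the controller image to one release at a time and then trigger a Bucket reconcile. The Deployment name `source-controller`, the container name `manager`, and the `flux-system` namespace are assumptions about a default install and are not confirmed in this thread; `try_version` and `DRY_RUN` are hypothetical names used only for illustration. If Flux manages its own manifests, the `flux-system` Kustomization would probably need to be suspended first (`flux suspend kustomization flux-system`) so the image change is not reverted.

```shell
# Sketch: switch source-controller to one release, then reconcile a Bucket.
# Assumed names (not confirmed in this thread): Deployment "source-controller",
# container "manager", namespace "flux-system".
try_version() {
  version="$1"   # image tag to test, e.g. v0.19.0
  bucket="$2"    # Bucket object name, e.g. my-gitops
  image="ghcr.io/fluxcd/source-controller:${version}"
  run=""
  # DRY_RUN=1 prints the commands instead of executing them.
  [ "${DRY_RUN:-0}" = "1" ] && run="echo"
  $run kubectl -n flux-system set image deployment/source-controller "manager=${image}"
  $run kubectl -n flux-system rollout status deployment/source-controller
  $run flux reconcile source bucket "${bucket}"
}
```

Running `DRY_RUN=1 try_version v0.19.0 my-gitops` only prints the three commands, which is a cheap way to check them before touching the cluster. Repeating the call for each tag between the last good and first bad release should show where the behavior changes.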
Hello @kingdonb
The last working version is 0.18.0; after upgrading to 0.19.0 I got:
Head "https://my-gitops.s3.dualstack.us-east-1.amazonaws.com/": context deadline exceeded
After updating to the latest Flux version (0.26.0), with fluxcd/source-controller:v0.21.1, the problem is still present.
It's a standard AWS S3 bucket:
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: Bucket
metadata:
  annotations:
    kustomize.toolkit.fluxcd.io/prune: disabled
  labels:
    kustomize.toolkit.fluxcd.io/name: clusters
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: my-gitops
  namespace: flux-system
spec:
  bucketName: my-gitops
  endpoint: s3.amazonaws.com
  interval: 5m
  provider: aws
  region: us-east-1
  timeout: 30s
This does not directly explain why the behavior would be slower in newer controller versions, but have you tried increasing the .spec.timeout to a higher value?
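For a quick experiment, the timeout can be raised in place with a JSON merge patch, as in the rough sketch below. `timeout_patch` and `set_bucket_timeout` are hypothetical helper names, and the `flux-system` namespace is taken from the Bucket manifest posted above. Because that manifest carries `kustomize.toolkit.fluxcd.io` labels, a manual patch will likely be overwritten on the next Kustomization reconcile, so a lasting change belongs in the manifest source.

```shell
# Sketch: raise .spec.timeout on a Bucket with a JSON merge patch.
# Builds the patch body for a given duration string such as "4m".
timeout_patch() {
  printf '{"spec":{"timeout":"%s"}}' "$1"
}

# Applies the patch to the named Bucket in the flux-system namespace.
set_bucket_timeout() {
  bucket="$1"    # Bucket object name, e.g. my-gitops
  timeout="$2"   # new timeout, e.g. 4m
  kubectl -n flux-system patch buckets.source.toolkit.fluxcd.io "${bucket}" \
    --type merge -p "$(timeout_patch "${timeout}")"
}
```

For example, `set_bucket_timeout my-gitops 4m` would send the patch `{"spec":{"timeout":"4m"}}`.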
@hiddeco With .spec.timeout = 4m, reconciliation on source-controller v0.21.1 takes more than 2 minutes :'(
❯ time flux reconcile source bucket my-gitops
► annotating Bucket my-gitops in flux-system namespace
✔ Bucket annotated
◎ waiting for Bucket reconciliation
✔ fetched revision 838e3edd32706abfddd363e7dcc30ac5453bc1fd
flux reconcile source bucket my-gitops 0.71s user 0.10s system 0% cpu 2:24.28 total
Are you certain nothing has changed in your bucket?
I see you mentioned:
After downgrading source-controller to 0.18.0, it works again
So I'll assume you're correct about this, and it is a change in our build that is at fault. Since source-controller 0.18, minio-go has been upgraded from v7.0.10 to v7.0.15, so it's possible there is a regression in there. In the changes we noticed that a compression library has been added. That could be responsible.
We are also using a new Go version.
We've just scanned the changelogs and nothing is jumping out. We're talking about ways to track this issue down in the Bug Scrub now. It would be good if we had some benchmarks in our test suite so that we can catch regressions like this in the future (and so we can tell whether we've solved your issue without bouncing back and forth, sending you a test image, and asking you to try again...).
We may send you an image with some items swapped out for earlier versions, to see which one restores the performance. @uderik - stay tuned for more.
cc: @pjbgf