Flux Source Controller Fails to List Remotes

Open devopstagon opened this issue 2 years ago • 4 comments

Describe the bug

Source controller intermittently fails to list revisions from the remote (GitLab in this case), which produces these errors:

{"level":"error","ts":"2023-06-20T12:09:39.735Z","msg":"failed to checkout and determine revision: unable to list remote for 'https://gitlab/sre/gitops/sre-flux': stream error: stream ID 3; INTERNAL_ERROR; received from peer","controller":"gitrepository","controllerGroup":"source.toolkit.fluxcd.io","controllerKind":"GitRepository","GitRepository":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"e258ec4f-35e2-48e5-9af2-f7715f7c4cb4","error":"failed to checkout and determine revision: unable to list remote for 'https://gitlab/sre/gitops/sre-flux': stream error: stream ID 3; INTERNAL_ERROR; received from peer"}
{"level":"error","ts":"2023-06-20T12:09:39.766Z","msg":"Reconciler error","controller":"gitrepository","controllerGroup":"source.toolkit.fluxcd.io","controllerKind":"GitRepository","GitRepository":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"e258ec4f-35e2-48e5-9af2-f7715f7c4cb4","error":"failed to checkout and determine revision: unable to list remote for 'https://gitlab/sre/gitops/sre-flux': stream error: stream ID 3; INTERNAL_ERROR; received from peer"}

The endpoint it calls was up, and we saw no connection issues during this period. We suspect a bug in net/http, based on this ticket: https://github.com/golang/go/issues/51323

Steps to reproduce

  1. Add a source.
  2. Check the logs and observe the intermittent failures.

Expected behavior

Source controller should work around the bug, for example by retrying the request, instead of failing the reconciliation.

Screenshots and recordings

No response

OS / Distro

Kubernetes 1.24.x

Flux version

v0.38.3

Flux check

► checking prerequisites
✗ flux 0.38.3 <2.0.0-rc.5 (new version is available, please upgrade)
✔ Kubernetes 1.24.12-gke.500 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.34.1
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.34.1
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.28.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.0.0-rc.4
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.0.0-rc.4
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.0.0-rc.5
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta2
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta2
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta2
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta1
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta2
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

GitLab

Container Registry provider

Harbor

Additional context

No response

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

devopstagon avatar Jun 20 '23 12:06 devopstagon

According to this comment, the internal error message you're seeing is coming from the server, so it is most likely to be an upstream issue.

makkes avatar Jun 20 '23 14:06 makkes

@devopstagon Did you manage to solve this issue? I have started seeing this error on my cluster, coming from source-controller. I'm unsure why it's having a problem.

savisaar2 avatar Feb 14 '24 07:02 savisaar2

We've been experiencing the same for a couple of days, with GitHub as the source.

tomaaron avatar Sep 17 '24 11:09 tomaaron

Flux retries when the connection fails; there's not much we can do if GitHub has connectivity issues. See https://www.githubstatus.com/incidents/r3x7x31k7nn1

stefanprodan avatar Sep 17 '24 11:09 stefanprodan