TLS error switching source controller from 1.4.0 to 1.5.0

Open ericjohansson89 opened this issue 9 months ago • 4 comments

Hello,

I have the same setup running in multiple clusters: a HelmRepository pointing to a private registry.

After updating source-controller from 1.4.0 to 1.5.0, the HelmRepository in one of our clusters never became ready. After switching back to 1.4.0, it became ready and could be used again.

In debug mode, the log showed the following:

{
  "level": "debug",
  "ts": "2025-03-21T12:36:18.846Z",
  "logger": "events",
  "msg": "failed to fetch Helm repository index: failed to cache index to temporary file: Get \"https://<redacted>/index.yaml\": net/http: TLS handshake timeout",
  "type": "Warning",
  "object": {
    "kind": "HelmRepository",
    "namespace": "k6-system",
    "name": "nexus",
    "uid": "<redacted>",
    "apiVersion": "source.toolkit.fluxcd.io/v1",
    "resourceVersion": "<redacted>"
  },
  "reason": "Failed"
}
{
  "level": "error",
  "ts": "2025-03-21T12:36:18.861Z",
  "msg": "Reconciler error",
  "controller": "helmrepository",
  "controllerGroup": "source.toolkit.fluxcd.io",
  "controllerKind": "HelmRepository",
  "HelmRepository": {
    "name": "nexus",
    "namespace": "k6-system"
  },
  "namespace": "k6-system",
  "name": "nexus",
  "reconcileID": "<redacted>",
  "error": "failed to fetch Helm repository index: failed to cache index to temporary file: Get \"https://<redacted>/index.yaml\": net/http: TLS handshake timeout",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:341\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:288\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:249"
}

As mentioned, it works in other clusters, and it works again when switching back to 1.4.0. Please let me know if you need anything more.

ericjohansson89 avatar Mar 21 '25 12:03 ericjohansson89

Hello,

We are facing the same issue on multiple clusters and have downgraded Flux to the previous version. In our case, the affected repository (hosted on Harbor) is behind a reverse proxy (nginx).

ftqualifio avatar Apr 17 '25 06:04 ftqualifio

Since multiple users are reporting the same TLS handshake timeout after upgrading to source-controller 1.5.0 (while it works on 1.4.0), this appears to be a regression or change at the Flux/source-controller code level, rather than an environment- or configuration-specific issue. It likely relates to how the new version handles secure (TLS) connections. Could the maintainers look into any changes in TLS or network handling introduced in 1.5.0?

itzPranshul avatar Jul 29 '25 05:07 itzPranshul

We need more information on the setups: Are you using custom certificates? Are these internal Helm repo servers? How are they deployed? Please add as much information as you can to this thread.

matheuscscp avatar Jul 29 '25 12:07 matheuscscp

I wonder how many seconds it took for the TLS handshake to time out 🤔

cappyzawa avatar Jul 30 '25 12:07 cappyzawa