Receiving chart pull error on environment with a proxy - EOF

Open Valgueiro opened this issue 1 year ago • 9 comments

Environment

My k8s cluster is deployed behind a firewall that only allows outbound connections through a proxy on the same network.

Setup

Flux version: v2.1.2
Source controller version: 1.1.2

I've set up the gotk kustomization as follows so that the controllers can use the proxy to fetch things.

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
patches:
  - patch: |
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: all
      spec:
        template:
          spec:
            containers:
              - name: manager
                env:
                  - name: "HTTPS_PROXY"
                    value: "http://proxy.com:3128"
                  - name: "NO_PROXY"
                    value: ".cluster.local.,.cluster.local,cluster.local,.svc,127.0.0.0/8,10.0.0.0/8"  
                  - name: "https_proxy"
                    value: "http://proxy.com:3128"
                  - name: "no_proxy"
                    value: ".cluster.local.,.cluster.local,cluster.local,.svc,127.0.0.0/8,10.0.0.0/8"     
    target:
      kind: Deployment
      labelSelector: app.kubernetes.io/part-of=flux

My HelmRelease and HelmRepository are configured like this:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: keda
  namespace: keda
spec:
  interval: 5m0s
  releaseName: keda
  install:
    createNamespace: true
  chart:
    spec:
      chart: keda
      version: '2.12.1'
      sourceRef:
        kind: HelmRepository
        name: charts
        namespace: keda
  valuesFrom:
  - kind: ConfigMap
    name: keda-values
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: charts
  namespace: keda
spec:
  type: "oci"
  interval: 5m0s
  url: oci://<acr>/sre/charts/
  secretRef:
    name: registry-pull-secret
  certSecretRef:
    name: tls-ca

My HelmRepository is showing as active, but the HelmChart is stuck in "Reconciling" with the following error:

chart pull error: failed to download chart for remote reference: failed to get 'oci://<acr>/sre/charts/keda:2.12.1': failed to do request: Head "https://<acr>/v2/sre/charts/keda/manifests/2.12.1": EOF

I thought this could be related to this issue about http_proxy on busybox images: https://github.com/mirror/busybox/issues/21, so I tried using this Docker image as the source-controller:

FROM <acr>/sre/fluxcd/source-controller:v1.1.2
USER root

COPY zscaler.crt /etc/ssl/certs/
RUN update-ca-certificates

RUN apk --no-cache -U add openssl wget ca-certificates
# wget https://httpbin.org/get

USER 65534:65534

But I continued to receive the same error.

Do you guys have any idea of what I can do to fix this?

Valgueiro • May 13 '24 22:05

Other things that can be useful here:

  1. The same setup works when I remove the firewall and proxy from the architecture.
  2. This is the output when I try to do a HEAD request from inside the source-controller container:
~ $ wget --spider https://<acr>/v2/sre/rancher-alerting-drivers/manifests/102.1.0
Spider mode enabled. Check if remote file exists.
--2024-05-13 22:07:41--  https://<acr>/v2/sre/rancher-alerting-drivers/manifests/102.1.0
Resolving proxy.com ( proxy.com)... <proxy-ip>
Connecting to proxy.com (proxy.com)|<proxy-ip>|:3128... connected.
Proxy request sent, awaiting response... 401 Unauthorized
  3. I tried to debug the code myself but couldn't get much further. From what I could understand, the error originates here: https://github.com/fluxcd/source-controller/blob/f8eea53bda618099f7f633ae289c8200b0cb3555/internal/helm/chart/builder_remote.go#L161, more specifically in the call to Client.get: https://github.com/fluxcd/source-controller/blob/f8eea53bda618099f7f633ae289c8200b0cb3555/internal/helm/repository/chart_repository.go#L281

Valgueiro • May 13 '24 22:05

I just confirmed with tcpdump that source-controller is sending requests directly to the OCI URL without going through the proxy. This should not happen, since the proxy is set up on the Flux services as the docs suggest.

Valgueiro • May 15 '24 19:05

Can you please try with an OCIRepository and see if that works? There is an example here: https://fluxcd.io/blog/2024/05/flux-v2.3.0/#enhanced-helm-oci-support

stefanprodan • May 15 '24 19:05

I believe this is fixed in https://github.com/helm/helm/commit/94c1deae6d5a43491c5a4e8444ecd8273a8122a1. Upgrading Helm to v3.15.0 in source-controller should resolve this.

souleb • May 16 '24 07:05

Switching to OCIRepo and HelmRelease v2 should work as we don’t use the Helm getter in OCIRepo.

stefanprodan • May 16 '24 09:05

I tried simply updating to the latest Flux version, which uses a version of Helm that already contains the fix (source-controller 1.3.0 points to Helm 3.14.4), while keeping the HelmRepository, and it did not work. I will give the OCIRepository a try.

Valgueiro • May 16 '24 12:05

As I wrote above, it is fixed in Helm v3.15.0. We have not updated Flux to that version yet. I would try Stefan's suggestion on Flux v2.3.0.

souleb • May 16 '24 13:05

As I wrote above, the fix is already in Flux 2.3.0. Even the author of the fix bumped another repository to Helm 3.14.4 to resolve his own issue. As you can see from the link to the 3.14.4 code, the change is already there, which means that 2.3.0 already has this fix.

So bumping Helm to 3.15 in the future is unlikely to solve the issue I am facing.

Valgueiro • May 16 '24 14:05

Thanks @Valgueiro, indeed we instantiate our own http.Transport. This will be fixed in the next Flux minor.

souleb • May 16 '24 22:05