
argocd-redis secret-init initcontainer timeout

Open ziouf opened this issue 1 year ago • 10 comments

Describe the bug

The secret-init initContainer fails to check/init the argocd/argocd-redis secret.

To Reproduce

kubectl apply -f https://github.com/argoproj/argo-cd/raw/master/manifests/install.yaml

Expected behavior

The secret-init initContainer should succeed in checking/initializing the argocd/argocd-redis secret.


Version

$ argocd version
argocd: v2.11.0+bc53266
  BuildDate: 2024-05-21T22:20:05Z
  GitCommit: bc53266591b632f1a1639ae458f31467446ffe48
  GitTreeState: clean
  GoVersion: go1.22.1
  Compiler: gc
  Platform: linux/amd64

Logs

$ kubectl describe pod/argocd-redis-565687fb7d-68xdd -n argocd 
Name:             argocd-redis-565687fb7d-68xdd
Namespace:        argocd
Priority:         0
Service Account:  argocd-redis
Node:             REDACTED
Start Time:       Wed, 22 May 2024 09:16:31 +0200
Labels:           app.kubernetes.io/name=argocd-redis
                  pod-template-hash=565687fb7d
Annotations:      cni.projectcalico.org/containerID: faff595beac8d993968609af65d1e133d01fa9b970a0e482301ef9b1b55e0b15
                  cni.projectcalico.org/podIP: 100.64.182.116/32
                  cni.projectcalico.org/podIPs: 100.64.182.116/32
                  kubectl.kubernetes.io/restartedAt: 2024-05-14T08:33:16+02:00
Status:           Pending
SeccompProfile:   RuntimeDefault
IP:               100.64.182.116
IPs:
  IP:           100.64.182.116
Controlled By:  ReplicaSet/argocd-redis-565687fb7d
Init Containers:
  secret-init:
    Container ID:    containerd://2e686a331f4f9a483160e70875f0057734a85a63065e48841b974ab5160957ee
    Image:           quay.io/argoproj/argocd:latest
    Image ID:        quay.io/argoproj/argocd@sha256:717a945c52f15cef5659b94bba3ab360f5a7b86685a978ace448f76f71063231
    Port:            <none>
    Host Port:       <none>
    SeccompProfile:  RuntimeDefault
    Command:
      argocd
      admin
      redis-initial-password
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    20
      Started:      Wed, 22 May 2024 09:25:21 +0200
      Finished:     Wed, 22 May 2024 09:25:51 +0200
    Ready:          False
    Restart Count:  6
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nftnx (ro)
Containers:
  redis:
    Container ID:  
    Image:         docker.io/library/redis:7.0.15-alpine
    Image ID:      
    Port:          6379/TCP
    Host Port:     0/TCP
    Args:
      --save
      
      --appendonly
      no
      --requirepass $(REDIS_PASSWORD)
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      REDIS_PASSWORD:  <set to the key 'auth' in secret 'argocd-redis'>  Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nftnx (ro)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-nftnx:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors: 
Tolerations:         node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  12m                   default-scheduler  Successfully assigned argocd/argocd-redis-565687fb7d-68xdd to REDACTED
  Normal   Pulled     9m23s (x5 over 12m)   kubelet            Container image "quay.io/argoproj/argocd:latest" already present on machine
  Normal   Created    9m23s (x5 over 12m)   kubelet            Created container secret-init
  Normal   Started    9m23s (x5 over 12m)   kubelet            Started container secret-init
  Warning  BackOff    2m45s (x31 over 11m)  kubelet            Back-off restarting failed container secret-init in pod argocd-redis-565687fb7d-68xdd_argocd(e00bcbf4-853f-44fe-8302-82f0997884a9)


$ kubectl logs pod/argocd-redis-565687fb7d-68xdd -c secret-init  -n argocd
Checking for initial Redis password in secret argocd/argocd-redis at key auth. 
time="2024-05-22T07:25:51Z" level=fatal msg="Post \"https://10.32.0.1:443/api/v1/namespaces/argocd/secrets\": dial tcp 10.32.0.1:443: i/o timeout"

ziouf avatar May 22 '24 07:05 ziouf

I also encountered the same behavior when following stable; I rolled back by installing v2.11.0:

kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/v2.11.0/manifests/install.yaml 

otakakot avatar May 22 '24 07:05 otakakot

Same here, also rolled back to 2.11.0

otherguy avatar May 22 '24 08:05 otherguy

Same here

NAVRockClimber avatar May 22 '24 08:05 NAVRockClimber

Hi, the issue is that Argo CD tries to create a secret for Redis via the Kubernetes API, but the current network policy does not allow the Argo CD Redis server to contact the Kubernetes API at all. As a workaround we temporarily changed the argocd-redis-network-policy NetworkPolicy. Its pod selector is:

podSelector:
  matchLabels:
    app.kubernetes.io/name: argocd-redis

Change the selector to something like app.kubernetes.io/name: argocd-redis-tmp, so the policy no longer matches the Redis pod.

After that, delete the Redis pod; it will now be able to create the secret and the system will start running. Once everything is running, revert the network policy selector (remove the "-tmp").
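The steps above can be sketched with `kubectl patch` (a sketch, assuming the default `argocd` namespace and the policy/label names from the stock manifests):

```shell
# Point the policy at a label no pod carries, so it stops matching
# (and stops restricting) the Redis pod:
kubectl -n argocd patch networkpolicy argocd-redis-network-policy \
  --type merge \
  -p '{"spec":{"podSelector":{"matchLabels":{"app.kubernetes.io/name":"argocd-redis-tmp"}}}}'

# Delete the Redis pod; on restart, secret-init can reach the API
# server and create the argocd/argocd-redis secret:
kubectl -n argocd delete pod -l app.kubernetes.io/name=argocd-redis

# Once everything is running, revert the selector:
kubectl -n argocd patch networkpolicy argocd-redis-network-policy \
  --type merge \
  -p '{"spec":{"podSelector":{"matchLabels":{"app.kubernetes.io/name":"argocd-redis"}}}}'
```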

liron-telemessage avatar May 22 '24 09:05 liron-telemessage

Download install.yaml and add Kubernetes API port (in my case 16443) to network policy:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: argocd-redis-network-policy
spec:
  egress:
    - ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
        - port: 16443
          protocol: TCP

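Instead of editing install.yaml, the same port can be appended in place with a JSON patch (a sketch; it assumes the DNS egress rule shipped in the manifests is the first entry in `spec.egress`, and you should substitute your own API server port for 16443):

```shell
# Append the API server port to the existing egress ports list:
kubectl -n argocd patch networkpolicy argocd-redis-network-policy \
  --type json \
  -p '[{"op":"add","path":"/spec/egress/0/ports/-","value":{"port":16443,"protocol":"TCP"}}]'
```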
Tomasz-Marciniak avatar May 22 '24 09:05 Tomasz-Marciniak

same here, fixed by https://github.com/argoproj/argo-cd/pull/18358

yyzxw avatar May 22 '24 10:05 yyzxw

I have same issue with argocd-redis-ha-haproxy

RispyCZ avatar May 22 '24 10:05 RispyCZ

We faced the same problem and also patched the argocd-redis-ha-proxy-network-policy NetworkPolicy as a workaround:

  - ports:
    - port: 443
      protocol: TCP

ftmiro avatar May 22 '24 11:05 ftmiro

In my case I had to use v2.11.0 and also update the NetworkPolicy as mentioned above.

ojasgo avatar May 23 '24 05:05 ojasgo

I had the same issue and patched the argocd-redis-network-policy NetworkPolicy as @Tomasz-Marciniak suggested.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: argocd-redis-network-policy
spec:
  egress:
    - ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
        - port: 6443 # your kubernetes api port
          protocol: TCP
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: argocd-server
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: argocd-repo-server
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: argocd-application-controller
      ports:
        - port: 6379
          protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/name: argocd-redis
  policyTypes:
    - Ingress
    - Egress
    

ngaxavi avatar May 23 '24 09:05 ngaxavi

I think I had the same issue on 2.11.1 (after an upgrade from 2.10) and the upgrade to 2.11.2 fixed it for me.

cnd4 avatar May 28 '24 07:05 cnd4

Fixed in 2.11.2

pasha-codefresh avatar May 28 '24 08:05 pasha-codefresh

Issue persists in Helm install

sirTangale avatar Jul 07 '24 11:07 sirTangale

Issue persists in Helm install

@sirTangale Can you raise an issue in argo-helm, please? 😃

Also, I'd like to mention that the Helm chart handles this slightly differently (via a Helm hook): before deploying the core components of Argo CD (server / repo-server / ...), Helm waits until the Secret is generated (i.e. the Job runs to completion without errors).
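If you want to check whether that hook ran cleanly on your cluster, something like the following should work (a sketch, assuming the Job keeps its default name `argocd-redis-secret-init` and the default `argocd` namespace):

```shell
# Wait for the hook Job to run to completion:
kubectl -n argocd wait --for=condition=complete \
  job/argocd-redis-secret-init --timeout=120s

# Confirm the secret it is supposed to create actually exists:
kubectl -n argocd get secret argocd-redis
```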

I would highly appreciate detailed steps to reproduce.

mkilchhofer avatar Jul 10 '24 12:07 mkilchhofer

Issue still persists in 2.12.6

bestrocker221 avatar Oct 26 '24 15:10 bestrocker221

Hello,

Same issue on GKE with argocd-server:

W1129 16:07:32.309506 7 reflector.go:424] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1.ConfigMap: configmaps is forbidden: User "system:serviceaccount:argocd:argocd-server" cannot list resource "configmaps" in API group "" in the namespace "argocd"
E1129 16:07:32.309849 7 reflector.go:140] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps is forbidden: User "system:serviceaccount:argocd:argocd-server" cannot list resource "configmaps" in API group "" in the namespace "argocd"

I tried to add an egress rule to argocd-repo-server-network-policy but it did not work!

DjarallahBrahim avatar Nov 29 '24 16:11 DjarallahBrahim

Issue persists in Helm install

this is still happening and is very annoying:

$ kubectl get events -n argocd --sort-by=.metadata.creationTimestamp | tail -n 20
22m         Normal    Created                   pod/argocd-redis-secret-init-f67b6           Created container secret-init
22m         Normal    Started                   pod/argocd-redis-secret-init-f67b6           Started container secret-init
22m         Normal    Created                   pod/argocd-redis-secret-init-f67b6           Created container istio-proxy
22m         Normal    Started                   pod/argocd-redis-secret-init-f67b6           Started container istio-proxy
22m         Normal    Pulled                    pod/argocd-redis-secret-init-f67b6           Container image "quay.io/argoproj/argocd:v2.12.4" already present on machine
13m         Normal    Killing                   pod/argocd-redis-secret-init-f67b6           Stopping container istio-proxy
13m         Warning   Unhealthy                 pod/argocd-redis-secret-init-f67b6           Readiness probe failed: HTTP probe failed with statuscode: 503
13m         Normal    Scheduled                 pod/argocd-redis-secret-init-66d4g           Successfully assigned argocd/argocd-redis-secret-init-66d4g to ip-10-1-3-138.us-gov-east-1.compute.internal
13m         Normal    Created                   pod/argocd-redis-secret-init-66d4g           Created container istio-validation
13m         Normal    SuccessfulCreate          job/argocd-redis-secret-init                 Created pod: argocd-redis-secret-init-66d4g
13m         Normal    Pulled                    pod/argocd-redis-secret-init-66d4g           Container image "docker.io/istio/proxyv2:1.19.4" already present on machine
13m         Normal    Started                   pod/argocd-redis-secret-init-66d4g           Started container istio-validation
13m         Normal    Pulling                   pod/argocd-redis-secret-init-66d4g           Pulling image "quay.io/argoproj/argocd:v2.14.8"
13m         Normal    Created                   pod/argocd-redis-secret-init-66d4g           Created container istio-proxy
13m         Normal    Pulled                    pod/argocd-redis-secret-init-66d4g           Container image "docker.io/istio/proxyv2:1.19.4" already present on machine
13m         Normal    Started                   pod/argocd-redis-secret-init-66d4g           Started container secret-init
13m         Normal    Created                   pod/argocd-redis-secret-init-66d4g           Created container secret-init
13m         Normal    Pulled                    pod/argocd-redis-secret-init-66d4g           Successfully pulled image "quay.io/argoproj/argocd:v2.14.8" in 7.700049271s (7.700060673s including waiting)
13m         Normal    Started                   pod/argocd-redis-secret-init-66d4g           Started container istio-proxy
13m         Normal    Pulled                    pod/argocd-redis-secret-init-66d4g           Container image "quay.io/argoproj/argocd:v2.14.8" already present on machine

kubectl get jobs -n argocd
NAME                       COMPLETIONS   DURATION   AGE
argocd-redis-secret-init   0/1           5m15s      5m15s

zhaque44 avatar Apr 01 '25 23:04 zhaque44

...
13m         Normal    Scheduled                 pod/argocd-redis-secret-init-66d4g           Successfully assigned argocd/argocd-redis-secret-init-66d4g to ip-10-1-3-138.us-gov-east-1.compute.internal
13m         Normal    Created                   pod/argocd-redis-secret-init-66d4g           Created container istio-validation
13m         Normal    SuccessfulCreate          job/argocd-redis-secret-init                 Created pod: argocd-redis-secret-init-66d4g
13m         Normal    Pulled                    pod/argocd-redis-secret-init-66d4g           Container image "docker.io/istio/proxyv2:1.19.4" already present on machine
13m         Normal    Started                   pod/argocd-redis-secret-init-66d4g           Started container istio-validation
...

@zhaque44

  1. you are using the Helm chart
  2. you are using Istio with sidecar injection

Can you open an issue with detailed repro steps (including some information about your cluster/Istio setup) at the argo-helm repo over there: https://github.com/argoproj/argo-helm/issues/new/choose?

We will try to help you, but we need to be able to reproduce it with a simple setup (e.g. in a KinD environment).

mkilchhofer avatar Apr 02 '25 10:04 mkilchhofer

Strangely enough, I had the same problem last night. I am bootstrapping two clusters, one on x86 and one on arm64, using the exact same scripts to set them up, though admittedly there can be slight differences in the underlying OS due to one being Raspbian (Debian Bookworm) and the other being Ubuntu 24.04.2. The x86 cluster worked like a champ with the same Helm version, but the arm64 cluster hits this snag. I can create a new bug... but I am not using Istio...

jurlwin avatar Apr 02 '25 12:04 jurlwin


@mkilchhofer hey all, this worked:

redisSecretInit:
  enabled: true
  podAnnotations:
    sidecar.istio.io/inject: "false"
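For reference, the values above can be applied to an existing release roughly like this (a sketch, assuming the release is named `argocd`, the chart comes from the `argo` Helm repo, and the snippet is saved as `values.yaml`):

```shell
# Apply the istio sidecar-injection opt-out on top of the current values:
helm upgrade argocd argo/argo-cd -n argocd --reuse-values -f values.yaml
```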

zhaque44 avatar Apr 02 '25 14:04 zhaque44