
Enabling Istio ambient mesh on a Kubernetes namespace puts Redis replica pods into CrashLoopBackOff state

Open snps-tanvik opened this issue 8 months ago • 2 comments

Hi,

I installed Istio 1.24.2 in ambient mode on a GKE cluster.

  1. After enabling ambient mode on the namespace, not all pods come up. The Redis replica pods are in CrashLoopBackOff state.

  2. The ztunnel logs show errors for almost all pods deployed in the cluster, including Redis. Some examples:

error access connection complete src.addr=x.x.x.x:60704 src.workload="planeta-scass-scanner-medium-6645898d5b-47rsk" src.namespace="planeta-scan-service" src.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-scanner" dst.addr=x.x.x.x:15008 dst.hbone_addr=x.x.x.x:5672 dst.service="planeta-scass-rabbitmq.planeta-scan-service.svc.cluster.local" dst.workload="planeta-scass-rabbitmq-server-2" dst.namespace="planeta-scan-service" dst.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-rabbitmq-server" direction="outbound" bytes_sent=0 bytes_recv=0 duration="10001ms" error="http status: 503 Service Unavailable"

error access connection complete src.addr=x.x.x.x:50348 src.workload="planeta-scass-redis-replicas-0" src.namespace="planeta-scan-service" src.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-redis-replica" dst.addr=x.x.x.x:15008 dst.hbone_addr=x.x.x.x:6379 dst.service="planeta-scass-redis-headless.planeta-scan-service.svc.cluster.local" dst.workload="planeta-scass-redis-master-0" dst.namespace="planeta-scan-service" dst.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-redis-master" direction="inbound" bytes_sent=0 bytes_recv=0 duration="10001ms" error="connection failed: deadline has elapsed"
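
To make lines like these easier to compare, here is a minimal sketch that splits a ztunnel access-log line into its key=value fields and counts errors per destination workload. It assumes the format shown above (values are either bare tokens or double-quoted strings). The names `parse_ztunnel_line` and `summarize_errors` are illustrative only and are not part of Istio or ztunnel.

```python
import re
from collections import Counter

# key=value pairs; the value is either a double-quoted string or a bare token.
FIELD_RE = re.compile(r'([\w.]+)=("([^"]*)"|\S+)')


def parse_ztunnel_line(line):
    """Return the key=value fields of one access-log line as a dict."""
    fields = {}
    for key, raw, quoted in FIELD_RE.findall(line):
        # Strip the surrounding quotes when the value was quoted.
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


def summarize_errors(lines):
    """Count lines that carry an error, grouped by (dst.workload, direction, error)."""
    counts = Counter()
    for line in lines:
        fields = parse_ztunnel_line(line)
        if "error" in fields:
            key = (fields.get("dst.workload"), fields.get("direction"), fields["error"])
            counts[key] += 1
    return counts
```

Feeding it the output of the ztunnel pods' logs, one line per entry, groups the failures by destination, which helps show whether the 503s and the timeouts hit the same workloads.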

  3. There is an existing NetworkPolicy for Redis, installed as part of the Helm chart:
  egress:
  - ports:
    - port: 15008
      protocol: TCP
    - port: 6379
      protocol: TCP
  ingress:
  - ports:
    - port: 15008
      protocol: TCP
    - port: 6379
  policyTypes:
  - Ingress
  - Egress


The Redis replica pod logs the following error:

Unable to connect to MASTER: Resource temporarily unavailable
Connecting to MASTER planeta-scass-redis-master-0.planeta-scass-redis-headless.planeta-scan-service.svc.cluster.local:6379

If I remove "Egress" from the NetworkPolicy policyTypes, the Redis replica pod is still in CrashLoopBackOff, but its logs now show:

Connecting to MASTER planeta-scass-redis-master-0.planeta-scass-redis-headless.planeta-scan-service.svc.cluster.local:6379
 * MASTER <-> REPLICA sync started

I need help fixing the ztunnel network connection to Redis.

snps-tanvik, Feb 07 '25 22:02