Enabling Istio ambient service mesh on a Kubernetes namespace puts Redis replica pods into CrashLoopBackOff state
Hi,
I installed Istio service mesh version 1.24.2 in ambient mode on a GKE cluster.
- After enabling it on the namespace, I cannot get all the pods up and running. The Redis replica pods are in CrashLoopBackOff state.
- The ztunnel logs show errors for almost all pods deployed in the cluster, including Redis. Some examples:
error access connection complete src.addr=x.x.x.x:60704 src.workload="planeta-scass-scanner-medium-6645898d5b-47rsk" src.namespace="planeta-scan-service" src.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-scanner" dst.addr=x.x.x.x:15008 dst.hbone_addr=x.x.x.x:5672 dst.service="planeta-scass-rabbitmq.planeta-scan-service.svc.cluster.local" dst.workload="planeta-scass-rabbitmq-server-2" dst.namespace="planeta-scan-service" dst.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-rabbitmq-server" direction="outbound" bytes_sent=0 bytes_recv=0 duration="10001ms" error="http status: 503 Service Unavailable"
error access connection complete src.addr=x.x.x.x:50348 src.workload="planeta-scass-redis-replicas-0" src.namespace="planeta-scan-service" src.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-redis-replica" dst.addr=x.x.x.x:15008 dst.hbone_addr=x.x.x.x:6379 dst.service="planeta-scass-redis-headless.planeta-scan-service.svc.cluster.local" dst.workload="planeta-scass-redis-master-0" dst.namespace="planeta-scan-service" dst.identity="spiffe://cluster.local/ns/planeta-scan-service/sa/planeta-scass-redis-master" direction="inbound" bytes_sent=0 bytes_recv=0 duration="10001ms" error="connection failed: deadline has elapsed"
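To compare many of these entries at once, the key=value fields can be pulled out of each line and grouped by source, destination, direction, and error. The following is only a rough sketch: `parse_fields` and `summarize` are hypothetical helper names, and the sample lines are shortened copies of the two entries above, not new log output.

```python
import re
from collections import Counter

# Matches key=value pairs where the value is either double-quoted or a bare token.
FIELD_RE = re.compile(r'([\w.]+)=(?:"([^"]*)"|(\S+))')


def parse_fields(line):
    """Return the key=value fields of one ztunnel access log line as a dict."""
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        # Only one of the two value groups is non-empty for a given match.
        fields[key] = quoted if quoted else bare
    return fields


def summarize(lines):
    """Count failed connections per (source, destination, direction, error)."""
    counts = Counter()
    for line in lines:
        f = parse_fields(line)
        if "error" in f:
            key = (f.get("src.workload"), f.get("dst.workload"),
                   f.get("direction"), f["error"])
            counts[key] += 1
    return counts


# Shortened copies of the two log lines above (most fields omitted).
SAMPLE = [
    'error access connection complete src.addr=x.x.x.x:60704 '
    'src.workload="planeta-scass-scanner-medium-6645898d5b-47rsk" '
    'dst.addr=x.x.x.x:15008 dst.hbone_addr=x.x.x.x:5672 '
    'dst.workload="planeta-scass-rabbitmq-server-2" direction="outbound" '
    'error="http status: 503 Service Unavailable"',
    'error access connection complete src.addr=x.x.x.x:50348 '
    'src.workload="planeta-scass-redis-replicas-0" '
    'dst.addr=x.x.x.x:15008 dst.hbone_addr=x.x.x.x:6379 '
    'dst.workload="planeta-scass-redis-master-0" direction="inbound" '
    'error="connection failed: deadline has elapsed"',
]

if __name__ == "__main__":
    for key, count in summarize(SAMPLE).items():
        print(count, key)
```

Grouping this way makes it easier to see whether the failures are limited to one direction or one destination port.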
- There is an existing NetworkPolicy for Redis, created as part of the Helm chart installation:
egress:
- ports:
  - port: 15008
    protocol: TCP
  - port: 6379
    protocol: TCP
ingress:
- ports:
  - port: 15008
    protocol: TCP
  - port: 6379
policyTypes:
- Ingress
- Egress
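One thing that may be worth checking is which ports the egress rule actually covers, since the replica reaches the master through a DNS name. The sketch below assumes that the fragment above is the complete rule set and that no other NetworkPolicy selects the pod. `REQUIRED_EGRESS`, including DNS on port 53, is an assumption for illustration and is not confirmed by the logs.

```python
# Ports copied from the NetworkPolicy fragment above.
POLICY = {
    "egress": [{"ports": [{"port": 15008, "protocol": "TCP"},
                          {"port": 6379, "protocol": "TCP"}]}],
    "ingress": [{"ports": [{"port": 15008, "protocol": "TCP"},
                           {"port": 6379}]}],
}


def allowed(policy, direction):
    """Return the (port, protocol) pairs listed for 'ingress' or 'egress'.

    Simplification: rules without a ports list (which allow every port)
    are not handled here.
    """
    pairs = set()
    for rule in policy.get(direction, []):
        for entry in rule.get("ports", []):
            # Kubernetes defaults the protocol to TCP when it is omitted.
            pairs.add((entry["port"], entry.get("protocol", "TCP")))
    return pairs


# Hypothetical list of what the replica must be able to send.
# DNS on port 53 is an assumption, not taken from the policy or the logs.
REQUIRED_EGRESS = {(6379, "TCP"), (15008, "TCP"), (53, "UDP"), (53, "TCP")}

missing = REQUIRED_EGRESS - allowed(POLICY, "egress")

if __name__ == "__main__":
    print("egress ports not covered by the policy:", sorted(missing))
```

If the real policy has more rules than the fragment shown, `POLICY` would need to be extended before the result means anything.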
The Redis replica pod shows the following error:
Unable to connect to MASTER: Resource temporarily unavailable
Connecting to MASTER planeta-scass-redis-master-0.planeta-scass-redis-headless.planeta-scan-service.svc.cluster.local:6379
If I remove "Egress" from the NetworkPolicy policyTypes, the Redis replica pod is still in CrashLoopBackOff, but the logs now show:
Connecting to MASTER planeta-scass-redis-master-0.planeta-scass-redis-headless.planeta-scan-service.svc.cluster.local:6379
* MASTER <-> REPLICA sync started
I need help fixing the ztunnel network connection to Redis.