application-gateway-kubernetes-ingress

AGIC pod is not ready after upgrading AKS cluster to 1.22

Open Shawn71 opened this issue 3 years ago • 3 comments

Describe the bug

We ran the command below to enable automatic upgrades of the AKS cluster to a stable version. After the AKS cluster was upgraded to 1.22, the AGIC pod status is not ready, with the following error:

E0505 10:47:21.279139 1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1beta1.Ingress: the server could not find the requested resource (get ingresses.extensions)
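The error above says the API server can no longer find Ingress under the extensions group (get ingresses.extensions). Kubernetes 1.22 stopped serving the v1beta1 Ingress APIs, so a controller that still lists *v1beta1.Ingress would be expected to fail this way. A minimal sketch for checking this from a shell, assuming kubectl is pointed at the affected cluster; both function names are hypothetical helpers and not part of AGIC:

```shell
# Hypothetical helper: succeeds if the log text on stdin contains the
# "Failed to list *v1beta1.Ingress" reflector error quoted in this issue.
has_v1beta1_ingress_error() {
  # -F treats the pattern as a fixed string, so "*" is matched literally.
  grep -q -F 'Failed to list *v1beta1.Ingress'
}

# Hypothetical helper: lists the resources the API server currently serves
# in the two API groups that have carried the Ingress kind.
list_served_ingress_apis() {
  kubectl api-resources --api-group=networking.k8s.io --no-headers
  kubectl api-resources --api-group=extensions --no-headers
}
```

For example, `kubectl logs agic-ingress-azure-56d44bc5f6-vhpwz | has_v1beta1_ingress_error && echo "removed Ingress API in use"` would flag the pod from this report, and `list_served_ingress_apis` shows which groups still offer Ingress on the upgraded cluster.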

To Reproduce

We ran the command below to enable automatic upgrades of the AKS cluster to a stable version:

az aks update --resource-group myResourceGroup --name myAKSCluster --auto-upgrade-channel stable
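Because the upgrade happens automatically once the channel is set, it can help to confirm which Kubernetes version the cluster is actually running. A minimal sketch, assuming the Azure CLI is logged in to the right subscription; the function name is hypothetical, and the two property names in the query (kubernetesVersion, autoUpgradeProfile.upgradeChannel) are assumptions about the shape of the `az aks show` output:

```shell
# Hypothetical helper: print the cluster's Kubernetes version and its
# auto-upgrade channel. $1 = resource group, $2 = cluster name.
aks_version_and_channel() {
  az aks show \
    --resource-group "$1" \
    --name "$2" \
    --query '{version: kubernetesVersion, channel: autoUpgradeProfile.upgradeChannel}' \
    --output json
}
```

With the names from this report, the call would be `aks_version_and_channel myResourceGroup myAKSCluster`.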

Ingress Controller details

  • Output of kubectl describe pod <ingress controller>. The pod name can be obtained by running helm list.

    Name:           agic-ingress-azure-56d44bc5f6-vhpwz
    Namespace:      default
    Priority:       0
    Node:           aks-nodepool1-34854773-vmss000007/172.16.0.4
    Start Time:     Mon, 09 May 2022 17:33:46 +0800
    Labels:         app=ingress-azure
                    pod-template-hash=56d44bc5f6
                    release=agic
    Annotations:    checksum/config: 5e9453d67ca930c397d10a41863b86aa13c011a0d4ff1bdf96e81b7747d23027
                    prometheus.io/port: 8123
                    prometheus.io/scrape: true
    Status:         Running
    IP:             172.16.0.18
    IPs:
      IP:           172.16.0.18
    Controlled By:  ReplicaSet/agic-ingress-azure-56d44bc5f6
    Containers:
      ingress-azure:
        Container ID:   containerd://c9e8a62da0675197e175c96af0e55119a18ddf349cf6ce751d9821288fc6191b
        Image:          mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.3.0
        Image ID:       mcr.microsoft.com/azure-application-gateway/kubernetes-ingress@sha256:28cbb5581b775523821742119f57b25dd584697b6c1c05c5ddeabf2fb59f37c7
        Port:
        Host Port:
        State:          Running
          Started:      Mon, 09 May 2022 17:34:54 +0800
        Ready:          False
        Restart Count:  0
        Liveness:       http-get http://:8123/health/alive delay=15s timeout=1s period=20s #success=1 #failure=3
        Readiness:      http-get http://:8123/health/ready delay=5s timeout=1s period=10s #success=1 #failure=3
        Environment Variables from:
          agic-cm-ingress-azure  ConfigMap  Optional: false
        Environment:
          AZURE_CLOUD_PROVIDER_LOCATION:  /etc/appgw/azure.json
          AGIC_POD_NAME:                  agic-ingress-azure-56d44bc5f6-vhpwz (v1:metadata.name)
          AGIC_POD_NAMESPACE:             default (v1:metadata.namespace)
          AZURE_AUTH_LOCATION:            /etc/Azure/Networking-AppGW/auth/armAuth.json
        Mounts:
          /etc/Azure/Networking-AppGW/auth from networking-appgw-k8s-azure-service-principal-mount (ro)
          /etc/appgw/ from azure (ro)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-l8tw7 (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      azure:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/kubernetes/
        HostPathType:  Directory
      networking-appgw-k8s-azure-service-principal-mount:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  networking-appgw-k8s-azure-service-principal
        Optional:    false
      kube-api-access-l8tw7:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:
        DownwardAPI:             true
    QoS Class:       BestEffort
    Node-Selectors:
    Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                     node.kubernetes.io/unreachable:NoExecute for 300s
    Events:
      Type     Reason     Age                    From                                        Message
      Normal   Scheduled                                                                     Successfully assigned default/agic-ingress-azure-56d44bc5f6-vhpwz to aks-nodepool1-34854773-vmss000007
      Normal   Pulling    6m3s                   kubelet, aks-nodepool1-34854773-vmss000007  Pulling image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.3.0"
      Normal   Pulled     4m56s                  kubelet, aks-nodepool1-34854773-vmss000007  Successfully pulled image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.3.0" in 1m6.76835833s
      Normal   Created    4m56s                  kubelet, aks-nodepool1-34854773-vmss000007  Created container ingress-azure
      Normal   Started    4m56s                  kubelet, aks-nodepool1-34854773-vmss000007  Started container ingress-azure
      Warning  Unhealthy  63s (x26 over 4m43s)   kubelet, aks-nodepool1-34854773-vmss000007  Readiness probe failed: Get "http://172.16.0.18:8123/health/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  • Output of kubectl logs <ingress controller>.
  • Any Azure support tickets associated with this issue.
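The details requested in the list above can be gathered in one pass. A minimal sketch, assuming kubectl is pointed at the affected cluster and that the AGIC pod carries the app=ingress-azure label shown in the describe output; the function name and its defaults are hypothetical:

```shell
# Hypothetical helper: gather AGIC pod status, description, and recent logs.
# $1 = namespace (default: default), $2 = number of log lines (default: 200)
collect_agic_diagnostics() {
  agic_ns="${1:-default}"
  agic_tail="${2:-200}"
  # Pod status, including the READY column.
  kubectl get pods --namespace "$agic_ns" --selector app=ingress-azure --output wide
  # Full description, including probe configuration and events.
  kubectl describe pods --namespace "$agic_ns" --selector app=ingress-azure
  # Most recent controller log lines.
  kubectl logs --namespace "$agic_ns" --selector app=ingress-azure --tail="$agic_tail"
}
```

Running `collect_agic_diagnostics default 500 > agic-diagnostics.txt` would produce a single file that can be attached to the issue.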

Is there any clue on how to fix this issue? Thanks in advance.

Shawn71 May 10 '22 11:05

@ShawnBian Can you please share the AGIC logs?

akshaysngupta May 23 '22 21:05

We hit this issue with version 1.4.0. Upgrading to the latest version fixes it.

ctmillerlin Jun 21 '22 13:06

We were able to work around this problem by deploying version 1.5.2 of the helm chart and setting the image tag to 1.5.2 (--set image.tag=1.5.2).

Should the following line default the image tag to 1.5.2, at least for the 1.5.2 chart? https://github.com/Azure/application-gateway-kubernetes-ingress/blob/6983c6bc68322a85e055f499b0180bfe3535c484/helm/ingress-azure/values.yaml#L14
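The workaround described in this comment could look roughly like the sketch below. The release name agic comes from the release=agic label in the describe output; the repository alias application-gateway-kubernetes-ingress is an assumption (it must already have been added with helm repo add), and the function name is hypothetical:

```shell
# Hypothetical helper: upgrade the AGIC release to a given chart version and
# pin the image tag to the same version.
# $1 = release name (default: agic), $2 = chart/image version (default: 1.5.2)
upgrade_agic_chart() {
  agic_release="${1:-agic}"
  agic_version="${2:-1.5.2}"
  # Refresh the local chart index so the requested version can be found.
  helm repo update || return 1
  # --reuse-values keeps the values from the existing release; review them
  # if the chart's values schema changed between versions.
  helm upgrade "$agic_release" application-gateway-kubernetes-ingress/ingress-azure \
    --version "$agic_version" \
    --set image.tag="$agic_version" \
    --reuse-values
}
```

Calling `upgrade_agic_chart` with no arguments applies the 1.5.2 chart and image tag to the agic release; `kubectl get pods --selector app=ingress-azure` can then be used to check whether the pod becomes ready.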

markrzasa Aug 11 '22 23:08