application-gateway-kubernetes-ingress

1.22 clusters use the 1.5.0-rc1 version instead of the final 1.5.0

Open davidkarlsen opened this issue 2 years ago • 4 comments

Describe the bug
It seems strange to run RCs instead of final versions.

To Reproduce
Steps to reproduce the behavior:

  • Install a v1.21.7 cluster, then upgrade to 1.22.4.
  • Observe that the ingress controller is upgraded from 1.4 to 1.5.0-rc1.

Ingress Controller details

  • Output of kubectl describe pod <ingress controller>. The pod name can be obtained by running helm list.
Node-Selectors:              <none>
Tolerations:                 :NoExecute op=Exists
                             :NoSchedule op=Exists
                             CriticalAddonsOnly op=Exists
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  27s   default-scheduler  Successfully assigned kube-system/ingress-appgw-deployment-69b856df-kq279 to aks-nodepool-35694215-vmss000000
  Normal  Pulled     27s   kubelet            Container image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.5.0-rc1" already present on machine
  Normal  Created    27s   kubelet            Created container ingress-appgw-container
  Normal  Started    27s   kubelet            Started container ingress-appgw-container
vmadmin@nic-dts-dev-kigx-vm-0:~$ 
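
The "Pulled" event above is where the running image tag shows up. As a minimal sketch (not part of AGIC or kubectl), the check "is this a release candidate?" could be automated as below. The helper names `image_tag` and `is_prerelease` are hypothetical, and the pre-release test assumes the tag follows the semver convention of a hyphenated suffix such as `-rc1`.

```python
def image_tag(image):
    """Return the tag of a container image reference, or None if untagged.

    Only the last path segment is inspected, so a registry port such as
    "registry:5000/app" is not mistaken for a tag. A digest ("@sha256:...")
    is dropped before looking for the tag.
    """
    name = image.rsplit("/", 1)[-1]
    name = name.split("@", 1)[0]
    _, sep, tag = name.partition(":")
    return tag if sep else None


def is_prerelease(tag):
    """Assumption: semver-style tags, where a hyphen introduces a pre-release."""
    return "-" in tag


# Image reference copied from the "Pulled" event in the output above.
image = "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.5.0-rc1"
tag = image_tag(image)
running_prerelease = is_prerelease(tag)
```

With the image from this report, `tag` is the part after the last colon and `running_prerelease` is true; a final tag such as 1.5.0 would not be flagged.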

davidkarlsen • Feb 05 '22 22:02

Does AGIC work for you with 1.22?

sec • Feb 09 '22 16:02

Yes, it works. We have the appgw in a different vnet from the cluster, so it only works when running azure-cni, not with kubenet. But that's not related to the version.

davidkarlsen • Feb 09 '22 17:02

@sec the addon is broken in 1.22.

1.22 removes v1beta1.Ingress, but the version the addon uses in 1.22 is 1.5.0-rc. This version of the AGIC image tries to watch for v1beta1.Ingress, which causes the issue in #1342. When I manually changed the image to 1.5.0, the v1beta1.Ingress issue went away, but the service account isn't correctly configured and I get this error:

Failed to watch *v1.IngressClass: failed to list *v1.IngressClass: ingressclasses.networking.k8s.io is forbidden: User "system:serviceaccount:kube-system:ingress-appgw-sa" cannot list resource "ingressclasses" in API group "networking.k8s.io" at the cluster scope
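
The error above already names everything needed to work out which permission is missing. A rough sketch of pulling those fields out of the message, assuming the message keeps the shape quoted here (the `parse_forbidden` helper and its regular expression are illustrative, not something AGIC or Kubernetes provides):

```python
import re

# Assumption: the message has the same shape as the one quoted in this
# thread. Differently worded "forbidden" messages may not match.
FORBIDDEN_RE = re.compile(
    r'is forbidden: User "(?P<user>[^"]+)" '
    r'cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
)


def parse_forbidden(message):
    """Return the denied user, verb, resource and API group, or None."""
    match = FORBIDDEN_RE.search(message)
    if match is None:
        return None
    return {
        "user": match.group("user"),
        "verb": match.group("verb"),
        "resource": match.group("resource"),
        "api_group": match.group("group"),
    }


# Error text copied from the comment above.
error = (
    "Failed to watch *v1.IngressClass: failed to list *v1.IngressClass: "
    "ingressclasses.networking.k8s.io is forbidden: "
    'User "system:serviceaccount:kube-system:ingress-appgw-sa" '
    'cannot list resource "ingressclasses" '
    'in API group "networking.k8s.io" at the cluster scope'
)
denied = parse_forbidden(error)
```

Here `denied` identifies the service account `ingress-appgw-sa` in `kube-system`, the verb `list`, the resource `ingressclasses` and the API group `networking.k8s.io`, which is the access the service account's role does not grant.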

avbanks • Feb 14 '22 17:02

Is this issue getting any attention?

I upgraded a cluster and am facing the exact same issue described by avbanks. Manually changing the image from 1.5.0-rc to 1.5.1 now results in the same 'forbidden' error for the appgw service account. Editing the cluster role to include get/list/watch for ingressclasses under the API group networking.k8s.io fixes the issue entirely and restores the health of the backends on the app gateway.
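
To make the cluster role change concrete, here is a minimal sketch of the added rule and why it resolves the 'forbidden' error. The rule uses the field names of a Kubernetes RBAC rule (`apiGroups`, `resources`, `verbs`). The `before` list is a hypothetical stand-in for the add-on's existing role, which is not shown in this thread, and the matcher is a simplification that only handles exact names and the `*` wildcard.

```python
# Rule corresponding to the fix described above.
INGRESSCLASS_RULE = {
    "apiGroups": ["networking.k8s.io"],
    "resources": ["ingressclasses"],
    "verbs": ["get", "list", "watch"],
}


def rule_allows(rule, api_group, resource, verb):
    """Simplified RBAC match: exact names or the "*" wildcard only."""
    def matches(allowed, wanted):
        return "*" in allowed or wanted in allowed

    return (
        matches(rule.get("apiGroups", []), api_group)
        and matches(rule.get("resources", []), resource)
        and matches(rule.get("verbs", []), verb)
    )


def role_allows(rules, api_group, resource, verb):
    """A role grants access if any one of its rules matches."""
    return any(rule_allows(r, api_group, resource, verb) for r in rules)


# Hypothetical pre-fix role: covers ingresses but not ingressclasses.
before = [
    {
        "apiGroups": ["networking.k8s.io"],
        "resources": ["ingresses"],
        "verbs": ["get", "list", "watch"],
    }
]
after = before + [INGRESSCLASS_RULE]

denied_before = not role_allows(before, "networking.k8s.io", "ingressclasses", "list")
allowed_after = role_allows(after, "networking.k8s.io", "ingressclasses", "list")
```

RBAC rules are purely additive, so appending the one rule is enough; nothing in the existing role has to change.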

The problem is that this change is automatically reconciled and overwritten on a schedule or trigger. We are using the native AGIC integration for AKS and are not deploying the helm charts. Looking at the helm charts, it looks like this is fixed in 1.5.1, but the native AGIC integration is not using this version yet...

Edit: Disabled the AGIC integration through ARM and set it up manually through Helm with the 1.5.1 version and now it works as expected.

jan-delaet • Mar 10 '22 19:03