application-gateway-kubernetes-ingress
Code="ErrorApplicationGatewayForbidden" Message="Unexpected status code '403' while performing a GET on Application Gateway
Describe the bug
When a new AKS cluster and Application Gateway have been deployed and the AGIC add-on has been enabled, AGIC does not work as expected. I have created a couple of clusters (all in the westeurope region) by following the article https://docs.microsoft.com/en-us/azure/application-gateway/tutorial-ingress-controller-add-on-existing, but I got this error in the ingress controller pod's logs:
E0909 21:30:44.450072 1 client.go:170] Code="ErrorApplicationGatewayForbidden" Message="Unexpected status code '403' while performing a GET on Application Gateway. You can use 'az role assignment create --role Reader --scope /subscriptions/e5c8b5e5-9c21-4fc3-81bc-4159bc41cd9b/resourceGroups/rg1-ozozturk --assignee c4a8000d-ffbd-4d68-b0d5-f2f448e65bc5; az role assignment create --role Contributor --scope /subscriptions/xxxxxxxxxxxx/resourceGroups/rg1-ozozturk/providers/Microsoft.Network/applicationGateways/appgw1-ozozturk --assignee xxxxxxxxxxxx' to assign permissions. AGIC Identity needs atleast has 'Contributor' access to Application Gateway 'appgw1-ozozturk' and 'Reader' access to Application Gateway's Resource Group 'rg1-ozozturk'." InnerError="network.ApplicationGatewaysClient#Get: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client 'xxxxxxxxxxxx' with object id 'xxxxxxxxxxxx' does not have authorization to perform action 'Microsoft.Network/applicationGateways/read' over scope '/subscriptions/xxxxxxxxxxxx/resourceGroups/rg1-ozozturk/providers/Microsoft.Network/applicationGateways/appgw1-ozozturk' or the scope is invalid. If access was recently granted, please refresh your credentials.""
I0909 21:30:44.450091 1 retry.go:33] Retrying in 10s
By the way, if I follow what the log says and give the managed identity 'Contributor' access to the Application Gateway and 'Reader' access to the Application Gateway's resource group, it works.
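For anyone who wants to script that workaround, here is a minimal sketch that only prints the two `az role assignment create` commands quoted in the log message, so they can be reviewed before running. The function name `agic_role_commands` and its argument order are hypothetical; the roles, scopes, and flags are taken from the error text above.

```shell
# Hypothetical helper: prints (does not run) the two role assignments
# that the AGIC error message asks for.
# Arguments: subscription ID, resource group, gateway name, identity object ID.
agic_role_commands() {
  local sub="$1" rg="$2" appgw="$3" assignee="$4"
  # Reader is scoped to the Application Gateway's resource group.
  local rg_scope="/subscriptions/${sub}/resourceGroups/${rg}"
  # Contributor is scoped to the Application Gateway itself.
  local gw_scope="${rg_scope}/providers/Microsoft.Network/applicationGateways/${appgw}"
  printf 'az role assignment create --role Reader --scope %s --assignee %s\n' "$rg_scope" "$assignee"
  printf 'az role assignment create --role Contributor --scope %s --assignee %s\n' "$gw_scope" "$assignee"
}
```

Calling `agic_role_commands <subscription-id> rg1-ozozturk appgw1-ozozturk <identity-object-id>` prints both commands; pipe the output to `sh` only after checking that the assignee is the identity named in the error.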
To Reproduce
- Follow the tutorial and create an AKS cluster and an Application Gateway. Enable AGIC and try to deploy an ingress object. It will not return any IP address, and you'll see the error in the output of `kubectl logs`. https://docs.microsoft.com/en-us/azure/application-gateway/tutorial-ingress-controller-add-on-existing
kubectl get ingress
NAME HOSTS ADDRESS PORTS AGE
aspnetapp * 80 5s
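A quick way to confirm that an empty ADDRESS column is caused by this permission problem is to scan the controller logs for the error code. The sketch below is a small filter that reads log lines on stdin; the function name `has_agic_forbidden` is hypothetical, and the error code string is the one from the log in this issue.

```shell
# Hypothetical helper: reads AGIC log lines on stdin and succeeds
# (exit status 0) if the 403 error code from this issue appears.
has_agic_forbidden() {
  grep -q 'ErrorApplicationGatewayForbidden'
}
```

Assuming the add-on pod carries the `app=ingress-appgw` label in `kube-system`, as in the pod description in this report, it could be used as `kubectl logs -n kube-system -l app=ingress-appgw | has_agic_forbidden && echo "AGIC identity lacks access"`.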
Ingress Controller details
- Output of `kubectl describe pod <ingress controller>`. The pod name can be obtained by running `helm list`.
Name: ingress-appgw-deployment-7d57c6dcd5-jv8mk
Namespace: kube-system
Priority: 0
Node: aks-nodepool1-62859232-vmss000001/10.240.0.35
Start Time: Wed, 09 Sep 2020 23:19:57 +0200
Labels: app=ingress-appgw
kubernetes.azure.com/managedby=aks
pod-template-hash=7d57c6dcd5
Annotations: checksum/config: 9cdc8550b8f315e4f99b0da58b9fd961977977f20253fcb0091dbb7c352f634d
kubernetes.azure.com/metrics-scrape: true
prometheus.io/path: /metrics
prometheus.io/port: 8123
prometheus.io/scrape: true
resource-id:
/subscriptions/e5c8b5e5-9c21-4fc3-81bc-4159bc41cd9b/resourceGroups/rg1-ozozturk/providers/Microsoft.ContainerService/managedClusters/aks1-...
Status: Running
IP: 10.240.0.43
IPs:
IP: 10.240.0.43
Controlled By: ReplicaSet/ingress-appgw-deployment-7d57c6dcd5
Containers:
ingress-appgw-container:
Container ID: docker://45824fb72a5112f9f917edf9c1572aeba6170c8064517393ab5f2d4872d745c1
Image: mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.2.0-rc3
Image ID: docker-pullable://mcr.microsoft.com/azure-application-gateway/kubernetes-ingress@sha256:9331411a7f29cfd49b6d93e045ea6dea44cab07a2b4ce7d6a6448b3eb23d5200
Port: <none>
Host Port: <none>
State: Running
Started: Wed, 09 Sep 2020 23:20:09 +0200
Ready: True
Restart Count: 0
Limits:
cpu: 700m
memory: 100Mi
Requests:
cpu: 100m
memory: 20Mi
Liveness: http-get http://:8123/health/alive delay=15s timeout=1s period=20s #success=1 #failure=3
Readiness: http-get http://:8123/health/ready delay=5s timeout=1s period=10s #success=1 #failure=3
Environment Variables from:
ingress-appgw-cm ConfigMap Optional: false
Environment:
AZURE_CLOUD_PROVIDER_LOCATION: /etc/kubernetes/azure.json
AGIC_POD_NAME: ingress-appgw-deployment-7d57c6dcd5-jv8mk (v1:metadata.name)
AGIC_POD_NAMESPACE: kube-system (v1:metadata.namespace)
KUBERNETES_PORT_443_TCP_ADDR: aks1-ozozt-rg1-ozozturk-e5c8b5-aa3d2728.hcp.westeurope.azmk8s.io
KUBERNETES_PORT: tcp://aks1-ozozt-rg1-ozozturk-e5c8b5-aa3d2728.hcp.westeurope.azmk8s.io:443
KUBERNETES_PORT_443_TCP: tcp://aks1-ozozt-rg1-ozozturk-e5c8b5-aa3d2728.hcp.westeurope.azmk8s.io:443
KUBERNETES_SERVICE_HOST: aks1-ozozt-rg1-ozozturk-e5c8b5-aa3d2728.hcp.westeurope.azmk8s.io
Mounts:
/etc/kubernetes/azure.json from cloud-provider-config (ro)
/var/run/secrets/kubernetes.io/serviceaccount from ingress-appgw-sa-token-lgxfw (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
cloud-provider-config:
Type: HostPath (bare host directory volume)
Path: /etc/kubernetes/azure.json
HostPathType: File
ingress-appgw-sa-token-lgxfw:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-appgw-sa-token-lgxfw
Optional: false
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 24m default-scheduler Successfully assigned kube-system/ingress-appgw-deployment-7d57c6dcd5-jv8mk to aks-nodepool1-62859232-vmss000001
Normal Pulling 24m kubelet, aks-nodepool1-62859232-vmss000001 Pulling image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.2.0-rc3"
Normal Pulled 24m kubelet, aks-nodepool1-62859232-vmss000001 Successfully pulled image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.2.0-rc3"
Normal Created 24m kubelet, aks-nodepool1-62859232-vmss000001 Created container ingress-appgw-container
Normal Started 24m kubelet, aks-nodepool1-62859232-vmss000001 Started container ingress-appgw-container
- Output of `kubectl logs <ingress controller>`.
E0909 21:30:44.450072 1 client.go:170] Code="ErrorApplicationGatewayForbidden" Message="Unexpected status code '403' while performing a GET on Application Gateway. You can use 'az role assignment create --role Reader --scope /subscriptions/e5c8b5e5-9c21-4fc3-81bc-4159bc41cd9b/resourceGroups/rg1-ozozturk --assignee c4a8000d-ffbd-4d68-b0d5-f2f448e65bc5; az role assignment create --role Contributor --scope /subscriptions/xxxxxxxxxxxx/resourceGroups/rg1-ozozturk/providers/Microsoft.Network/applicationGateways/appgw1-ozozturk --assignee xxxxxxxxxxxx' to assign permissions. AGIC Identity needs atleast has 'Contributor' access to Application Gateway 'appgw1-ozozturk' and 'Reader' access to Application Gateway's Resource Group 'rg1-ozozturk'." InnerError="network.ApplicationGatewaysClient#Get: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client 'xxxxxxxxxxxx' with object id 'xxxxxxxxxxxx' does not have authorization to perform action 'Microsoft.Network/applicationGateways/read' over scope '/subscriptions/xxxxxxxxxxxx/resourceGroups/rg1-ozozturk/providers/Microsoft.Network/applicationGateways/appgw1-ozozturk' or the scope is invalid. If access was recently granted, please refresh your credentials.""
I0909 21:30:44.450091 1 retry.go:33] Retrying in 10s
- Any Azure support tickets associated with this issue.
If you let `az aks create` create the App GW, it works, because the gateway is then deployed to the node resource group, where AKS has a service principal on the RG. If the GW is in another RG, you need to manually add the AKS SP to that RG so it can manage the GW. Also, you cannot set a custom name for the node RG using `az aks create`, as this also breaks the integration.
Hope this gets sorted also.
@ozgurozturknet @mortenlerudjordet thanks for reporting. We will investigate this shortly.
Just a note, AGIC addon is in public preview.
I have encountered the same problem while testing this addon. Looking forward to the fix.
Hi @akshaysngupta any update on this issue?
Any updates on this issue?
I would like to know if there is a fix for this as well.
If that can help shed some light on the issue, I had the same ErrorApplicationGatewayForbidden error after creating an AKS cluster with Bicep scripts.
In my case the problem was that I had assigned the Reader and Contributor roles to the AKS cluster identity, whereas the roles should be assigned to the AGIC (Application Gateway Ingress Controller) managed identity. That identity is a different one, created automatically within the managed resource group where the VMs sit (something like MC_your-aks-cluster-name).
So if you run into this, make sure you assign the roles to the right identity. If you use Bicep templates, this is effectively:
resource aksCluster 'Microsoft.ContainerService/managedClusters@2021-07-01' = {
  name: 'my-cluster'
  //... rest of your cluster definition, including ingressApplicationGateway addon
}

var agicPrincipalId = aksCluster.properties.addonProfiles.ingressApplicationGateway.identity.objectId
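Outside of Bicep, the same object ID can probably be looked up with the Azure CLI. This is a sketch under the assumption that `az aks show` exposes the add-on identity under the same property path as the Bicep expression above; the function name `agic_object_id` and the `AZ` override variable are hypothetical and exist only so the command can be dry-run.

```shell
# Hypothetical helper: prints the object ID of the AGIC add-on identity.
# Arguments: resource group of the cluster, cluster name.
# AZ may be set to another command (e.g. echo) for a dry run;
# it defaults to the real Azure CLI.
agic_object_id() {
  ${AZ:-az} aks show --resource-group "$1" --name "$2" \
    --query addonProfiles.ingressApplicationGateway.identity.objectId \
    --output tsv
}
```

Running `AZ=echo agic_object_id <resource-group> <cluster-name>` shows the command without contacting Azure. The real call returns the value to pass as `--assignee` when assigning the Reader and Contributor roles.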
@akshaysngupta Any updates or progress on this?
In the same vein as @Marchelune I was able to get this working in my terraform code with the following role assignments:
# Reader on the resource group that contains the Application Gateway
resource "azurerm_role_assignment" "gateway-reader-role" {
  principal_id         = azurerm_kubernetes_cluster.multi_tenant_cluster.ingress_application_gateway[0].ingress_application_gateway_identity[0].object_id
  role_definition_name = "Reader"
  scope = format(
    "/subscriptions/%s/resourcegroups/%s",
    data.azurerm_client_config.current.subscription_id,
    azurerm_resource_group.ecosystem_group.name
  )
  skip_service_principal_aad_check = true
}

# Contributor on the Application Gateway itself
resource "azurerm_role_assignment" "gateway-contributor-role" {
  principal_id         = azurerm_kubernetes_cluster.multi_tenant_cluster.ingress_application_gateway[0].ingress_application_gateway_identity[0].object_id
  role_definition_name = "Contributor"
  scope = format(
    "/subscriptions/%s/resourcegroups/%s/providers/Microsoft.Network/applicationGateways/%s",
    data.azurerm_client_config.current.subscription_id,
    azurerm_resource_group.ecosystem_group.name,
    azurerm_application_gateway.aks_gateway.name
  )
}
The above didn't work for me :(
New error is: │ The given key does not identify an element in this collection value: the collection has no elements.
Any updates on this error?
I ended up changing my approach and using the service principal with 1.7.0 as someone has suggested in another thread and that seemed to get me past this problem.
Mine worked when I gave the Contributor role to all the managed identities in the resource group that is usually created when you create a cluster.