
Problems when referring to ports by name after upgrade to k8s 1.22.2 and kubernetes-ingress:1.5.0-rc1

Open bweitjensES opened this issue 3 years ago • 2 comments

Describe the bug

We just upgraded our k8s cluster and are now on k8s version 1.22.2, which automatically upgraded the ingress controller to version 1.5.0-rc1. After our next deploy, many routes were no longer reachable. After a little research we found this was caused by errors in the ingress configuration, which left endpoints without rules in the application gateway.

After some further research we found this was likely caused by the use of named ports in the configuration. Once we replaced these with port numbers, everything worked again as expected.

Now I wonder whether our configuration was incorrect and only worked by accident, or whether the use of named ports is supposed to work and is now broken.
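
For clarity, the change boils down to how the ingress backend port is referenced; a minimal before/after sketch (the service and port names here are illustrative, not our actual manifests):

# broken for us after the upgrade: backend port referenced by name
backend:
  service:
    name: my-service
    port:
      name: http
# working: backend port referenced by number
backend:
  service:
    name: my-service
    port:
      number: 8080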

In our k8s templates we have

  • a deployment with a port + name
  • a service with a targetPort referencing the deployment port name, a port number, and a name
  • ingress with a path using the name of the service, and the named port
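
Condensed, the naming chain between these three pieces looks roughly like this (identifiers are illustrative):

# deployment: the container port is given a name
ports:
  - containerPort: 8080
    name: http
---
# service: targetPort refers to the container port by its name
ports:
  - name: web
    port: 80
    targetPort: http
---
# ingress: the backend refers to the service port by its name
backend:
  service:
    name: my-service
    port:
      name: web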

I tried to reproduce the problem with a dummy project, and it seems the problem occurs when a service exposes a port other than 80 and the ingress refers to that port by name.

To Reproduce

Steps to reproduce the behavior:

Have a k8s cluster with AGIC enabled. We have an SSL certificate linked to the application gateway; not sure if this is relevant.

Afterwards, apply the following k8s manifest. It creates a namespace, 2 nginx deployments + services (using port 80, one referring to the port by name, the other by number), 2 mailhog deployments + services (using port 1025, one referring to the port by name, the other by number), and the ingress configuration. Note: host and appgw-ssl-certificate still need to be filled in.

apiVersion: v1
kind: Namespace
metadata:
  name: test
  labels:
    name: test
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: test
  name: nginx1
  labels:
    app: nginx1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx1
  template:
    metadata:
      labels:
        app: nginx1
    spec:
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
        - name: nginx
          image: "nginx:latest"
          imagePullPolicy: "Always"
          ports:
            - containerPort: 80
              name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: test
  name: nginx2
  labels:
    app: nginx2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx2
  template:
    metadata:
      labels:
        app: nginx2
    spec:
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
        - name: nginx
          image: "nginx:latest"
          imagePullPolicy: "Always"
          ports:
            - containerPort: 80
              name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: test
  name: mailhog1
  labels:
    app: mailhog1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mailhog1
  template:
    metadata:
      labels:
        app: mailhog1
    spec:
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
        - name: mailhog1
          image: mailhog/mailhog:v1.0.1
          ports:
            - containerPort: 1025
              name: mailhogui
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: test
  name: mailhog2
  labels:
    app: mailhog2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mailhog2
  template:
    metadata:
      labels:
        app: mailhog2
    spec:
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
        - name: mailhog2
          image: mailhog/mailhog:v1.0.1
          ports:
            - containerPort: 1025
              name: mailhogui
---
apiVersion: v1
kind: Service
metadata:
  namespace: test
  name: test-nginx1
spec:
  selector:
    app: nginx1
  ports:
    - name: httpport1
      port: 80
      targetPort: http
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  namespace: test
  name: test-nginx2
spec:
  selector:
    app: nginx2
  ports:
    - name: httpport2
      port: 80
      targetPort: http
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  namespace: test
  name: test-mailhog1
spec:
  selector:
    app: mailhog1
  ports:
    - name: mailhogport1
      port: 1025
      targetPort: mailhogui
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  namespace: test
  name: test-mailhog2
spec:
  selector:
    app: mailhog2
  ports:
    - name: mailhogport2
      port: 1025
      targetPort: mailhogui
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: test
  name: test-ingress
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway
    appgw.ingress.kubernetes.io/backend-path-prefix: /
    appgw.ingress.kubernetes.io/appgw-ssl-certificate: "..."
    appgw.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  rules:
    - host: "..."
      http:
        paths:
          - path: /agicbugrepro/test1
            pathType: Prefix
            backend:
              service:
                name: test-nginx1
                port:
                  name: httpport1
          - path: /agicbugrepro/test2
            pathType: Prefix
            backend:
              service:
                name: test-nginx2
                port:
                  number: 80
          - path: /agicbugrepro/mailhog1
            pathType: Prefix
            backend:
              service:
                name: test-mailhog1
                port:
                  name: mailhogport1
          - path: /agicbugrepro/mailhog2
            pathType: Prefix
            backend:
              service:
                name: test-mailhog2
                port:
                  number: 1025

Now the easiest way to see the problem: the application gateway has health probes for all 4 routes, but the backend health only lists 3 of them (of which 1 is unhealthy, but that shouldn't be relevant and I didn't look into it further): the two nginx backends using port 80, and the mailhog backend that refers to its port by number instead of by name.
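
For reference, this is roughly how I applied the manifest and compared the configured probes against the backend health, assuming the manifest above is saved as repro.yaml (resource group and gateway names are placeholders):

kubectl apply -f repro.yaml
az network application-gateway probe list --resource-group <rg> --gateway-name <appgw> -o table
az network application-gateway show-backend-health --resource-group <rg> --name <appgw>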

Ingress Controller details

  • Output of kubectl describe pod <ingress controller>:
Name:         ingress-appgw-deployment-748bf777fd-fhvpp
Namespace:    kube-system
Priority:     0
Node:         aks-userpool-23015071-vmss00002m/10.240.0.70
Start Time:   Mon, 01 Nov 2021 13:21:49 +0100
Labels:       app=ingress-appgw
              kubernetes.azure.com/managedby=aks
              pod-template-hash=748bf777fd
Annotations:  checksum/config: 77555f5764c44be7f3135437f1e1d790f2bc15c405552b92c220a76e2d517dc2
              cluster-autoscaler.kubernetes.io/safe-to-evict: true
              kubernetes.azure.com/metrics-scrape: true
              prometheus.io/path: /metrics
              prometheus.io/port: 8123
              prometheus.io/scrape: true
              resource-id: ...
Status:       Running
IP:           10.240.0.99
IPs:
  IP:           10.240.0.99
Controlled By:  ReplicaSet/ingress-appgw-deployment-748bf777fd
Containers:
  ingress-appgw-container:
    Container ID:   containerd://9e50a59c466c39cc2506a6a43a324203eaa587501aa003e8664b3c20823c2165
    Image:          mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.5.0-rc1
    Image ID:       sha256:31f24efe7ec67c0ae8bb0eba447f3c4a9449f3302b54ce6ed659c10d3d0e5f1b
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 01 Nov 2021 13:21:50 +0100
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     700m
      memory:  100Mi
    Requests:
      cpu:      100m
      memory:   20Mi
    Liveness:   http-get http://:8123/health/alive delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:  http-get http://:8123/health/ready delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      ingress-appgw-cm  ConfigMap  Optional: false
    Environment:
      AZURE_CLOUD_PROVIDER_LOCATION:  /etc/kubernetes/azure.json
      AGIC_POD_NAME:                  ingress-appgw-deployment-748bf777fd-fhvpp (v1:metadata.name)
      AGIC_POD_NAMESPACE:             kube-system (v1:metadata.namespace)
      KUBERNETES_PORT_443_TCP_ADDR:   developmentesaksdns-1-f2c90f48.hcp.westeurope.azmk8s.io
      KUBERNETES_PORT:                tcp://developmentesaksdns-1-f2c90f48.hcp.westeurope.azmk8s.io:443
      KUBERNETES_PORT_443_TCP:        tcp://developmentesaksdns-1-f2c90f48.hcp.westeurope.azmk8s.io:443
      KUBERNETES_SERVICE_HOST:        developmentesaksdns-1-f2c90f48.hcp.westeurope.azmk8s.io
    Mounts:
      /etc/kubernetes/azure.json from cloud-provider-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-76znn (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  cloud-provider-config:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/azure.json
    HostPathType:  File
  kube-api-access-76znn:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute for 300s
                             node.kubernetes.io/unreachable:NoExecute for 300s
Events:                      <none>
  • Output of kubectl logs <ingress controller>:

There is a lot, but the only relevant warnings/errors I could find were these:

Event(v1.ObjectReference{Kind:"Ingress", Namespace:"testns", Name:"test-ingress", UID:"...", APIVersion:"networking.k8s.io/v1", ResourceVersion:"103137761", FieldPath:""}): type: 'Warning' reason: 'BackendPortTargetMatch' Backend target port 80 does not have matching endpoint port

  • Any Azure support tickets associated with this issue. TicketNr: 2111100050002419
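
For anyone debugging the same warning, the endpoint ports behind the affected services can be compared like this (namespace and service names taken from the repro above):

kubectl get endpoints -n test test-mailhog1 -o yaml
kubectl get endpoints -n test test-mailhog2 -o yaml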

bweitjensES commented on Nov 10 '21 16:11

Did you get any resolution on your ticket @bweitjensES? We just upgraded our cluster and have the same issue.

shurth commented on Feb 03 '22 00:02

I am having a similar issue as well!

jmp601 commented on May 04 '22 03:05