application-gateway-kubernetes-ingress
AGIC with Let's Encrypt sporadically serves an old SSL certificate
Please don't spend much time debugging this, but I would like to know: is this a known issue?
Describe the bug
Sometimes an end user is served an old SSL certificate (a much older one).

To Reproduce
Steps to reproduce the behavior:
cert-manager v0.15.1 with the following Ingress:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-app-gateway
  namespace: prod
  annotations:
    kubernetes.io/ingress.class: azure/application-gateway
    appgw.ingress.kubernetes.io/ssl-redirect: "true"
    appgw.ingress.kubernetes.io/request-timeout: "300"
    appgw.ingress.kubernetes.io/connection-draining: "true"
    appgw.ingress.kubernetes.io/connection-draining-timeout: "30"
    cert-manager.io/cluster-issuer: issuer-letsencrypt-prod
    cert-manager.io/acme-challenge-type: http01
spec:
  tls:
    - hosts:
        - domain.me
      secretName: secret-ssl-domain-me
    - hosts:
        - customer1.domain.me
      secretName: secret-ssl-customer1-domain-me
    - hosts:
        - customer2.domain.me
      secretName: secret-ssl-customer2-domain-me
    - hosts:
        - customer3.domain.me
      secretName: secret-ssl-customer3-domain-me
    - hosts:
        - customer4.domain.me
      secretName: secret-ssl-customer4-domain-me
  rules:
    - host: domain.me
      http:
        paths:
          - backend:
              serviceName: service-php-prod
              servicePort: 80
    - host: customer1.domain.me
      http:
        paths:
          - backend:
              serviceName: service-php-prod
              servicePort: 80
    - host: customer2.domain.me
      http:
        paths:
          - backend:
              serviceName: service-php-prod
              servicePort: 80
    - host: customer3.domain.me
      http:
        paths:
          - backend:
              serviceName: service-php-prod
              servicePort: 80
    - host: customer4.domain.me
      http:
        paths:
          - backend:
              serviceName: service-php-prod
              servicePort: 80
Ingress Controller details
Name:           ingress-azure-64b95964d8-ndnvf
Namespace:      default
Priority:       0
Node:           aks-nodepool1-30476719-vmss000001/192.168.1.5
Start Time:     Wed, 23 Sep 2020 10:29:54 +0200
Labels:         aadpodidbinding=ingress-azure
                app=ingress-azure
                pod-template-hash=64b95964d8
                release=ingress-azure
Annotations:    checksum/config: d13d8bd8adaf32da021553a8cb42d3f750cd00fba8c1eb09012aed162268257d
                prometheus.io/port: 8123
                prometheus.io/scrape: true
Status:         Running
IP:             10.244.1.5
IPs:
  IP:           10.244.1.5
Controlled By:  ReplicaSet/ingress-azure-64b95964d8
Containers:
  ingress-azure:
    Container ID:   docker://116e54b531921f64384a0380541934b23b3b4ac75c1cb3e69b50cdc5b62ff7cb
    Image:          mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.2.0
    Image ID:       docker-pullable://mcr.microsoft.com/azure-application-gateway/kubernetes-ingress@sha256:de458f962eab0cd2de19d23dfeb9a0e4bc2565a38f8c45cc98a74f3cda8b940c
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 23 Sep 2020 10:30:35 +0200
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:8123/health/alive delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:      http-get http://:8123/health/ready delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      ingress-azure  ConfigMap  Optional: false
    Environment:
      AZURE_CLOUD_PROVIDER_LOCATION:  /etc/appgw/azure.json
      AGIC_POD_NAME:                  ingress-azure-64b95964d8-ndnvf (v1:metadata.name)
      AGIC_POD_NAMESPACE:             default (v1:metadata.namespace)
    Mounts:
      /etc/appgw/azure.json from azure (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from ingress-azure-token-xfkxt (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  azure:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/azure.json
    HostPathType:  File
  ingress-azure-token-xfkxt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ingress-azure-token-xfkxt
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>
@joelharkes are you still seeing this problem? Do you see AGIC updating the Application Gateway with the old certificate when the old certificate is served?
Thanks for the update.
I haven't checked the Application Gateway yet; I will next time.
I think it happens around a certificate update. Maybe it reverts to the previous certificate for a few seconds?
Normally it lasts only a few seconds; when our users refresh, it's gone.
We haven't had a new report in the last 2 weeks, but before that we heard about it quite a few times and experienced it ourselves. (Our app currently has only 10,000 infrequent users.)
I think it could also happen after I update the Ingress file to add a new customer.
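Since the old certificate reportedly shows up only for a few seconds, one way to catch it is to poll the endpoint and log whenever the served certificate changes. The following is a rough sketch, not something from AGIC itself: the function names `served_cert_summary` and `watch_served_cert`, the port 443, and the polling interval are all assumptions for illustration.

```shell
# Hypothetical helper: print serial and expiry of the certificate a host serves.
# $1 = hostname sent as SNI, $2 = optional address to connect to (defaults to $1)
served_cert_summary() {
  host="$1"
  addr="${2:-$1}"
  openssl s_client -connect "$addr:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -serial -enddate
}

# Hypothetical helper: poll the host and log a timestamped line whenever the
# result differs from the previous poll.
# $1 = hostname, $2 = seconds between polls (default 5),
# $3 = number of polls (default 0 = run until interrupted)
watch_served_cert() {
  host="$1"
  interval="${2:-5}"
  count="${3:-0}"
  last=""
  i=0
  while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
    now=$(served_cert_summary "$host" | tr '\n' ' ')
    if [ "$now" != "$last" ]; then
      echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $now"
      last="$now"
    fi
    i=$((i + 1))
    sleep "$interval"
  done
}
```

For example, `watch_served_cert customer1.domain.me 5` could be left running while an Ingress file is updated; a line with a different serial would mark the moment the served certificate changed.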
@akshaysngupta We have just had this issue again today, multiple times.
It seems to happen when updating the Ingress YAML files. For context, we have 3 different Ingress YAML files in 3 different namespaces, identical to the one above except for the subdomains. (Yes, each domain is unique; I double-checked this.)
How can I check the certificate in the Application Gateway? I can see it is set up, but I only get a name, e.g. test-secret-sss-customer-domain-me, and nothing more.
Use the following commands to view the certificate as text using openssl.
resourceGroup=""
gatewayName=""
sslCertName=""
publiccert=$(az network application-gateway ssl-cert show -g "$resourceGroup" --gateway-name "$gatewayName" --name "$sslCertName" --query publicCertData -o tsv)
echo -e "-----BEGIN CERTIFICATE-----\n$publiccert\n-----END CERTIFICATE-----" | openssl pkcs7 -print_certs | openssl x509 -noout -text
Can you also check the k8s secret when this happens?
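A minimal sketch of how the secret could be inspected, assuming cert-manager stores the certificate under the standard `tls.crt` key of a `kubernetes.io/tls` secret. The helper names `cert_summary` and `secret_cert_summary` are made up for illustration; the namespace and secret name in the usage line are taken from the Ingress above.

```shell
# Hypothetical helper: read a PEM certificate on stdin and print the fields
# that distinguish one issuance from another.
cert_summary() {
  openssl x509 -noout -serial -subject -dates
}

# Hypothetical helper: decode tls.crt from a secret and summarize it.
# $1 = namespace, $2 = secret name
# openssl x509 reads only the first certificate in the bundle, which is
# expected to be the leaf certificate.
secret_cert_summary() {
  kubectl get secret "$2" -n "$1" -o jsonpath='{.data.tls\.crt}' \
    | base64 --decode \
    | cert_summary
}
```

Running `secret_cert_summary prod secret-ssl-domain-me` and comparing the serial with the one from the Application Gateway command above should show whether the gateway holds the same certificate as the secret.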
Crazy enough, it's a very old certificate (I think it's the first certificate ever requested). It keeps coming back, either when we change the configuration or when a renewal is due.

I'm facing the same issue. Is there any workaround for this?
We don't seem to have this problem anymore; somehow it was fixed.
We did have some wrong IPv6 DNS records, but I'm no longer sure whether they were a factor here.