application-gateway-kubernetes-ingress
application-gateway-kubernetes-ingress copied to clipboard
httpGet-inferred health probes inspect irrelevant pods (completed/terminated)
Describe the bug Health probe inference looks at the "first pod", namely whatever pod is first in the namespace & service-selector filtered pod cache list: https://github.com/Azure/application-gateway-kubernetes-ingress/blob/80971d9a2654219c7ad40e506a7825289a607f97/pkg/k8scontext/context.go#L471
Unfortunately, this pod can be an irrelevant leftover from past deployments, typically a failed pod that was not garbage-collected (https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-garbage-collection
To Reproduce
- Manually fail some pods matching a service, and keep them around (don't delete the failed/completed pods)
- Change some probe-relevant property in the ingress of one of the relevant, running (non-failed) pods
- Depending on the order of the cached pod list mentioned above, health probes will sometimes take the property from a failed pod
Proposed Solution
- Filter out pods that are
Completed/Terminated: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle - Prioritize pods that are
Running, thenPending, thenUnknown - Then Prioritize pods whose containers are all
Running - Then prioritize pods by creation timestamp (newest first)
So basically something like:
podList.Where(pod => pod.State not in ["Completed", "Terminated"]).SortBy(pod => pod.State).SortBy(pod => pod.AllContainersRunning).SortBy(pod => pod.StartTime)[0]
Ingress Controller details This is behavior present in the current code: https://github.com/Azure/application-gateway-kubernetes-ingress/blob/80971d9a2654219c7ad40e506a7825289a607f97/pkg/appgw/health_probes.go#L253
Seems to be the same issue.
You update the health probe of a deployment and you have some evicted pods from the old deployment. The ingress controller will update the ingress with the old health probes. This took some hours of me before examining the AGIC code to see where it takes the probe information from.
Our issue has been solved by removing the evicted or completed pods. But this should be fixed in the code.