amazon-vpc-cni-k8s
Recommended way to alert on inability to assign IPs to Pods
What happened:
Is there a recommended way to alert, using Kubernetes, CNI Helper, KSM, Node Exporter, or cAdvisor metrics, when the CNI is unable to allocate IPs, runs out of IPs, etc.?
We have had multiple incidents in which the CNI either ran out of IPs and could not allocate more from the subnet, or could not allocate IPs for unknown reasons. We have had trouble identifying ways to alert on this, because the only place I've seen explicit messages about what is wrong is the CNI logs.
We need to be able to alert on this because, due to a multitude of issues and required migrations, we cannot use the recommended mitigations of IP prefix assignment or IPv6.
Environment:
- Kubernetes version (use kubectl version): 1.25
- CNI Version: v1.13.3
- OS (e.g: cat /etc/os-release): EKS AMI v20230825
- Kernel (e.g. uname -a): 5.10.186-179.751.amzn2