amazon-vpc-cni-k8s

Recommended way to alert on inability to assign IPs to Pods

Open sidewinder12s opened this issue 2 years ago • 7 comments

What happened:

Is there a recommended way to alert, using Kubernetes, CNI Helper, KSM, Node Exporter, or cAdvisor metrics, when the CNI is unable to allocate IPs, runs out of IPs, etc.?

We have had multiple incidents in which the CNI either ran out of IPs and could not allocate more from the subnet, or was unable to allocate IPs for unknown reasons. We have had trouble identifying ways to alert on this, because the only place I've seen explicit messages about what is wrong is the CNI's logs.

We need to be able to alert on this. Because of a multitude of issues and required migrations, we cannot use the recommended mitigations of IP Prefix Assignments or IPv6.

Environment:

  • Kubernetes version (use kubectl version): 1.25
  • CNI Version: v1.13.3
  • OS (e.g: cat /etc/os-release): EKS AMI v20230825
  • Kernel (e.g. uname -a): 5.10.186-179.751.amzn2

sidewinder12s • Nov 01 '23 18:11