
Pod stuck in ContainerCreating after upgrading cluster to 1.29

Open zendesk-yumingdeng opened this issue 1 year ago • 10 comments

What happened:

After upgrading our in-house clusters to 1.29, we are experiencing something similar to https://github.com/aws/amazon-vpc-cni-k8s/issues/2970. When a new node is brought up (note that this does not happen on every node), some pods scheduled to that node get stuck in the ContainerCreating status with the event message below:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4e174229a28e7e3df61ece1a4320cc6581304664ea39186ab52281a283113a3a": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container
  • No error messages appear in the aws-cni pod logs on the node
  • /var/log/aws-routed-eni/plugin.log contains few details
  • The following errors and warnings appear in /var/log/aws-routed-eni/ipamd.log:
{"level":"error","ts":"2024-07-05T04:28:59.406Z","caller":"eventrecorder/eventrecorder.go:67","msg":"Cached client failed GET pod (aws-cni-9w9vm)"}
{"level":"error","ts":"2024-07-05T04:28:59.406Z","caller":"aws-k8s-agent/main.go:63","msg":"Failed to find host aws-node pod: Pod \"aws-cni-9w9vm\" not found"}
{"level":"error","ts":"2024-07-05T04:31:02.334Z","caller":"datastore/data_store.go:652","msg":"DataStore has no available IP/Prefix addresses"}
{"level":"warn","ts":"2024-07-05T04:31:02.352Z","caller":"ipamd/rpc_handler.go:230","msg":"UnassignPodIPAddress: Failed to find sandbox _migrated-from-cri/16b5c95f6cb3ab32266a00048d184aff67a36d5ab730a4b3af3296b92ddff514/unknown"}
{"level":"warn","ts":"2024-07-05T04:34:37.660Z","caller":"ipamd/rpc_handler.go:230","msg":"UnassignPodIPAddress: Failed to find sandbox _migrated-from-cri/12808abe62be1050ba4da91b52e7339d4926e93ee9dc02989dc8653415610a5d/unknown"}

Environment:

  • Kubernetes version (use kubectl version): v1.29.6
  • CNI Version: v1.14.1
  • OS (e.g: cat /etc/os-release): Ubuntu 22.04.4 LTS
  • Kernel (e.g. uname -a): Linux 6.5.0-1022-aws #22~22.04.1-Ubuntu SMP Fri Jun 14 19:23:09 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

zendesk-yumingdeng, Jul 05 '24 07:07