ENI network broken
What happened: Pods that use a secondary ENI IP cannot access internal Services (e.g. kubernetes).
What you expected to happen: Pods can connect to internal Kubernetes Services.
How to reproduce it (as minimally and precisely as possible):
- Use the launch template to run the node; please check my user data below:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="//"
--//
Content-Type: text/x-shellscript; charset="us-ascii"
#!/bin/bash
set -ex
cat > /etc/kubernetes/nodeadm-bootstrap.yaml <<EOF
---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: LAB-1-30
    apiServerEndpoint: https://1234567890.yl4.cn-north-1.eks.amazonaws.com.cn
    certificateAuthority: xxxx
    cidr: 172.20.0.0/16
EOF
nodeadm init --config-source file:///etc/kubernetes/nodeadm-bootstrap.yaml
sleep 300 # In my prod environment I install some software in this step; replacing it with a sleep command also reproduces this issue.
nodeadm init --config-source file:///etc/kubernetes/nodeadm-bootstrap.yaml
--//--
- Launch the instance
- Deploy the netshoot pod onto the node:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: netshoot-test
spec:
  replicas: 10
  selector:
    matchLabels:
      app: netshoot-test
  template:
    metadata:
      labels:
        app: netshoot-test
    spec:
      nodeName: ip-x-x-x-x.cn-north-1.compute.internal
      containers:
        - name: netshoot
          image: nicolaka/netshoot
          imagePullPolicy: Always
          command: ["/bin/bash", "-ce", "tail -f /dev/null"]
- Find a pod that runs on a secondary ENI and telnet to the kubernetes SVC:
  k exec -it netshoot-test-7bb8ff8fb-h5f4b -- /bin/bash
  netshoot-test-7bb8ff8fb-h5f4b:~# telnet kubernetes 443
  telnet: bad address 'kubernetes'
- Pods running on the primary ENI work fine.
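The "bad address" error suggests cluster DNS resolution is failing from these pods. A quick way to narrow this down from inside an affected pod is the sketch below; note the DNS ClusterIP shown is an assumption based on the 172.20.0.0/16 service CIDR above, so verify it with `kubectl get svc -n kube-system kube-dns` first:

```
# resolve the Service via its fully qualified cluster domain name
nslookup kubernetes.default.svc.cluster.local

# check raw reachability of the cluster DNS Service
# (172.20.0.10 is the conventional kube-dns ClusterIP for a
#  172.20.0.0/16 service CIDR; confirm in your cluster)
nc -zv -w 2 172.20.0.10 53

# bypass DNS entirely: try the API server Service by ClusterIP
nc -zv -w 2 172.20.0.1 443
```

If the name lookup fails but the ClusterIPs are also unreachable, the problem is in the pod's data path (routing/rules for the secondary ENI) rather than in CoreDNS itself.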
Environment:
- AWS Region: cn-north-1
- Instance Type(s): arm / t4g.medium
- Cluster Kubernetes version: 1.33
- Node Kubernetes version: 1.33
- AMI Version: standard-1.33-v20251023
- VPC CNI: v1.20.4-eksbuild.1
- Kube-Proxy: v1.33.3-eksbuild.10
I didn't use custom networking configuration or network policy; this is just the default VPC CNI mode.
hey @BruceLuX , trying to understand the motivation for including this in your user data:
nodeadm init --config-source file:///etc/kubernetes/nodeadm-bootstrap.yaml
sleep 300 # I will install some software in this step in my prod environment, replace it by sleep command could also reproduce this issue also.
nodeadm init --config-source file:///etc/kubernetes/nodeadm-bootstrap.yaml
is there something in particular you're aiming to accomplish with the manual nodeadm execution? for clarity, we typically split nodeadm into two phases: one that runs before user data script execution and one that runs after; these are called config and run respectively. from the snippet you've shared, nothing seems dynamic or requires writing the config to disk, so leaving just the config part, as in this example, would do the trick. you can also use our playground to validate beforehand.
I suspect removing the manual nodeadm execution will resolve this (or, at the very least, adding --skip run to the invocations), but we can look into making this situation a bit more robust.
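As an illustration, user data that relies on nodeadm's built-in config/run phases could look like the sketch below. It assumes the AL2023 convention where nodeadm reads a NodeConfig directly from a user data MIME part with content type application/node.eks.aws, so no manual nodeadm invocation (and no writing the config to disk) is needed; the shell script part runs between the config and run phases:

```
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="//"

--//
Content-Type: application/node.eks.aws

---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: LAB-1-30
    apiServerEndpoint: https://1234567890.yl4.cn-north-1.eks.amazonaws.com.cn
    certificateAuthority: xxxx
    cidr: 172.20.0.0/16

--//
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
# install monitoring software here; nodeadm's run phase starts kubelet
# only after all user data scripts have finished, so the node does not
# join the cluster mid-bootstrap
sleep 300
--//--
```

This keeps kubelet (and therefore the CNI) from starting until cloud-init has finished executing user data.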
Hi @mselim00
Many thanks for your reply. sleep 300 is used to simulate downloading and installing monitoring software before the node joins the cluster. We have actually already mitigated this issue by adjusting the user data, but what confuses me is why running the same nodeadm init command twice, only 300 seconds apart, breaks the ENI networking.
glad you've found a mitigation! I believe what's happening here is that the nodeadm invocation is causing the instance to be joined to a cluster and a CNI to start on it that attaches an ENI before the process that's taking up ~5 minutes completes. this ENI is treated as system-managed (not CNI-managed) because it was attached prior to cloud-init completion (where user data scripts are executed). ordinarily, kubelet would not be started until after user data scripts all execute, so any CNI-attached interface would be done after cloud-init completes, and would be left to be managed by the CNI. we can certainly look into ways to improve this experience, but for now (and in general), I'd recommend to avoid executing nodeadm directly in user data
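The timing described above can be checked directly on an affected node. The sketch below is a hypothetical diagnostic, assuming the AWS CLI and IMDSv2 are available on the node and that cloud-init's /var/lib/cloud/instance/boot-finished marker indicates when user data finished; it compares that timestamp with each ENI's attach time, so an ENI attached earlier than cloud-init completion would be the one treated as system-managed:

```
#!/bin/bash
set -euo pipefail

# fetch instance identity via IMDSv2
TOKEN=$(curl -sX PUT http://169.254.169.254/latest/api/token \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
INSTANCE_ID=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id)
REGION=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/placement/region)

# cloud-init touches this file once all user data scripts have finished
echo "cloud-init finished: $(date -u -d @"$(stat -c %Y /var/lib/cloud/instance/boot-finished)" +%FT%TZ)"

# attach time of every ENI on this instance
aws ec2 describe-network-interfaces --region "$REGION" \
  --filters "Name=attachment.instance-id,Values=$INSTANCE_ID" \
  --query 'NetworkInterfaces[].[NetworkInterfaceId,Attachment.AttachTime]' \
  --output table
```

An ENI whose attach time predates the boot-finished timestamp was brought up mid-bootstrap, which matches the failure mode in this issue.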