amazon-eks-ami
Iptables rules in kubelet.service
Hello,
I was trying to understand the need for having this iptables rule here. https://github.com/awslabs/amazon-eks-ami/blob/master/files/kubelet.service#L8
If this was introduced to stop docker from changing the rule from ACCEPT to DROP, then I think that was indeed fixed in https://github.com/kubernetes/kubernetes/issues/59656 and the line can be removed to avoid further confusion.
I do think we should verify this. The change seems to be
https://github.com/kubernetes/kubernetes/pull/62007/files#diff-97649d13432b01b4e7669d728a2e2e82R1324-R1329
We are noticing that kubelet starts very close to when iptables-restore runs. Because of this, we see nodes with FORWARD DROP that (we think) came from the restore, since the restore may run after kubelet's FORWARD ACCEPT. Both kubelet and iptables-restore depend only on the docker service having started: https://github.com/awslabs/amazon-eks-ami/blob/master/files/kubelet.service#L4 https://github.com/awslabs/amazon-eks-ami/blob/master/files/iptables-restore.service#L5
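To make the suspected ordering problem concrete, here is a minimal Python sketch (not code from the AMI) that models the FORWARD chain policy as a single value overwritten by two startup steps. The function names and step labels are hypothetical. The only assumptions taken from this thread are that kubelet.service sets FORWARD to ACCEPT and that the restore may load a saved DROP policy.

```python
from itertools import permutations

# Hypothetical model: each startup step overwrites the FORWARD chain policy.
# "kubelet" stands for the iptables rule in kubelet.service;
# "restore" stands for iptables-restore loading a saved ruleset.
STEPS = {
    "kubelet": "ACCEPT",
    "restore": "DROP",  # assumes the saved ruleset contained FORWARD DROP
}


def final_forward_policy(order, initial="DROP"):
    """Return the FORWARD policy after applying the steps in the given order."""
    policy = initial
    for step in order:
        policy = STEPS[step]  # last writer wins
    return policy


def outcomes():
    """Map every possible ordering of the steps to the resulting policy."""
    return {order: final_forward_policy(order) for order in permutations(STEPS)}


if __name__ == "__main__":
    for order, policy in outcomes().items():
        print(" -> ".join(order), "=>", policy)
```

Since neither unit is ordered relative to the other, both orderings are possible, and only the one where kubelet runs last leaves FORWARD at ACCEPT.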
Not sure if anyone else is running into this or if this is the cause of the issue.
An update on the issue I raised. It seems that, for us, an iptables save was performed before the node started up, which meant we saved FORWARD DROP. On startup, since kubelet and the iptables restore happen around the same time, we run into a race condition where FORWARD DROP may be applied over FORWARD ACCEPT if the iptables restore runs after kubelet starts.
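One way to check whether a saved ruleset carries the DROP policy described above is to read the chain policy lines from an iptables-save dump. The following is a rough Python sketch, not part of the AMI. The helper name `chain_policies` and the inline sample are hypothetical, and the location of the saved file is not given in this thread, so the example parses a string instead of a file.

```python
import re

# In iptables-save output, chain declarations look like
# ":CHAIN POLICY [packets:bytes]" inside a section that starts with "*table".
POLICY_LINE = re.compile(r"^:(\S+)\s+(\S+)\s+\[\d+:\d+\]$")


def chain_policies(dump, table="filter"):
    """Return {chain: policy} for the given table in an iptables-save dump."""
    policies = {}
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        if line.startswith("*"):
            current = line[1:]  # entering a new table section
        elif current == table:
            match = POLICY_LINE.match(line)
            if match:
                policies[match.group(1)] = match.group(2)
    return policies


# Hypothetical sample, shaped like a dump taken while FORWARD was DROP.
SAMPLE = """\
*filter
:INPUT ACCEPT [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
COMMIT
"""

if __name__ == "__main__":
    print(chain_policies(SAMPLE).get("FORWARD"))
```

If the FORWARD entry in a saved dump is DROP, restoring that dump after kubelet has started would put the policy back to DROP, which matches the behaviour described above.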
I have another question: in https://github.com/awslabs/amazon-eks-ami/blob/v20210830/scripts/install-worker.sh#L99, I'm curious why there is only a copy, without starting the service. How does the service get started?
Sorry, found it in bootstrap.sh.
Is it really a good idea to have a move here https://github.com/awslabs/amazon-eks-ami/blob/v20210830/scripts/install-worker.sh#L99 (as opposed to a copy)?
It cost me a day to figure out why re-running bootstrap.sh was failing to start my cluster :/
I don't know why this was added in #90; I think the reviewers overlooked it. It's not clear to me why this would be necessary, as components like kube-proxy that rely on iptables rules should be managing those themselves.
@nithu0115 was the author, who might be able to provide more info. Otherwise, I'll try removing it when I have some time.