amazon-eks-ami
Change cgroup driver to systemd
Kubernetes documentation indicates that, for stability reasons, one should run Kubernetes with the systemd cgroup driver if the init system itself is systemd.
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#cgroup-drivers
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
https://github.com/awslabs/amazon-eks-ami/pull/587 reverted this change earlier; now that eksctl has been made compatible with this change in https://github.com/weaveworks/eksctl/pull/2962, it should be OK to merge again.
This is another run at https://github.com/awslabs/amazon-eks-ami/issues/490
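For context, on a node the change boils down to pointing both docker and the kubelet at the same driver. A minimal sketch of that effect, not the literal diff in this PR, assuming jq is on the box and the AMI's default config paths:
# Tell dockerd to use the systemd cgroup driver.
jq '."exec-opts" = ["native.cgroupdriver=systemd"]' /etc/docker/daemon.json > /tmp/daemon.json \
  && mv /tmp/daemon.json /etc/docker/daemon.json
# Tell the kubelet the same thing, so the two agree.
jq '.cgroupDriver = "systemd"' /etc/kubernetes/kubelet/kubelet-config.json > /tmp/kubelet-config.json \
  && mv /tmp/kubelet-config.json /etc/kubernetes/kubelet/kubelet-config.json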
Hi, now that eksctl has released a change, we're looking into how to make this release possible. There's still a condition where, if we release an AMI with this change and a user is using an older version of eksctl, this could still lead to failures. We will update this issue with more details once we have a plan to pick up this change.
@abeer91 I do understand the concern about breaking downstream tooling, so I might look into how that could be handled, although I can't promise anything (I'm thinking about maybe using a path unit to keep the kubelet config in sync with daemon.json, but that is very hacky).
How long do you think we should wait for the newer eksctl to be adopted? Does it warrant a changelog note that this might break some downstream tooling (tools that assume only the kubelet config needs to be changed)?
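For the record, a rough sketch of that path-unit idea; the unit and script names here are hypothetical, and this is exactly the hacky part:
# Hypothetical path unit: fires whenever docker's daemon.json changes.
cat << 'EOF' > /etc/systemd/system/sync-cgroup-driver.path
[Path]
PathModified=/etc/docker/daemon.json
[Install]
WantedBy=multi-user.target
EOF
# Matching oneshot service; the sync script (also hypothetical) would copy the
# cgroup driver from daemon.json into the kubelet config and restart the kubelet.
cat << 'EOF' > /etc/systemd/system/sync-cgroup-driver.service
[Service]
Type=oneshot
ExecStart=/usr/local/bin/sync-cgroup-driver.sh
EOF
systemctl daemon-reload && systemctl enable --now sync-cgroup-driver.path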
Hi, upon cluster creation with GPU instances I could not fetch nodes with kubectl get nodes.
I see the below error in the kubelet logs:
failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
eksctl version : 0.51.0
EKS version: 1.20
kubectl version: v1.21.0
AMI: ami-0b450512a684db582
Will the above fix solve this issue? @abeer91 When can we expect this fix?
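For anyone hitting the same error, the mismatch can be confirmed on the node itself. A hedged sketch, assuming the stock AMI paths and that the driver is set via the kubelet config file:
# PID 1 should report "systemd" on these AMIs.
ps -p 1 -o comm=
# The cgroup driver dockerd is actually using (here: cgroupfs).
docker info --format '{{.CgroupDriver}}'
# The cgroup driver the kubelet was configured with (here: systemd); the two must match.
grep -i cgroupdriver /etc/kubernetes/kubelet/kubelet-config.json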
@reegnz you might want to add systemd cgroup support to containerd in this PR now it's been released.
@abeer91 if this is still being blocked due to concerns about eksctl how about a containerd only PR (see https://github.com/aws/containers-roadmap/issues/1210#issuecomment-888176554)?
Note: "systemd cgroup support to containerd" is not correct; we are on runc v2, not v1.
@josephprem ~~I can see that runc cgroup v2 is supported but is it actually enabled in the AMI?~~
I can't find any official documentation; the best I've found is https://github.com/containerd/containerd/issues/4203 (specifically https://github.com/containerd/containerd/issues/4203#issuecomment-651532765).
Yes it is:
cat /etc/containerd/config.toml
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
[grpc]
address = "/run/dockershim.sock"
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
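As an aside, the effective merged configuration can also be dumped straight from the containerd binary, which avoids guessing which file on disk wins:
# Prints the fully-resolved config containerd would run with, defaults included.
containerd config dump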
@josephprem I re-read the config, which is why I struck through my comment about v2 being enabled. I'm still interested if you have any better documentation than the comment I found and linked above.
https://github.com/containerd/cri/blob/master/docs/config.md
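One thing worth spelling out, since it fed the confusion above: io.containerd.runc.v2 names containerd's v2 runtime shim, not cgroup v2. Whether the host itself is on cgroup v2 can be checked directly:
# "cgroup2fs" means the unified cgroup v2 hierarchy is mounted; "tmpfs" means v1.
stat -fc %T /sys/fs/cgroup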
I have added this to my user-data:
conf=/etc/eks/containerd/containerd-config.toml
# The EKS bootstrap script copies $conf to /etc/containerd/config.toml
section="\[plugins.\"io.containerd.grpc.v1.cri\".containerd.runtimes.runc.options\]"
key="SystemdCgroup"
grep -q "$section" "$conf" || sed -i "/^runtime_type.*/a $section" "$conf"
grep -q "$key" "$conf" || sed -i "/$section.*/a $key = true" "$conf"
Followed by enabling --cgroup-driver=systemd in the kubelet (note that the service file is patched):
sed -i 's/KUBELET_EXTRA_ARGS/KUBELET_EXTRA_ARGS $EXTENDED_KUBELET_ARGS/' /etc/eks/containerd/kubelet-containerd.service
# Make sure the drop-in directory exists before writing the override.
mkdir -p /etc/systemd/system/kubelet.service.d
cat << EOF > /etc/systemd/system/kubelet.service.d/9999-extended-kubelet-args.conf
[Service]
Environment='EXTENDED_KUBELET_ARGS=--cgroup-driver=systemd'
EOF
systemctl daemon-reload
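Once the node is up, both pieces can be checked. A hedged sketch, with the config path and unit name as on the stock AMI:
# The runc options table in the rendered containerd config should carry the flag.
grep -A1 'runtimes.runc.options' /etc/containerd/config.toml    # expect: SystemdCgroup = true
# The kubelet drop-in should be listed among the unit's fragments, and the
# environment assignment should be visible on the unit.
systemctl cat kubelet | grep -B2 EXTENDED_KUBELET_ARGS
systemctl show kubelet --property=Environment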
@josephprem I assume that you could modify /etc/kubernetes/kubelet/kubelet-config.json like the changes in this PR instead of setting --cgroup-driver=systemd?
I guess so, but my preference for pushing variables is through systemd drop-ins.
Hey folks 👋 eksctl here. A couple of things to note on our side:
- We have changed the way that eksctl bootstraps: we no longer roll our own, and simply call /etc/eks/bootstrap.sh to delegate to whatever is going on in the AMIs. Therefore eksctl will not need to do anything with cgroup stuff. (See details of this decision here: https://github.com/weaveworks/eksctl/pull/3564.)
- HOWEVER: the above switch comes with a breaking change for those using Unmanaged Nodegroups with Custom AMIs, therefore the legacy codepath is still in place for a little while longer. (See notice here: https://github.com/weaveworks/eksctl/issues/3563.) On the legacy path we have removed the previously added changes which set the cgroup driver. This was because we kept experiencing breakages every time the AMI folks moved something around (this pain was actually one of the reasons why we removed our own bootstrapper). This could mean that if a custom AMI is based on one which sets the docker driver to systemd, the kubelet will fail, because eksctl's legacy bootstrapper will override the kubelet config, setting the driver to cgroupfs.
- BUT, users can get by this by using overrideBootstrapCommand to set their own userdata.
Please note that we are no longer making changes to the legacy path.
To address this point:
> There's still a condition where, if we release an AMI with this change and a user is using an older version of eksctl, this could still lead to failures.
We emphatically do not support older versions of eksctl. If users are running any version of eksctl which is not latest, they are doing it wrong 🙃.
> We emphatically do not support older versions of eksctl. If users are running any version of eksctl which is not latest, they are doing it wrong 🙃.
@Callisto13 thanks for emphasizing that!
@abeer91 so now that being said, what is still blocking the merging of this PR?
@reegnz I think what @Callisto13 was saying is that the latest version of eksctl still supports a legacy flow for compatibility, and that is what's currently blocking this? This is just an observation; personally I don't think a higher-level project should dictate the direction of a dependency, which is why strong APIs/contracts are essential.
@stevehipwell The legacy flow seems to be supported for custom AMIs, not the official EKS AMI; for the official EKS AMI they use /etc/eks/bootstrap.sh, with no legacy code-path there. We're talking about enabling the systemd cgroup driver for the official AMI, not custom AMIs. The custom AMI problem has been handled on their side.
@reegnz I think the concern was that someone using legacy flow with a custom AMI built from this AMI would be broken after this PR was merged?
I want to verify that there are no outstanding concerns/issues, and I'll try to get this merged this week.
I'm going to close this; we've made systemd the cgroup driver for containerd, and we don't plan to touch docker until we remove it in 1.25.