amazon-eks-ami
Use kernel 5.10
Issue #, if available: #857
Description of changes: Upgrade the kernel version to 5.10 for Kubernetes versions ~~1.19~~ 1.22 and above. This is useful for WireGuard transparent encryption, and it also includes performance improvements for Intel Ice Lake and AWS Graviton2 processors.
More details here
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Hi, can I get some input on this PR? Is this feasible? Should we build an additional AMI with kernel-5.10 in the name? Thanks in advance for your response.
I'm on board with this; 5.10 receives the same level of support from AL2 as 5.4. We just need to be confident that the change doesn't impact our users' workloads.
If you've deployed this change to a production environment, we'd love to hear about it.
Hi @cartermckinnon, thanks for your response. I'm pretty confident that it will not break things: we already use this kernel on kops clusters without any issues (though I agree that's not the EKS AMI).
I got feedback from the wider team today: we'll have to make this change in tandem with a Kubernetes release. While 5.10 has largely stabilized, it isn't as battle-tested as 5.4; and we shouldn't ask users to undertake this sort of upgrade in the middle of their k8s version's support cycle. I'll update the title to reflect this.
@rtripat @prasad0896 Do you think this is realistic for 1.22?
I modified the pull request to get the kernel 5.10 for kubernetes version >= 1.22 regarding your last comment. 👍
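The version gate described in this comment can be sketched as follows. This is a minimal illustration in Python, not the PR's actual implementation: the function names are hypothetical, and the 5.4 fallback for older Kubernetes versions is an assumption based on the kernel mentioned elsewhere in this thread.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Parse a 'major.minor[.patch]' Kubernetes version into (major, minor)."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def kernel_series_for(kubernetes_version: str) -> str:
    """Return the kernel series to install for a Kubernetes version (hypothetical helper)."""
    # Compare as integer tuples; comparing strings would wrongly rank "1.9" above "1.22".
    if parse_major_minor(kubernetes_version) >= (1, 22):
        return "5.10"
    # Assumption: older versions stay on the 5.4 kernel discussed in this thread.
    return "5.4"
```

Comparing parsed integers rather than raw strings matters here because Kubernetes minor versions are not zero-padded.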
With Dirty Pipe I assume this upgrade may be more pressing?
> With Dirty Pipe I assume this upgrade may be more pressing?
My .02: probably not exactly.
The sha of the fix for dirty pipe: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/lib/iov_iter.c?id=9d2231c5d74e13b2a0546fee6737ee4446017903
cinlin lists backported patches in: 5.16.11, 5.15.25, 5.10.102, 5.4.181, 4.19.231, 4.14.268, and 4.9.303
At the time of this writing the latest AMI (1.21.5-20220303) is still on 5.4.176-91.338.amzn2. It's as yet unclear to me whether this kernel is modified outside the upstream versioning; it'd be great to get confirmation from this team on whether these amzn2 kernel versions are expected to precisely match upstream.
That is to say, the shortest path to patching dirty pipe seems to be 5.4.181, which is a somewhat distinct issue from upgrading to 5.10, since they're talking about bundling with a 1.22 upgrade, etc.
[edit] I didn't see a different issue for this so I just filed https://github.com/awslabs/amazon-eks-ami/issues/882 to hopefully clarify the situation
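For anyone wanting to check a node against the list of patched releases above, here is a rough Python sketch. It compares only the upstream-style `major.minor.patch` prefix of a kernel release string with the versions quoted in this comment. The function names are hypothetical, and the result is only a hint: as noted above, it is unclear whether amzn2 kernels track upstream exactly, so a vendor kernel may carry the fix under an older version number.

```python
import re
from typing import Optional, Tuple

# First stable release in each series listed above as carrying the fix.
FIRST_PATCHED = {
    (5, 16): 11,
    (5, 15): 25,
    (5, 10): 102,
    (5, 4): 181,
    (4, 19): 231,
    (4, 14): 268,
    (4, 9): 303,
}


def parse_release(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as '5.4.176-91.338.amzn2'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2)), int(match.group(3))


def has_upstream_fix(release: str) -> Optional[bool]:
    """Return True/False by upstream numbering, or None if the series is not in the table."""
    major, minor, patch = parse_release(release)
    first_patched = FIRST_PATCHED.get((major, minor))
    if first_patched is None:
        return None  # Series not listed above; this table cannot answer.
    # Caveat: vendor backports are invisible to this comparison.
    return patch >= first_patched
```

On a running node, the release string could come from `platform.release()` in the standard library; `has_upstream_fix("5.4.176-91.338.amzn2")` evaluates to `False` under this upstream-numbering assumption.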
@cartermckinnon since EKS 1.22 has been released and this didn't land, is there any idea when we can get this kernel upgrade landed?
We have specific workloads that require io_uring functionality that isn't supported in 5.4, and the workarounds we currently rely on to support it are not ideal.
I'm available if you need some modifications to do on my side.
We're in a bind here, because:
- At present, 5.10 isn't/wasn't stable enough to use as the default for 1.22, and we doubt that will change in time for 1.23.
- We can't add a bootstrap flag for customers to opt-in to 5.10, because choosing a kernel at runtime isn't acceptable (it requires a reboot).
- We can't ship an AMI variant for things like this, because it blows out our matrix and would change our support contract by requiring a deprecation path.
My best guess is that 5.10 will be the default kernel for the 1.24 release. In the meantime, if your workload necessitates 5.10, you should use a custom AMI.
For others who land here: my workaround for this issue was to create an EKS node group based on Bottlerocket OS. The officially supported image for EKS versions 1.20 and above uses kernel 5.10. It is a completely different host system architecture, so it might not be suitable for use cases that require deep modifications of the host environment. But if you are just looking to use features of the 5.10 kernel in your workloads, such as WireGuard, then Bottlerocket OS 1.5.3 (aws-k8s-1.20) can help.
Just a concern: upgrading to 1.22 will be painful due to the several API deprecations. It's something our team is planning as a mid-term objective.
However, since the kernel in the current AMI has a high-impact CVE, we need to solve this in the short term. It would therefore be largely beneficial to us (and, I assume, to many users) if this were backported to versions before 1.22 that are still supported, given that upgrading past it is not an effortless procedure.
Keeping it only on 1.23 would mean we could only solve this high-impact security issue after migrating all of our clusters' workloads to account for the 1.22 deprecations and then performing the actual upgrade.
@LCaparelli we're not aware of any active kernel CVEs in our current AMIs; it's possible the one you're referring to has already been addressed. Please follow the instructions here if not.
@cartermckinnon could we get a status update on this for EKS v1.24 (November 2022 I think)?
@cartermckinnon @bwagner5 could we get a status update on this as the first EKS v1.24 AMI has been released and I don't see any mention of the v5.10 kernel? This is going to be a major issue if we can't use tools which require a modern kernel (e.g. eBPF).
You're correct, apologies for not updating the status here. Timelines haven't aligned as planned with 1.24 GA and the 5.10 migration, but we're working on it, and I expect 1.24 AMIs to use 5.10 before the end of the year.
@cartermckinnon does AL 2022 come into this discussion at all? Based on the docs it looks to be using kernel v5.15 by default. Is there a plan to move this to AL 2022 or to add an additional image based on it?
We do plan to use an AL2022 base when possible; my current forecast is 1.25 will ship atop AL2022, assuming the GA dates are reasonably aligned.
1.24 will move to 5.10 in #1118. I'm going to close this PR in favor of that one.