add a new aws-ecs variant for kernel 5.15

Open bcressey opened this issue 3 years ago • 6 comments

What I'd like: I'd like a new variant for ECS that includes the new 5.15 kernel.

The ECS Agent works to preserve compatibility across its releases, so there's no upstream version to synchronize with in this case; it's still on 1.x versions.

With that in mind, possible options for a successor to aws-ecs-1 include:

  • aws-ecs-1.1 - add a ".1" to reflect a new revision
  • aws-ecs-2 - increment "1" to reflect a significant difference in the new variant
  • aws-ecs-2022 - switch to year-based numbering to avoid version confusion with the agent

Any alternatives you've considered: Introducing the 5.15 kernel in a future update to the aws-ecs-1 variant, but this breaks with the compatibility promise we make for in-place updates.

bcressey avatar Jun 01 '22 17:06 bcressey

@mello7tre and @samuelkarp - I'd love to get your input here.

bcressey avatar Jun 01 '22 17:06 bcressey

Hi, I would go for aws-ecs-1.1.

Both aws-ecs-2 and aws-ecs-2022 signal a significant change, which is better reserved for an agent version update or for other big changes related to the ECS world (and in that case I would choose aws-ecs-2, to keep a consistent progression from the current name). A kernel update is a big change, but it is not directly related to ECS.

mello7tre avatar Jun 02 '22 10:06 mello7tre

I don't know that I have enough information to really answer your question, but I'll ask a few questions/try to write down some observations that I think might help you think about it.

How long would you plan to maintain multiple kernels for different variants? How easy or difficult would it be for ECS customers to migrate between variants? (Can they do that with apiclient or must they launch new, replacement AMIs?)

The name of the first variant (aws-ecs-1) was intended to allow expressing the evolution of the variant. I think that if you consider a kernel change to be a major change worthy of a new variant, aws-ecs-2 is a reasonable choice. However, I wonder if there's too much information implicit in the variant; right now it represents:

  • the target platform ("aws")
  • the orchestrator ("ecs")
  • an orchestrator version (in the case of the Kubernetes variants)
  • some sort of notion of a major change (in the case of the ECS variant?)

Maybe another way to think about this is: if you add support for a non-AWS target platform for Bottlerocket & ECS (such as VMware or another cloud provider), would you start the numbering at 2/1.1/2022 in order to say that anything ending in "ecs-2" has the 5.15 kernel?

samuelkarp avatar Jun 07 '22 01:06 samuelkarp

How long would you plan to maintain multiple kernels for different variants? How easy or difficult would it be for ECS customers to migrate between variants? (Can they do that with apiclient or must they launch new, replacement AMIs?)

They would need to launch replacement AMIs because we do not have #1261

Note that our recent crate which gives names to the positions of the variant tuple calls this position version, or we could think of it as Variant::version (we did not name it, for example, orchestrator_version): https://github.com/bottlerocket-os/bottlerocket/blob/6b7f020a5169a32409bdbd0cb4cbe79e66845b05/sources/bottlerocket-variant/src/lib.rs#L147

I'm not sure that helps, but I'm just pointing out that we recently committed to naming the tuple position something.
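To make the "positions of the variant tuple" concrete, here is a minimal sketch of splitting a variant name into named positions. It is not the bottlerocket-variant crate's API: the `VariantParts` struct, the `parse_variant` function, and the `platform`, `runtime`, and `flavor` field names are hypothetical and chosen only for illustration. Only the `version` position name comes from the crate as described above.

```rust
/// Hypothetical illustration of the variant tuple; NOT the API of the
/// bottlerocket-variant crate. Only the `version` name is taken from the thread.
#[derive(Debug, PartialEq)]
struct VariantParts {
    platform: String,        // e.g. "aws" (assumed field name)
    runtime: String,         // e.g. "ecs" (assumed field name)
    version: Option<String>, // e.g. "1" or "1.1"; opaque, not parsed as a number
    flavor: Option<String>,  // e.g. "nvidia" (assumed field name)
}

/// Split a dash-separated variant name into at most four positions.
/// Returns None when the platform or runtime position is missing or empty,
/// or when there are more than four positions.
fn parse_variant(name: &str) -> Option<VariantParts> {
    let mut parts = name.split('-');
    let platform = parts.next().filter(|s| !s.is_empty())?.to_string();
    let runtime = parts.next().filter(|s| !s.is_empty())?.to_string();
    let version = parts.next().map(str::to_string);
    let flavor = parts.next().map(str::to_string);
    if parts.next().is_some() {
        return None;
    }
    Some(VariantParts { platform, runtime, version, flavor })
}
```

Under this sketch, `parse_variant("aws-ecs-1.1-nvidia")` yields a `version` of `Some("1.1")` regardless of what the number is meant to signify, which is the point being made: the position has a name, but its meaning for ECS is still a convention to be decided.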

webern avatar Jun 08 '22 06:06 webern

They would need to launch replacement AMIs because we do not have https://github.com/bottlerocket-os/bottlerocket/issues/1261

So if that's the case, the eventual sunsetting of the aws-ecs-1 variant might leave customers who've deployed the ECS updater without automatic upgrades in their clusters? Or would addressing #1261 be a prerequisite for that?

The new crate is interesting; I had referred to it as the "orchestrator version" primarily because that's what it seems to be for the Kubernetes variants specifically; the 1 in the aws-ecs-1 name is not really tied to an orchestrator version since ECS doesn't have different versions of the orchestrator (only different versions of the agent).

samuelkarp avatar Jun 08 '22 07:06 samuelkarp

So if that's the case, the eventual sunsetting of the aws-ecs-1 variant might leave customers who've deployed the ECS updater without automatic upgrades in their clusters?

To some extent, that is potentially true. The customers that use the ECS updater will have to replace their hosts, just as k8s users do in their clusters when they switch from one Kubernetes version to another.

The new crate is interesting; I had referred to it as the "orchestrator version" primarily because that's what it seems to be for the Kubernetes variants specifically; the 1 in the aws-ecs-1 name is not really tied to an orchestrator version since ECS doesn't have different versions of the orchestrator (only different versions of the agent).

I've been thinking about this too, and I thought that maybe we could use ecs-2 regardless of the ECS agent's version. But it is still possible that the ECS folks release a v2 ECS agent, for which we would probably have an ecs-3, and this could be confusing for customers.

@samuelkarp made a great point about the support of multiple kernels. I was concerned about proliferation of kernels, but @dmitmasy pointed out that the public facing documentation mentions we will provide support for 3 years for a given variant (unless the agent is deprecated like in k8s variants) so we will be able to deprecate kernels as we deprecate variants.

would you start the numbering at 2/1.1/2022 in order to say that anything ending in "ecs-2" has the 5.15 kernel?

I think we will, and for the new ECS variant we will release aws-ecs-1.1 and aws-ecs-1.1-nvidia, and the ecs-1.1 in the name implies both were released with the 5.15 kernel. I imagine the same logic will apply if we ever release a vmware-ecs-1.1 variant.

aws-ecs-2022 - switch to year-based numbering to avoid version confusion with the agent

I think that with this versioning customers may expect yearly releases, and that might not be the case. What if there is a new LTS kernel or ECS agent this year? Will we use aws-ecs-2022.1?

arnaldo2792 avatar Jun 24 '22 00:06 arnaldo2792

@arnaldo2792 we will also want to make sure the new variant uses cgroup v2, and update daemon.json to use the systemd driver.
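For reference, a minimal sketch of how a host could be checked for cgroup v2. It relies on the fact that the unified (v2) hierarchy exposes a `cgroup.controllers` file at its root, conventionally mounted at `/sys/fs/cgroup`. The `is_cgroup_v2` helper is hypothetical and is not part of Bottlerocket. On the Docker side, the systemd driver is selected in daemon.json with the `exec-opts` entry `native.cgroupdriver=systemd`. How the variant's daemon.json is actually templated is not shown here.

```rust
use std::path::Path;

/// Hypothetical helper: report whether `cgroup_root` looks like a cgroup v2
/// (unified) hierarchy by checking for the `cgroup.controllers` file that
/// v2 exposes at its root. On most hosts the root is /sys/fs/cgroup.
fn is_cgroup_v2(cgroup_root: &Path) -> bool {
    cgroup_root.join("cgroup.controllers").is_file()
}
```

Calling `is_cgroup_v2(Path::new("/sys/fs/cgroup"))` on a running host gives a quick sanity check that the new variant booted with the unified hierarchy.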

bcressey avatar Apr 04 '23 22:04 bcressey