
Handling AWS-enforced public AMI expirations

Open parberge opened this issue 9 months ago • 6 comments

We ran into an issue where the public AMI we filtered for in our Terraform code was no longer available (e.g. fck-nat-amzn2-hvm-1.2.1-*-arm64-ebs). Is it intentional to drop support for older versions when you release new AMIs?

Is there any documentation on release cycles and on which versions are supported?

parberge avatar Feb 14 '25 12:02 parberge

AWS forces deprecation of public AMIs two years after release. You can still access the AMI, but you need to specify that you are willing to accept deprecated images. I'm not sure how this plays with the Terraform provider you are using.

We cannot adjust this deprecation policy; it is enforced by AWS.
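For readers who want to see what "accepting deprecated images" means in practice, here is a minimal sketch of the selection logic. It assumes image records shaped like the EC2 DescribeImages response (`Name`, `ImageId`, `CreationDate`, `DeprecationTime`); the sample records, AMI IDs, and the `pick_latest_image` helper are hypothetical, and the dates in the names are made up for illustration.

```python
from datetime import datetime, timezone
from fnmatch import fnmatchcase


def _parse(timestamp):
    # EC2 timestamps look like "2023-01-10T00:00:00.000Z".
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def pick_latest_image(images, name_pattern, include_deprecated=False, now=None):
    """Return the newest image whose Name matches the glob, or None."""
    now = now or datetime.now(timezone.utc)
    candidates = []
    for image in images:
        if not fnmatchcase(image["Name"], name_pattern):
            continue
        deprecation = image.get("DeprecationTime")
        # Skip images past their deprecation time unless explicitly allowed.
        if deprecation and not include_deprecated and _parse(deprecation) <= now:
            continue
        candidates.append(image)
    return max(candidates, key=lambda i: _parse(i["CreationDate"]), default=None)


# Hypothetical sample data, not real AMI IDs.
SAMPLE_IMAGES = [
    {
        "ImageId": "ami-old",
        "Name": "fck-nat-amzn2-hvm-1.2.1-20230110-arm64-ebs",
        "CreationDate": "2023-01-10T00:00:00.000Z",
        "DeprecationTime": "2025-01-10T00:00:00.000Z",
    },
    {
        "ImageId": "ami-other-arch",
        "Name": "fck-nat-amzn2-hvm-1.2.1-20230110-x86_64-ebs",
        "CreationDate": "2023-01-10T00:00:00.000Z",
        "DeprecationTime": "2025-01-10T00:00:00.000Z",
    },
]
PATTERN = "fck-nat-amzn2-hvm-1.2.1-*-arm64-ebs"
NOW = datetime(2025, 2, 14, tzinfo=timezone.utc)

strict = pick_latest_image(SAMPLE_IMAGES, PATTERN, now=NOW)
relaxed = pick_latest_image(SAMPLE_IMAGES, PATTERN, include_deprecated=True, now=NOW)
```

With the default filter the lookup comes back empty once the image passes its deprecation time, which matches the failure described in the original report. In a real lookup the DescribeImages API exposes an `IncludeDeprecated` parameter for this, and the Terraform `aws_ami` data source should offer an equivalent `include_deprecated` argument; check the provider documentation for your version.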

AndrewGuenther avatar Feb 14 '25 19:02 AndrewGuenther

Renamed this issue for tracking.

Having AMIs suddenly go deprecated is not a great user experience, but given the nature of AMIs I understand why AWS does it. These base images are extremely out-of-date with critical security patches. I want to think of a sustainable solution for this in the context of this project.

I think the best solution is to build a new AMI with current versions at a regular interval without bumping the patch version (we include the publish date in the name, so the name would still change).

Some benefits:

  1. Versions won't get deprecated. fck-nat is largely feature complete and we can't rely on new releases to keep the AMIs fresh. Even the current version will be deprecated eventually, so we have to do something.
  2. Security. Users should be running auto-patch, but having the AMIs be closer to up-to-date on launch would be a good improvement.
  3. CI. This would be a good excuse for me to put some time and effort into automated builds and releases. Having CI run at every interval and deploy AMIs automatically would be pretty slick.

Some drawbacks:

  1. Not bumping the patch version means you're technically getting different content under the same patch version. Now, technically the version number is meant to indicate the version of the fck-nat package, but it's so closely associated with the AMIs I could see this causing confusion. That said, if you really want to lock to a specific AMI you should be referencing the AMI ID.
  2. Users referring directly to the AMI ID will still run into issues with deprecation, but given this forced deprecation timeline I don't think there's a solve for that.
  3. Publishing more AMIs is more cost. Even the handful of AMIs I maintain currently is $40/month. That's not terrible, but publishing an extra AMI every interval would balloon costs over time. I think an explicit version support policy and sufficiently large interval could make this manageable...
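The refresh idea above can be sketched as follows. This is only an illustration: the name layout follows the filter pattern quoted in the original report, with the wildcard filled by the publish date, but the `YYYYMMDD` date format, the six-month default interval, and both helper functions are assumptions rather than the project's actual build tooling.

```python
from datetime import date


def refreshed_ami_name(version, arch, publish_date, base="amzn2"):
    """Build an AMI name that keeps the version but embeds the publish date."""
    return f"fck-nat-{base}-hvm-{version}-{publish_date:%Y%m%d}-{arch}-ebs"


def refresh_due(last_publish, today, interval_months=6):
    """Return True once at least interval_months calendar months have passed."""
    months = (today.year - last_publish.year) * 12 + (today.month - last_publish.month)
    return months >= interval_months


first = refreshed_ami_name("1.2.1", "arm64", date(2024, 8, 1))
second = refreshed_ami_name("1.2.1", "arm64", date(2025, 2, 15))
```

Both names carry the same patch version but differ in the date segment, so a name filter with a wildcard in that position keeps matching after a refresh, while a pinned AMI ID does not.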

AndrewGuenther avatar Feb 15 '25 02:02 AndrewGuenther

On cost of AMI refreshing:

Each AMI costs ~$20/month (arm & x86 copied to every region). So if I publish a "refreshed" AMI every six months for each supported version (let's say 1.2.1 and above) that's 3 versions (once 1.4 is finalized) which means that's an increase of $60/month every 6 months. I don't think that's sustainable without more aggressive deprecation.
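To make the growth concrete, here is a rough projection using the figures from the comment above (about $20/month per AMI, 3 supported versions, a refresh every 6 months). The `max_retained` cap is a hypothetical stand-in for "more aggressive deprecation", meaning older refreshes get removed once the cap is reached.

```python
def added_monthly_cost(months_elapsed, versions=3, cost_per_ami=20,
                       refresh_every=6, max_retained=None):
    """Extra monthly spend (in dollars) from refreshed AMIs published so far."""
    refreshes = months_elapsed // refresh_every
    if max_retained is not None:
        # Only the newest max_retained refreshes per version are kept.
        refreshes = min(refreshes, max_retained)
    return refreshes * versions * cost_per_ami


after_six_months = added_monthly_cost(6)
after_two_years = added_monthly_cost(24)
after_two_years_capped = added_monthly_cost(24, max_retained=2)
```

Without a cap the added spend grows by $60/month at each refresh; keeping only the two most recent refreshes per version holds it at $120/month.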

AndrewGuenther avatar Feb 15 '25 02:02 AndrewGuenther

Ok thanks for the fast and well explained response!

I didn't know this was removed by AWS. Also being able to accept deprecated images sounds like something that would solve our issue.

If we could somehow know when a new AMI is released, that would be ideal. I'm not necessarily saying this is something this project should do.

parberge avatar Feb 15 '25 09:02 parberge

Would it be possible to handle this the way AWS does it for their own images?

I don't know why they don't do this for Linux AMIs, but Windows AMIs are removed 3 weeks after they publish a new one.

They produce a new AMI on Patch Tuesday, and a few days later they publish an SNS message alerting all subscribers to the new regional AMI images and details.

3 weeks after the SNS message goes out, they remove the old AMI images.

We had to implement this in our system so that the Windows systems we provision get their launch configs updated with an AMI that actually exists when they go to launch: a Lambda consumes the SNS notification and updates the launch configs with the new AMI.
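A minimal sketch of that kind of Lambda, for illustration only. The `Records[].Sns.Message` envelope is the standard shape of an SNS-triggered Lambda event, but the message body used here (`images`, `region`, `image_id`) is a hypothetical format, not the actual AWS notification schema, and `update_launch_config` is a placeholder callback for whatever updates your launch configs.

```python
import json


def extract_new_amis(event):
    """Collect a region -> AMI ID mapping from an SNS-triggered Lambda event."""
    amis = {}
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        # Hypothetical body: {"images": [{"region": "...", "image_id": "..."}]}
        for image in message.get("images", []):
            amis[image["region"]] = image["image_id"]
    return amis


def handler(event, context, update_launch_config=print):
    """Lambda entry point: pass each new regional AMI to the updater callback."""
    amis = extract_new_amis(event)
    for region, image_id in sorted(amis.items()):
        update_launch_config(region, image_id)
    return amis


# Hypothetical event for a local dry run.
SAMPLE_EVENT = {
    "Records": [
        {
            "Sns": {
                "Message": json.dumps(
                    {
                        "images": [
                            {"region": "eu-north-1", "image_id": "ami-111"},
                            {"region": "us-east-1", "image_id": "ami-222"},
                        ]
                    }
                )
            }
        }
    ]
}

calls = []
result = handler(SAMPLE_EVENT, None, lambda region, ami: calls.append((region, ami)))
```

Passing the updater as a callback keeps the parsing testable without AWS credentials; in a deployed function it would wrap the real launch config update.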

patrickdk77 avatar Feb 15 '25 18:02 patrickdk77

> AWS forces deprecation of public AMIs two years after release. You can still access the AMI, but you need to specify that you are willing to accept deprecated images, I'm not sure how this plays with the Terraform provider you are using.
>
> We cannot adjust this deprecation policy, it is enforced by AWS.

It was quite easy to include deprecated images. This means I have an option to rollback easily if an upgrade to a supported version isn't working for some reason.

I'm OK if you want to close this issue. What would be nice is to know when new AMIs are available. That is something we can ofc implement ourselves.

parberge avatar Feb 18 '25 08:02 parberge