[ECS] Full support for Capacity Providers in CloudFormation.

Open coultn opened this issue 4 years ago • 126 comments

CloudFormation does not currently have support for capacity providers in any of the ECS resource types. We will be adding this support in the near future.

coultn avatar Dec 05 '19 19:12 coultn

Related to this, in order to support capacity providers with managedTerminationProtection, we also need to be able to set the new-instances-protected-from-scale-in property when creating the ASG via CloudFormation. This latter property was added 4 years ago to the AWS SDK / AWS CLI, but is still not supported in CF -- hopefully full support for CP in CF is added a bit faster.

lawrencepit avatar Jan 03 '20 02:01 lawrencepit

Has there been any progress made on this?

Add support for Capacity providers #1

geof2001 avatar Jan 07 '20 16:01 geof2001

We are working on it and will provide updates as soon as more information is available.

coultn avatar Jan 07 '20 17:01 coultn

Related to this, in order to support capacity providers with managedTerminationProtection, we also need to be able to set the new-instances-protected-from-scale-in property when creating the ASG via CloudFormation. This latter property was added 4 years ago to the AWS SDK / AWS CLI, but is still not supported in CF -- hopefully full support for CP in CF is added a bit faster.

Additionally, when the new-instances-protected-from-scale-in property is set on the ASG, a scheduled action to scale in instances cannot be executed. A feature like force-scale-in for scheduled actions would be useful if, for example, we have a dev environment and would like to turn instances off for the night and back on in the morning.

psuj avatar Jan 10 '20 10:01 psuj

+1

pparth avatar Jan 21 '20 08:01 pparth

When this is implemented, will it be possible to do a rolling update to the launch template under autoscaling and a change to a service in ecs, such that the new tasks run on instances from the new launch template while the old ones stay on the old instances as they roll over?

I'm struggling to achieve this with custom resources at the moment, partly as the dependencies are all in funny directions. Would be great to have it all defined declaratively in cfn.

tobymiller avatar Jan 22 '20 12:01 tobymiller

Cross-linking the respective request in https://github.com/aws-cloudformation/aws-cloudformation-coverage-roadmap/issues/301

sopel avatar Feb 05 '20 17:02 sopel

Any ETA on this?

RomanCRS avatar Mar 13 '20 13:03 RomanCRS

Does this depend on #632?

pauldraper avatar Mar 26 '20 14:03 pauldraper

Does this depend on #632?

I don't think so.

RomanCRS avatar Mar 26 '20 14:03 RomanCRS

Sadly, that's the reason why using CloudFormation is becoming more and more frustrating.

andreaswittig avatar Mar 27 '20 19:03 andreaswittig

FWIW, Terraform has supported this since shortly after the API was released: https://github.com/terraform-providers/terraform-provider-aws/pull/11151

Of course, it can't delete capacity providers since there's no API: https://www.terraform.io/docs/providers/aws/r/ecs_capacity_provider.html

gabegorelick avatar Apr 06 '20 13:04 gabegorelick

I don't want to use, rely on and support third-party software if I have a chance to use the official product.

RomanCRS avatar Apr 06 '20 15:04 RomanCRS

any update?

Vince-Cercury avatar Apr 21 '20 04:04 Vince-Cercury

same here, any updates?

XBeg9 avatar Apr 27 '20 22:04 XBeg9

any update?

ronan-cunningham avatar May 02 '20 10:05 ronan-cunningham

The lack of Cfn support for this 6 months in is really disappointing. This puts the burden on anyone building CI/CD with Cfn to add silly custom CLI/SDK pieces to tie in capacity providers, which then have to be ripped out once the support, which should be part of a point release, is in place. You can do better. Communicating timeframes would help as well.

darrenweiner avatar May 06 '20 18:05 darrenweiner

Have you had a deeper look into Capacity Providers and Cluster Auto Scaling? Does not match with my requirements at all. Does not scale down properly. Does not work with CloudFormation rolling updates for the ASG. So missing CloudFormation support is not the only problem here. :)

andreaswittig avatar May 06 '20 19:05 andreaswittig

Have you had a deeper look into Capacity Providers and Cluster Auto Scaling? Does not match with my requirements at all. Does not scale down properly. Does not work with CloudFormation rolling updates for the ASG. So missing CloudFormation support is not the only problem here. :)

Thanks for the feedback - can you explain more what you mean by "does not scale down properly"?

coultn avatar May 06 '20 20:05 coultn

coultn: Here's what I think is a common use case: a CI/CD pipeline where services are spun up on an ASG-backed EC2 cluster.
Services do not pre-exist; the CI/CD pipeline creates them. Currently, you can not use cfn to create a capacity provider enabled service. If the underlying cluster doesn't have the memory or CPU, I would expect that deploying a new service would add another EC2 instance and place the new service on it, but there's no way to do that currently. I suppose what might work right now is: deploy the service with no capacity provider, perhaps with a desired count of 0 so it stabilizes; then, via the CLI, update the service to use a capacity provider; then make another CLI call to increase the desired count to 1. But that seems like jumping through hoops.

With regards to down scaling, in reading the documentation, it seems a bit unclear on exactly how this is meant to work: If the goal is to optimize resources, I would actually want the cp to be intelligent enough to a) determine that the cluster is currently overprovisioned and b) if so, drain EC2 accordingly and have the ASG terminate the drained instance...all with standard, appropriate cooldown periods, etc.

darrenweiner avatar May 06 '20 21:05 darrenweiner

Currently, you can not use cfn to create a capacity provider enabled service.

Thanks for the feedback! We are working on full support for capacity providers in CloudFormation, and we definitely understand the need for that. However, I do want to point out that you can actually create a capacity-provider enabled service in CloudFormation today. You can accomplish this by first configuring a default capacity provider strategy for the cluster. This default capacity provider strategy will be used by any service you create that does not specify a launch type. Next, when you create your service in CloudFormation, do not include the LaunchType parameter. The service will use the capacity provider strategy defined by the cluster, and will auto-scale from zero instances if necessary.
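
A minimal sketch of this workaround, assuming a hypothetical cluster `demo-cluster`, capacity provider `demo-cp`, and task definition `demo-task:1` (none of these names come from the thread). The first helper builds the arguments for the ECS `PutClusterCapacityProviders` call that sets the cluster's default strategy; the second builds an `AWS::ECS::Service` resource that deliberately has no `LaunchType` property, so the service should fall back to the cluster default.

```python
import json


def default_strategy_request(cluster, capacity_provider, weight=1, base=0):
    """Arguments for ECS PutClusterCapacityProviders
    (boto3: ecs.put_cluster_capacity_providers)."""
    return {
        "cluster": cluster,
        # This list replaces the cluster's existing capacity provider associations.
        "capacityProviders": [capacity_provider],
        "defaultCapacityProviderStrategy": [
            {"capacityProvider": capacity_provider, "weight": weight, "base": base}
        ],
    }


def service_resource(cluster, task_definition, desired_count=1):
    """CloudFormation AWS::ECS::Service resource without a LaunchType property."""
    return {
        "Type": "AWS::ECS::Service",
        "Properties": {
            "Cluster": cluster,
            "TaskDefinition": task_definition,
            "DesiredCount": desired_count,
            # LaunchType is intentionally omitted so that the cluster's
            # default capacity provider strategy applies to this service.
        },
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(json.dumps(default_strategy_request("demo-cluster", "demo-cp"), indent=2))
    template = {
        "Resources": {"DemoService": service_resource("demo-cluster", "demo-task:1")}
    }
    print(json.dumps(template, indent=2))
```

The first dictionary can be passed as keyword arguments to boto3's `put_cluster_capacity_providers`, or mirrored in an `aws ecs put-cluster-capacity-providers` call, before the stack containing the service is deployed.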

With regards to down scaling, in reading the documentation, it seems a bit unclear on exactly how this is meant to work: If the goal is to optimize resources, I would actually want the cp to be intelligent enough to a) determine that the cluster is currently overprovisioned and b) if so, drain EC2 accordingly and have the ASG terminate the drained instance...all with standard, appropriate cooldown periods, etc.

Understood. In the first version of ECS cluster auto scaling, we took a more conservative route where instances would not scale in unless no tasks are running on them. We are looking at the idea of automating an "instance drainer" that will automatically find underutilized instances and set them to draining. With ECS cluster auto scaling, those instances would automatically shut down once no tasks are running on them. It's possible to do this already today, but you would need to implement your own Lambda function (or similar) to do the evaluation of the instance and call the ECS API to set the instance to the DRAINING state.
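
A rough sketch of such a do-it-yourself "instance drainer", written as a Lambda-style handler. The definition of "underutilized" used here (at most `max_tasks` running or pending tasks on an instance) and the shape of the `event` payload are assumptions made for illustration; the thread does not define either. The ECS calls used are `ListContainerInstances`, `DescribeContainerInstances`, and `UpdateContainerInstancesState`.

```python
def pick_underutilized(container_instances, max_tasks=1):
    """Return ARNs of ACTIVE instances with at most max_tasks running or pending tasks.

    The threshold is an assumed heuristic, not an ECS-defined notion.
    """
    return [
        ci["containerInstanceArn"]
        for ci in container_instances
        if ci.get("status") == "ACTIVE"
        and ci.get("runningTasksCount", 0) + ci.get("pendingTasksCount", 0) <= max_tasks
    ]


def chunks(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def handler(event, context):
    """Hypothetical Lambda entry point; expects event = {"cluster": "...", "max_tasks": 1}."""
    import boto3  # imported here so the helpers above can be used without boto3

    ecs = boto3.client("ecs")
    cluster = event["cluster"]
    max_tasks = event.get("max_tasks", 1)

    # Collect the ARNs of all ACTIVE container instances in the cluster.
    arns = []
    paginator = ecs.get_paginator("list_container_instances")
    for page in paginator.paginate(cluster=cluster, status="ACTIVE"):
        arns.extend(page["containerInstanceArns"])

    drained = []
    # DescribeContainerInstances accepts up to 100 instances per call.
    for batch in chunks(arns, 100):
        described = ecs.describe_container_instances(
            cluster=cluster, containerInstances=batch
        )
        candidates = pick_underutilized(described["containerInstances"], max_tasks)
        # UpdateContainerInstancesState accepts up to 10 instances per call.
        for group in chunks(candidates, 10):
            ecs.update_container_instances_state(
                cluster=cluster, containerInstances=group, status="DRAINING"
            )
            drained.extend(group)
    return {"drained": drained}
```

A real drainer would probably also check that the remaining instances have room for the displaced tasks before draining anything; that check is left out here to keep the sketch short.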

coultn avatar May 06 '20 21:05 coultn

Really awesome feedback, thank you. As far as the workaround for setting it at Cluster creation, I'll take a look at that..easy enough to implement for QA/Dev..a little trickier for existing prod environments.

Trying to avoid custom tooling since...this seems sooo close to being a solid solution.

Any timing on better cfn support? I know that's a different, probably very overwhelmed team, but would be nice to see some improvements here. ECS rocks, and once this is dialed in, it's going to really round out the offering.

Will keep checking for ECS updates!

darrenweiner avatar May 07 '20 01:05 darrenweiner

Dear colleagues, please provide a way in CF to fine-tune some of the Capacity Provider's auto-generated parameters. Currently, in addition to the existing parameters, we need to adjust the Cooldown in the Auto Scaling Plan manually, as well as the alarms' datapoints, all after the Capacity Provider has been created. It would be great to put all of this together in the CF script. This is a must for us. Thank you very much!

marcelmunarolo avatar May 07 '20 17:05 marcelmunarolo

Regarding timeline - we can't share specific timelines but we will share updates here as soon as they are available.

coultn avatar May 07 '20 17:05 coultn

coultn: Because this is such a useful feature for so many of my clients, I decided to re-tool things today.
Unfortunately, capacity providers still don't seem to work. The cluster default cp is in place. I re-created the services without the LaunchType parameter, and it clearly shows the services are using the capacity provider strategy. However, when I deploy services and exhaust the memory, it throws the usual message saying it can't find a container instance with the required resources. Interestingly, and probably to the point: the CloudWatch metric for the cp that is assigned to this cluster (CapacityProviderReservation) isn't reporting any data at all. I saw this metric chart properly in tests a few weeks ago with another client, so I have no idea why it's not reporting anything now. I spun up about 5-8 services today on this cluster using the cp strategy. I'll just keep checking back for updates; hopefully some good changes are coming soon.

darrenweiner avatar May 07 '20 19:05 darrenweiner

+1

rcrelia avatar May 12 '20 17:05 rcrelia

This is definitely a showstopper for our CDK-powered automation workflows. Setting a Capacity Provider at the cluster level is something the CloudFormation team is looking into: https://github.com/aws-cloudformation/aws-cloudformation-coverage-roadmap/issues/301

In the meantime, our workaround is to run the following aws-cli command in our CI/CD workflow:

aws ecs put-cluster-capacity-providers \
    --cluster CLUSTER_NAME \
    --capacity-providers FARGATE \
    --default-capacity-provider-strategy capacityProvider=FARGATE

I really hope this ships soon. 🤞

robertd avatar May 12 '20 19:05 robertd

+2

jakebanks avatar Jun 15 '20 06:06 jakebanks

Deletion is now supported by the API. Will this accelerate the implementation of this feature addition?

https://aws.amazon.com/jp/about-aws/whats-new/2020/06/amazon-ecs-capacity-providers-support-delete-functionality/

hatappo avatar Jun 17 '20 00:06 hatappo

+1

pramshar avatar Jun 19 '20 17:06 pramshar