terraform-provider-aws
ECS Service always wants to be recreated due to capacity provider.
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform CLI and Terraform AWS Provider Version
$ terraform -v
Terraform v0.13.6
+ provider.aws v3.73.0
Affected Resource(s)
- aws_ecs_service
Terraform Configuration Files
Terraform Plan:
# module.my_service.aws_ecs_service.ecs_service must be replaced
+/- resource "aws_ecs_service" "ecs_service" {
cluster = "arn:aws:ecs:us-west-1:***:cluster/ecs-related-tapir"
deployment_maximum_percent = 200
deployment_minimum_healthy_percent = 100
desired_count = 2
enable_ecs_managed_tags = false
enable_execute_command = false
health_check_grace_period_seconds = 120
~ iam_role = "aws-service-role" -> (known after apply)
~ id = "arn:aws:ecs:us-west-1:***:service/my-cluster/my-service-5e" -> (known after apply)
~ launch_type = "EC2" -> (known after apply)
name = "my-service-service-5e"
+ platform_version = (known after apply)
- propagate_tags = "NONE" -> null
scheduling_strategy = "REPLICA"
- tags = {} -> null
~ tags_all = {} -> (known after apply)
~ task_definition = "arn:aws:ecs:us-west-1:***:task-definition/my-service-:23" -> "arn:aws:ecs:us-west-1:***:task-definition/my-service:1"
wait_for_steady_state = false
+ capacity_provider_strategy { # forces replacement
+ base = 0
+ capacity_provider = "ecs-capacity-provider-related-tapir"
+ weight = 100
}
deployment_controller {
type = "CODE_DEPLOY"
}
load_balancer {
container_name = "my-service"
container_port = 7171
target_group_arn = "arn:aws:elasticloadbalancing:us-west-1:***:targetgroup/abcdef/abcdef"
}
}
Plan: 1 to add, 0 to change, 1 to destroy.
Terraform Apply error:
Error: error creating ECS service (my-service): InvalidParameterException: Creation of service was not idempotent.
Expected Behavior
No infrastructure changes should be made
Actual Behavior
The ECS service resource is marked for recreation, but the apply fails with the error shown above.
Steps to Reproduce
- Provision an ECS service with a capacity provider
terraform apply
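A minimal configuration that reproduces the diff might look like the following sketch (resource names are hypothetical, and the `aws_ecs_capacity_provider` and `aws_ecs_task_definition` resources are omitted for brevity). The key point is that the service deliberately omits `capacity_provider_strategy`, so the strategy is inherited from the cluster default:

```hcl
# Hypothetical minimal reproduction: the cluster defines a default
# capacity provider strategy, but the service does not declare one.
resource "aws_ecs_cluster" "this" {
  name = "repro-cluster"
}

resource "aws_ecs_cluster_capacity_providers" "this" {
  cluster_name       = aws_ecs_cluster.this.name
  capacity_providers = [aws_ecs_capacity_provider.this.name]

  default_capacity_provider_strategy {
    base              = 0
    weight            = 100
    capacity_provider = aws_ecs_capacity_provider.this.name
  }
}

resource "aws_ecs_service" "this" {
  name            = "repro-service"
  cluster         = aws_ecs_cluster.this.id
  task_definition = aws_ecs_task_definition.this.arn
  desired_count   = 2
  # No capacity_provider_strategy block here: after the first apply,
  # the inherited strategy appears in state, and the next plan shows
  # "capacity_provider_strategy { ... } # forces replacement".
}
```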
FYI, we are still seeing this bug in provider version 4.9.
Possibly related to existing issue: https://github.com/hashicorp/terraform-provider-aws/issues/2283 (destroy/create behavior)
*Correction -- since the update was not expected behavior, I'm guessing the capacity_provider_strategy is inherited from the aws_ecs_cluster where it is defined. Do you mind confirming, @spatel96?
This issue is very destructive. When an ECS cluster has a default_capacity_provider_strategy defined, Terraform will mark every service for recreation that does not include:

```hcl
lifecycle {
  ignore_changes = [
    capacity_provider_strategy
  ]
}
```
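In context, the workaround looks something like this sketch (resource names and attribute values are illustrative, not from the original configuration):

```hcl
resource "aws_ecs_service" "ecs_service" {
  name            = "my-service"
  cluster         = aws_ecs_cluster.this.id
  task_definition = aws_ecs_task_definition.this.arn
  desired_count   = 2

  # Ignore the strategy Terraform reads back from AWS after the
  # cluster's default_capacity_provider_strategy takes effect, so
  # the service is not marked for replacement on the next plan.
  lifecycle {
    ignore_changes = [
      capacity_provider_strategy,
    ]
  }
}
```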
The only differences I can see when comparing capacity_provider_strategy and deployment_controller are MaxItems and DiffSuppressFunc. I wonder if that is what's causing this recreation... I would have thought that removing the ForceNew would also have stopped capacity_provider_strategy from forcing recreation...
https://github.com/hashicorp/terraform-provider-aws/blob/611b4737168f4f0051bb63ef221f0e76f156f392/internal/service/ecs/service.go#L96-L107
https://github.com/hashicorp/terraform-provider-aws/blob/611b4737168f4f0051bb63ef221f0e76f156f392/internal/service/ecs/service.go#L44-L47
Hi @nitrocode, thanks for looking through the code! My initial thinking was that @spatel96 is using both the aws_ecs_capacity_provider and aws_ecs_service resources. While capacity_provider_strategy is not explicitly configured in the aws_ecs_service Terraform configuration, the value is inherited from the separate aws_ecs_capacity_provider resource after an initial terraform apply, so the next plan or apply will show that diff (though this is still just conjecture, as the original configuration is not yet known). That diff is then handled by this portion of the code:
https://github.com/hashicorp/terraform-provider-aws/blob/a2843eb5d274b2fe3598cf863d228e715dacc343/internal/service/ecs/service.go#L354-L372 which forces the new resource. The logic needs to account for cases where the provider strategy is inherited from an outside configuration, or simply mark capacity_provider_strategy as Computed so that the diff is ignored.
I was seeing this same issue and can confirm that adding a capacity_provider_strategy block to my aws_ecs_service, duplicating my default_capacity_provider_strategy, resolved it.
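For example, if the cluster's default strategy is a single provider at weight 100, mirroring it in the service looks roughly like this (resource names and values are illustrative):

```hcl
resource "aws_ecs_service" "ecs_service" {
  name            = "my-service"
  cluster         = aws_ecs_cluster.this.id
  task_definition = aws_ecs_task_definition.this.arn
  desired_count   = 2

  # Duplicate the cluster's default_capacity_provider_strategy here
  # so the values read back from AWS match the configuration and no
  # replacement is planned.
  capacity_provider_strategy {
    base              = 0
    weight            = 100
    capacity_provider = aws_ecs_capacity_provider.this.name
  }
}
```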
This has been a big annoyance for us. We have many production ECS Services that are using LaunchType: EC2 and we'd like to convert them to using a newly defined default Capacity Provider strategy on the cluster.
If we simply set the capacity provider, it will force the re-create of the ECS Service leading to temporary disruption/downtime. This isn't necessary as AWS supports the graceful transition of LaunchType: EC2 to Capacity Provider (but not the other way around). It does a "force new deployment" of the ECS Tasks, but it uses the standard ECS rollout mechanism (e.g., minHealthy) so there's no disruption.
Our current workaround is to use ignore_changes as above, plus converting ECS services to a capacity provider via separate CLI-based automation.
(Also, tangentially related is #26533 - for transitioning existing ECS Services to use the Cluster's default capacity provider strategy)
If I may add, support for an empty capacity_provider_strategy list could also be useful.
It seems this support was added to the AWS CLI and API (https://github.com/aws/containers-roadmap/issues/838#issuecomment-1159092125), so that
$ aws ecs update-service --cluster cluster-name --service service-name --capacity-provider-strategy '[]' --force-new-deployment
removes the strategy from an ECS service (when it was inherited from the default defined at the ECS cluster level), which is useful if you're planning to remove the default capacity provider strategy from the ECS cluster.
Currently, if no capacity_provider_strategy is defined in the aws_ecs_service resource, the AWS API call leaves the value unset and the default strategy is used.
It's sad to see that it's been over a year and this is still not fixed. :-( AWS has to do a better job than this if they want people to keep using ECS and keep it alive.
Any updates on this? I see the PR is pending.
Any update on this?
@breathingdust Hi, is this something you can look into? The AWS side has been fixed, and now Terraform incorrectly causes replacement.
Issue still exists.
Yep we're facing the same problem too
When will the fix be released? It is affecting my team too.
This is a major issue. We are running many FARGATE tasks and would like to increase capacity further by adding FARGATE_SPOT. However, it is not possible to do so without downtime (it destroys the whole ECS service and recreates it).