terraform-provider-aws
ClientException: Too many concurrent attempts to create a new revision of the specified family.
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform Version
Terraform v0.12.5
Affected Resource(s)
- aws_ecs_task_definition
Terraform Configuration Files
data "template_file" "task_definition__backend" {
  template = file("${path.module}/task_definitions/backend.json")
  vars = {
    image_url        = "1111111111111111.dkr.ecr.us-east-1.amazonaws.com/my-repo-here/backend:${var.version_tag}"
    container_name   = "backend"
    log_group_region = data.aws_region.current.name
    log_group_name   = aws_cloudwatch_log_group.app.name
  }
}

data "template_file" "task_definition__frontend" {
  template = file("${path.module}/task_definitions/frontend.json")
  vars = {
    image_url        = "1111111111111111.dkr.ecr.us-east-1.amazonaws.com/my-repo-here/frontend:${var.version_tag}"
    container_name   = "frontend"
    log_group_region = data.aws_region.current.name
    log_group_name   = aws_cloudwatch_log_group.app.name
  }
}
resource "aws_ecs_task_definition" "backend" {
  family                = local.ecs_cluster_name
  container_definitions = data.template_file.task_definition__backend.rendered
  network_mode          = "awsvpc"
}

resource "aws_ecs_task_definition" "frontend" {
  family                = local.ecs_cluster_name
  container_definitions = data.template_file.task_definition__frontend.rendered
  network_mode          = "awsvpc"
}
resource "aws_ecs_service" "backend" {
  name                               = "${local.ecs_cluster_name}_backend"
  cluster                            = aws_ecs_cluster.ecs_cluster.id
  task_definition                    = aws_ecs_task_definition.backend.arn
  desired_count                      = "1"
  deployment_minimum_healthy_percent = 100
  deployment_maximum_percent         = 300

  network_configuration {
    subnets = aws_subnet.private_subnet.*.id
    security_groups = [
      aws_security_group.sg_for_ec2_instances.id,
    ]
  }

  load_balancer {
    # Register the ECS service within the ALB target group
    # This makes the service participate in health checks
    # and receive traffic when healthy
    target_group_arn = aws_alb_target_group.target_group_backend.arn
    container_name   = "backend"
    container_port   = "80"
  }

  service_registries {
    registry_arn   = aws_service_discovery_service.service_discovery.arn
    container_name = "backend"
    container_port = 80
  }

  depends_on = [
    aws_alb_listener.http_traffic,
  ]
}
resource "aws_ecs_service" "frontend" {
  name                               = "${local.ecs_cluster_name}_frontend"
  cluster                            = aws_ecs_cluster.ecs_cluster.id
  task_definition                    = aws_ecs_task_definition.frontend.arn
  desired_count                      = "2"
  deployment_minimum_healthy_percent = 100
  deployment_maximum_percent         = 300

  network_configuration {
    subnets = aws_subnet.private_subnet.*.id
    security_groups = [
      aws_security_group.sg_for_ec2_instances.id,
    ]
  }

  load_balancer {
    target_group_arn = aws_alb_target_group.target_group_frontend.arn
    container_name   = "frontend"
    container_port   = "80"
  }

  service_registries {
    registry_arn   = aws_service_discovery_service.service_discovery.arn
    container_name = "frontend"
    container_port = 80
  }

  depends_on = [
    aws_alb_listener.http_traffic,
    aws_ecs_service.backend,
  ]
}
Expected Behavior
Running terraform apply again and again should not cause any errors. I expect that AWS task definitions get updated properly.
Actual Behavior
AWS task definitions don't get updated, and an error is thrown in approximately 1 out of 5 attempts. If I rerun terraform apply, it usually works.
Error: ClientException: Too many concurrent attempts to create a new revision of the specified family.
status code: 400, request id: efce29cc-a021-4d6b-b603-d84c8b7a91fa
Steps to Reproduce
terraform apply
Important Factoids
Nothing special. Just two ECS services and the corresponding task definitions for them. It's worth noting that they are both within the same "family". Maybe this has some impact?
I'm running into the same issue. I reduced the Terraform configuration to make it easier to reproduce (the omitted details are the same as in the initial post):
Terraform Version
Terraform v0.12.7 + provider.aws v2.26.0
Terraform Configuration Files
resource "aws_ecs_task_definition" "this" {
  count  = 2
  family = "test-family"
  container_definitions = jsonencode([{
    name   = "test"
    image  = "dummy"
    memory = 512
  }])
}
Important Factoids
The error can be circumvented by running terraform apply -parallelism=1, but this slows execution by up to a factor of 10 compared to the default parallelism.
When you set count = 1, it applies without errors, but of course it only generates a single resource.
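Since -parallelism=1 avoids the error, one narrower option might be to serialize only the task definitions that share a family and leave global parallelism untouched. The following is a minimal sketch under that assumption, not something confirmed in this thread: it splits the count = 2 resource into two resources (the names first and second are hypothetical) and uses depends_on so Terraform registers them one after the other.

```hcl
# Sketch only: assumes that registering revisions of the same family
# sequentially avoids the "Too many concurrent attempts" error.
resource "aws_ecs_task_definition" "first" {
  family = "test-family"
  container_definitions = jsonencode([{
    name   = "test"
    image  = "dummy"
    memory = 512
  }])
}

resource "aws_ecs_task_definition" "second" {
  family = "test-family"
  container_definitions = jsonencode([{
    name   = "test"
    image  = "dummy"
    memory = 512
  }])

  # Explicit ordering: Terraform creates "second" only after "first"
  # has been created, so the two registrations do not run in parallel.
  depends_on = [aws_ecs_task_definition.first]
}
```

The trade-off is that depends_on cannot order instances of a single count or for_each resource against each other, so this only helps when the definitions are declared as separate resources.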
I had a similar issue. I was able to fix it by using a different family for each task definition, for example by using for_each on a map instead of count and then setting family = "${local.ecs_cluster_name}-${each.key}".
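A minimal sketch of that approach, based on the reduced reproduction above. The map local.task_images and its image values are hypothetical placeholders, and local.ecs_cluster_name is assumed to be defined elsewhere, as in the original configuration. Resource-level for_each requires Terraform 0.12.6 or later.

```hcl
locals {
  # Hypothetical map: one entry per task definition, key => image.
  task_images = {
    backend  = "dummy-backend"
    frontend = "dummy-frontend"
  }
}

resource "aws_ecs_task_definition" "this" {
  for_each = local.task_images

  # Each key yields its own family, so no two resources create
  # revisions of the same family during a single apply.
  family = "${local.ecs_cluster_name}-${each.key}"

  container_definitions = jsonencode([{
    name   = each.key
    image  = each.value
    memory = 512
  }])
}
```

Individual definitions can then be referenced by key, for example aws_ecs_task_definition.this["backend"].arn.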
You should have two task definitions with different values for family. One for the frontend, one for the backend. https://docs.aws.amazon.com/AmazonECS/latest/userguide/task_definition_parameters.html#family
When you register a task definition, you give it a family, which is similar to a name for multiple versions of the task definition.
A task definition has nothing to do with your cluster; you can use the same one in many clusters or for many services. But if each service runs a different set of containers, each one needs its own task definition.
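Applied to the configuration in the original post, that advice could look roughly like the sketch below. The "-backend" and "-frontend" suffixes are illustrative choices, not names taken from the thread; everything else is copied from the original resources.

```hcl
resource "aws_ecs_task_definition" "backend" {
  # Family specific to the backend instead of the shared cluster name.
  family                = "${local.ecs_cluster_name}-backend"
  container_definitions = data.template_file.task_definition__backend.rendered
  network_mode          = "awsvpc"
}

resource "aws_ecs_task_definition" "frontend" {
  # Separate family, so frontend revisions are registered independently.
  family                = "${local.ecs_cluster_name}-frontend"
  container_definitions = data.template_file.task_definition__frontend.rendered
  network_mode          = "awsvpc"
}
```

The aws_ecs_service resources already refer to each task definition by its arn attribute, so they should not need any changes.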
still an issue 2 years later lol
for terragrunt, use:
--terragrunt-parallelism 4
see https://terragrunt.gruntwork.io/docs/features/execute-terraform-commands-on-multiple-modules-at-once/#limiting-the-module-execution-parallelism
Hey y'all 👋 Thank you for taking the time to file this issue and for the continued discussion! Given that there have been a number of AWS provider releases since this was initially filed (and since the last update), can anyone confirm whether you're still experiencing this behavior?
I've just seen this issue when deploying via Terraform Cloud :(
TF version is 1.0 and the AWS provider is specified as "~> 3.63.0"
Yes, I'm also still seeing this. Just hit it now actually which brought me here. I have a root module that deploys 2 target groups of tasks with different versions of the same task family so we can switch back and forth via the load balancer if needed for a blue/green style deployment. Whenever there is a change to our terraform code and both container groups are active we run into this issue.
Still an issue, ran into it today with provider 4.1
Still an issue with hashicorp/aws 4.14
Re-applying a couple of times made it work for me (some tasks were created at each apply).
Still running into this issue with creating 1 ECS cluster. Running the apply back to back usually gets over it.
Still an issue when using hashicorp/aws v4.15.1.
Yep, still receiving this issue as well with TF 1.2.9 and the latest aws provider. Just have to re-run the apply to fix it.
Same for me. The only solution seems to be to create multiple families, as proposed in other comments.
Still a problem on provider v4.32.0
Same issue for me as well. Re-running works fine with the same family.
Encountered this problem today as well while applying just three containers sharing the same family into a single cluster. Had to give each of them unique family IDs to circumvent it, which, admittedly, is not a terrible workaround.
Edit: Appending an incremental integer to each family name did not work for me. Hm...