terraform-aws-ecs
Maximum two tasks are running on one instance
Description
Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration (see the examples/* directory for references that you can copy+paste and tailor to match your configs if you are unable to copy your exact configuration). The reproduction MUST be executable by running terraform init && terraform apply without any further changes.
If your request is for a new feature, please use the Feature request template.
- [x] ✅ I have searched the open/closed issues and my issue is not listed.
⚠️ Note
Before you submit an issue, please perform the following first:
- Remove the local `.terraform` directory (! ONLY if state is stored remotely, which hopefully you are following that best practice!): `rm -rf .terraform/`
- Re-initialize the project root to pull down modules: `terraform init`
- Re-attempt your `terraform plan` or `apply` and check if the issue still persists
Versions
- Module version [Required]: 5.11.2
- Terraform version: ~> 1.6.3
- Provider version(s): hashicorp/aws: ~> 5.31
Reproduction Code [Required]
Steps to reproduce the behavior:
I'm not using Terraform workspaces, and I have cleared the local cache.
Expected behavior
More than two tasks should run on one instance (type: t3a.medium, but I also tried running them on, for example, an m6a.large and hit the same issue).
Actual behavior
I'm running, for example, 4 services in ECS. Each of them has a dedicated 512 CPU units and 512 MB of memory. The instance type t3a.medium has 2048 CPU units and 3883 MB of memory. I also tried changing these services to 256 CPU / 512 MB, but it still doesn't work as expected. ECS automatically places two tasks on one instance and no more, and I don't know why.
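By the CPU and memory reservations alone, more than two tasks should fit on the instance; a quick sanity check using the figures quoted above:

```python
# Registered capacity of a t3a.medium container instance (figures from above)
instance_cpu = 2048  # CPU units
instance_mem = 3883  # MiB

# Per-task reservation
task_cpu = 512
task_mem = 512

# How many tasks would fit if only CPU/memory reservations mattered
by_cpu = instance_cpu // task_cpu  # 4
by_mem = instance_mem // task_mem  # 7
print(min(by_cpu, by_mem))         # 4 -- yet ECS places only 2
```

So something other than the CPU/memory reservations is capping placement at two tasks per instance.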
Terminal Output Screenshot(s)
Additional context
ecs.tf (note: the service definition must live under the module's `services` map; restored below):

```hcl
module "ecs" {
  count  = var.tags.Environment == "prod" ? 1 : 0
  source = "terraform-aws-modules/ecs/aws"

  cluster_name = local.ECS_CLUSTER_NAME
  tags         = local.tags

  cluster_configuration = {
    execute_command_configuration = {
      logging = "OVERRIDE"
      log_configuration = {
        cloud_watch_log_group_name = "aws/ecs/aws-ec2/COMPANY_NAME-${local.project_name}"
      }
    }
  }

  default_capacity_provider_use_fargate = false

  task_exec_secret_arns = [
    .......... protected ..............
  ]

  autoscaling_capacity_providers = {
    rit-1-app = {
      auto_scaling_group_arn         = module.autoscaling-apps[0].autoscaling_group_arn
      managed_termination_protection = "DISABLED"
      managed_scaling = {
        maximum_scaling_step_size = 2
        minimum_scaling_step_size = 1
        status                    = "ENABLED"
        target_capacity           = 70
      }
    }
  }

  services = {
    (local.apps.pdf-printer-prod.name) = {
      subnet_ids               = data.terraform_remote_state.vpc.outputs.vpc-config.private_subnets
      requires_compatibilities = ["EC2"]
      cpu                      = 512
      memory                   = 512
      create_security_group    = true

      security_group_rules = {
        alb_ingress = {
          type                     = "ingress"
          from_port                = local.apps.pdf-printer-prod.container_port
          to_port                  = local.apps.pdf-printer-prod.container_port
          protocol                 = "tcp"
          description              = "Service port"
          source_security_group_id = aws_security_group.alb_sg[0].id
        }
        egress_all = {
          type        = "egress"
          from_port   = 0
          to_port     = 0
          protocol    = "-1"
          cidr_blocks = ["0.0.0.0/0"]
        }
      }

      capacity_provider_strategy = {
        rit-1-app = {
          capacity_provider = module.ecs[0].autoscaling_capacity_providers["rit-1-app"].name
          base              = 1
          weight            = 1
        }
      }

      load_balancer = {
        service = {
          target_group_arn = aws_lb_target_group.alb_target_group[local.apps.pdf-printer-prod.name].arn
          container_name   = local.apps.pdf-printer-prod.name
          container_port   = local.apps.pdf-printer-prod.container_port
        }
      }

      task_exec_iam_statements = [
        {
          actions   = ["logs:CreateLogGroup"]
          effect    = "Allow"
          resources = ["*"]
          sid       = "CreateLogGroup"
        },
      ]

      container_definitions = {
        (local.apps.pdf-printer-prod.name) = {
          cpu                = 512
          memory             = 512
          memory_reservation = 100
          essential          = true
          image              = local.apps.pdf-printer-prod.image
          port_mappings = [
            {
              name          = local.apps.pdf-printer-prod.name
              containerPort = local.apps.pdf-printer-prod.container_port
              protocol      = "tcp"
            }
          ]
          readonly_root_filesystem  = false
          enable_cloudwatch_logging = true
          log_configuration = {
            logDriver = "awslogs"
            options = {
              awslogs-create-group  = "true"
              awslogs-group         = "/aws/ecs/${local.apps.pdf-printer-prod.name}/logs"
              awslogs-region        = local.DEFAULT_AWS_REGION
              awslogs-stream-prefix = "api"
            }
          }
        }
      }
    }
  }
}
```
One more file, autoscaling.tf:
```hcl
module "autoscaling-apps" {
  count   = var.tags.Environment == "prod" ? 1 : 0
  source  = "terraform-aws-modules/autoscaling/aws"
  version = "7.3.1"

  name          = "${local.project_name}-autoscaling-apps-instances"
  image_id      = jsondecode(data.aws_ssm_parameter.ecs_optimized_ami.value)["image_id"]
  instance_type = local.apps_instance_type

  user_data = base64encode(
    <<-EOT
      #!/bin/bash
      cat <<'EOF' >> /etc/ecs/ecs.config
      ECS_CLUSTER=${local.ECS_CLUSTER_NAME}
      ECS_LOGLEVEL=debug
      ECS_CONTAINER_INSTANCE_TAGS=${jsonencode(local.tags)}
      ECS_ENABLE_TASK_IAM_ROLE=true
      EOF
    EOT
  )

  security_groups = [module.autoscaling_sg[0].security_group_id]

  create_iam_instance_profile = true
  iam_role_name               = local.project_name
  iam_role_description        = "IAM role for ${local.project_name} - autoscaling"
  iam_role_policies = {
    AmazonEC2ContainerServiceforEC2Role = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
    AmazonSSMManagedInstanceCore        = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
  }

  metadata_options = {
    http_endpoint               = "enabled"
    http_tokens                 = "required"
    http_put_response_hop_limit = 1
  }

  vpc_zone_identifier = data.terraform_remote_state.vpc.outputs.vpc-config.private_subnets
  health_check_type   = "EC2"
  min_size            = 3
  max_size            = 8
  desired_capacity    = 8

  protect_from_scale_in = false

  autoscaling_group_tags = {
    AmazonECSManaged = true
  }

  use_mixed_instances_policy = false

  enabled_metrics = [
    "GroupAndWarmPoolDesiredCapacity",
    "GroupAndWarmPoolTotalCapacity",
    "GroupDesiredCapacity",
    "GroupInServiceCapacity",
    "GroupInServiceInstances",
    "GroupMaxSize",
    "GroupMinSize",
    "GroupPendingCapacity",
    "GroupPendingInstances",
    "GroupStandbyCapacity",
    "GroupStandbyInstances",
    "GroupTerminatingCapacity",
    "GroupTerminatingInstances",
    "GroupTotalCapacity",
    "GroupTotalInstances",
    "WarmPoolDesiredCapacity",
    "WarmPoolMinSize",
    "WarmPoolPendingCapacity",
    "WarmPoolTerminatingCapacity",
    "WarmPoolTotalCapacity",
    "WarmPoolWarmedCapacity",
  ]

  tags = local.tags
}
```
Example:
And a global view of the infrastructure:
As you can see, there are a lot of unused resources that could be allocated to tasks on some of the other instances, but there is a limit of 2 running tasks per instance.
Is your task using awsvpc as the network mode? If so, it will be creating an elastic network interface (ENI) per task, and there's a limit per instance.
There are 3 options that I know of:
- Network mode `awsvpc` - the max is two tasks for an instance with 2 available ENIs, like a c7i.large (each task consumes one ENI, and the instance's primary ENI also counts against its limit).
- Network mode `awsvpc` with ENI trunking enabled - an instance type that takes two tasks can then take 10 or more. It differs from one instance type to another based on its networking capabilities. Use: `aws ecs put-account-setting --name awsvpcTrunking --value enabled --principal-arn arn:aws:iam::999999999:role/ecsInstanceRole --region us-west-1`
- Bridge mode, the default for ECS on EC2, which takes a large number of tasks using dynamic port mapping. You will need to open up the necessary port ranges in the security group (load-balancer health-check failures in this setup are usually due to not opening up those ports).

My suggestion: use bridge mode. When you have strict, fine-tuned security requirements that you cannot meet with bridge mode, you can then reconfigure to use trunking.
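To see why the cap lands at exactly two, the awsvpc arithmetic can be sketched as follows. The ENI counts and the trunking limit below are illustrative assumptions; verify them against the EC2 "maximum network interfaces" table and the ECS ENI-trunking table for your exact instance type:

```python
# Sketch of the per-instance task limit under awsvpc network mode.
# ENI limits here are assumed values for illustration -- check the
# EC2 documentation for your instance type before relying on them.
eni_limit = {"t3a.medium": 3, "m6a.large": 3}

def max_awsvpc_tasks(instance_type, trunking_task_limit=None):
    """Without trunking, one ENI is reserved for the instance itself,
    so only (eni_limit - 1) ENIs remain -- one per task. With ENI
    trunking enabled, the limit instead comes from ECS's per-type
    trunking table (passed in here as an assumed value)."""
    if trunking_task_limit is not None:
        return trunking_task_limit
    return eni_limit[instance_type] - 1

print(max_awsvpc_tasks("t3a.medium"))                          # 2
print(max_awsvpc_tasks("t3a.medium", trunking_task_limit=10))  # 10
```

Under these assumptions, a 3-ENI instance such as a t3a.medium tops out at two awsvpc tasks regardless of how much CPU and memory is left over, which matches the behavior reported above.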
This issue has been automatically marked as stale because it has been open for 30 days with no activity. Remove the stale label or comment, or this issue will be closed in 10 days.
This issue was automatically closed because it remained stale for 10 days.
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.