AWS ECS agent does not start on EC2 instances
Summary
AWS ECS agent does not start on EC2 instances
Description
There seems to be an issue with the ECS agent on my ECS cluster. For the past two weeks, the cluster, with EC2 instances managed by an Auto Scaling group (launch templates) and a capacity provider, has been working fine. Now, however, new instances are not joining the cluster because the agent no longer starts.
Even when I try to start the ECS agent manually on the instance, the start command hangs.
The Docker service is running properly, and the correct ECS instance role is attached to the instance. There are no agent logs on the instance.
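For context, this is roughly the manual check I'm running (a sketch only; the unit name and commands are the defaults on the ECS-optimized Amazon Linux 2 AMI):

# Manually start the agent; this is the command that hangs
sudo systemctl start ecs

# Sanity-check Docker and look for an ecs-agent container
sudo systemctl status docker
sudo docker ps --all --filter name=ecs-agent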
The AMI I'm using is "amzn2-ami-ecs-hvm-2.0.20240319-x86_64-ebs" (ID "ami-06ebbcdf40f9949e7"). I have already tried newer AMI versions and hit the same issue.
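For anyone trying to reproduce with the same image, the AMI IDs can be resolved roughly like this (the SSM parameter paths are the ones AWS documents for the ECS-optimized Amazon Linux 2 AMI, and us-west-2 is taken from the instance hostname below; verify both before relying on this):

# Current recommended ECS-optimized Amazon Linux 2 AMI for the region
aws ssm get-parameters \
  --names /aws/service/ecs/optimized-ami/amazon-linux-2/recommended/image_id \
  --region us-west-2 --query 'Parameters[0].Value' --output text

# Pin the specific release mentioned above instead of "recommended"
aws ssm get-parameters \
  --names /aws/service/ecs/optimized-ami/amazon-linux-2/amzn2-ami-ecs-hvm-2.0.20240319-x86_64-ebs/image_id \
  --region us-west-2 --query 'Parameters[0].Value' --output text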
Here's the ECS service status on a freshly launched EC2 instance:
ecs.service - ECS Agent
Loaded: loaded (/usr/lib/systemd/system/ecs.service; enabled; vendor preset: disabled)
Active: inactive (dead)
Expected Behavior
The ECS agent service starts automatically when the instance launches.
Environment Details
- docker info:
Client:
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc., v0.0.0+unknown)
Server:
Containers: 2
Running: 2
Paused: 0
Stopped: 0
Images: 5
Server Version: 20.10.25
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 64b8a811b07ba6288238eefc14d898ee0b5b99ba
runc version: 4bccb38cc9cf198d52bebf2b3a90cd14e7af8c06
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 4.14.336-257.566.amzn2.x86_64
Operating System: Amazon Linux 2
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 14.91GiB
Name: ip-10-0-4-66.us-west-2.compute.internal
ID: Q2XT:HGAQ:XXEK:T7OZ:7C2Y:WJYW:44JQ:OWYE:VUG5:FSOB:QBAV:MPPK
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
- curl http://localhost:51678/v1/metadata
"Version":"Amazon ECS Agent - v1.82.1
Supporting Log Snippets
- journalctl -xeu ecs.service
---no entries---
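For reference, the locations where the agent and ecs-init normally write logs, all of which appear empty here (paths are the defaults on the ECS-optimized Amazon Linux 2 AMI; treat them as an assumption if your setup differs):

# Agent and ecs-init logs normally land under /var/log/ecs/
ls -la /var/log/ecs/
sudo tail -n 50 /var/log/ecs/ecs-init.log
sudo tail -n 50 /var/log/ecs/ecs-agent.log*

# First-boot / user-data output, in case ecs-init failed during provisioning
sudo tail -n 50 /var/log/cloud-init-output.log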
Hello Thiago,
Has anything changed since you posted this last week? Unfortunately, we can't diagnose the issue without any logs. For further investigation, we suggest running amazon-ecs-logs-collector on an affected instance and sharing the collected logs with us. As a temporary mitigation, we suggest trying a few older AMI versions, since the newer ones do not work based on your previous attempts.
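In case it saves a step, the collector is typically downloaded and run on the container instance like this (taken from the awslabs/ecs-logs-collector README at the time of writing; please double-check the current instructions):

# Download and run the ECS logs collector as root on the affected instance
curl -O https://raw.githubusercontent.com/awslabs/ecs-logs-collector/master/ecs-logs-collector.sh
sudo bash ./ecs-logs-collector.sh

# The script writes a tarball of system, Docker, and ECS agent logs into the
# working directory; attach that archive to this issue.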
@hozkaya2000 thank you for your answer. I'll take a look at amazon-ecs-logs-collector.
The agent isn't starting now regardless of the AMI version, even though it was fine earlier. I have even tried an AMI version that previously worked, and it still fails.
Closing this due to lack of activity. @thiagoscodelerae please reopen if you are still facing issues and can provide us logs from your container instance.