
[ECS] [request]: Stop UNHEALTHY ECS Task if container health check fails during ECS Task start up

Open aws-patrickc opened this issue 2 weeks ago • 0 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request What do you want us to build? Please add a feature to ECS that stops an UNHEALTHY ECS Task if a container health check fails during task start up.

Which service(s) is this request for? ECS/Fargate

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? There is a use case where an ECS Task has multiple containers, such as an init container and three application containers. Suppose each application container has a dependsOn attribute that requires the previous application container to be HEALTHY, but the previous application container is UNHEALTHY and never transitions to HEALTHY, so not all application containers in the task can start. In this scenario the ECS Task stays stuck in PENDING status until the ECS Scheduler stops it after three hours. This long wait before the task is stopped doesn't make sense, because a failed container health check should ultimately stop or fail the task.
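The scenario above can be sketched roughly as follows. This is only an illustration: the container names, images, and health check commands are hypothetical, and `blocked_containers` is a made-up helper that models just the `HEALTHY` dependency condition, not the actual ECS agent logic. Only the `dependsOn` and `healthCheck` field names follow the ECS task definition format.

```python
# Hypothetical task layout: one init container and three application containers,
# each application container waiting for the previous one to become HEALTHY.
CONTAINER_DEFINITIONS = [
    {"name": "init", "image": "example/init", "essential": False},
    {
        "name": "app1",
        "image": "example/app1",
        "essential": True,
        # A health check that always fails, so app1 never becomes HEALTHY.
        "healthCheck": {"command": ["CMD-SHELL", "exit 1"], "interval": 30, "retries": 3},
        "dependsOn": [{"containerName": "init", "condition": "SUCCESS"}],
    },
    {
        "name": "app2",
        "image": "example/app2",
        "essential": True,
        "healthCheck": {"command": ["CMD-SHELL", "exit 0"], "interval": 30, "retries": 3},
        "dependsOn": [{"containerName": "app1", "condition": "HEALTHY"}],
    },
    {
        "name": "app3",
        "image": "example/app3",
        "essential": True,
        "healthCheck": {"command": ["CMD-SHELL", "exit 0"], "interval": 30, "retries": 3},
        "dependsOn": [{"containerName": "app2", "condition": "HEALTHY"}],
    },
]


def blocked_containers(definitions, health):
    """Return names of containers that cannot start yet.

    A container is blocked when any of its HEALTHY dependencies is not
    currently HEALTHY. Containers missing from `health` count as UNKNOWN.
    Other dependency conditions (such as SUCCESS) are ignored in this sketch.
    """
    blocked = []
    for definition in definitions:
        for dep in definition.get("dependsOn", []):
            status = health.get(dep["containerName"], "UNKNOWN")
            if dep["condition"] == "HEALTHY" and status != "HEALTHY":
                blocked.append(definition["name"])
                break
    return blocked


# app1 is UNHEALTHY, so app2 cannot start; app2 stays UNKNOWN, so app3 cannot start either.
print(blocked_containers(CONTAINER_DEFINITIONS, {"app1": "UNHEALTHY"}))
```

As long as app1 stays UNHEALTHY, the list of blocked containers never shrinks, which is the state in which the task remains PENDING.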

Below is the example from the ECS Docs. It should hold true whether the ECS Task is running (RUNNING status) or starting (PENDING status), and the task should be stopped if it is UNHEALTHY:

Consider the following task health example with 3 containers.
- If Container1 is UNHEALTHY and Container2 is UNKNOWN, and Container3 is UNKNOWN, the task health is UNHEALTHY.
- If Container1 is UNHEALTHY and Container2 is UNKNOWN, and Container3 is HEALTHY, the task health is UNHEALTHY.
- If Container1 is UNHEALTHY and Container2 is HEALTHY, and Container3 is HEALTHY, the task health is UNHEALTHY.
- If Container1 is HEALTHY and Container2 is UNKNOWN, and Container3 is HEALTHY, the task health is UNKNOWN.
- If Container1 is HEALTHY and Container2 is UNKNOWN, and Container3 is UNKNOWN, the task health is UNKNOWN.
- If Container1 is HEALTHY and Container2 is HEALTHY, and Container3 is HEALTHY, the task health is HEALTHY.
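The six cases above are consistent with a simple precedence rule: any UNHEALTHY container makes the task UNHEALTHY, otherwise any UNKNOWN container makes it UNKNOWN, otherwise the task is HEALTHY. A minimal sketch of that rule, assuming every listed container counts toward task health (`task_health` is a hypothetical helper, not an ECS API):

```python
def task_health(container_statuses):
    """Aggregate container health statuses into a task health status.

    Precedence inferred from the examples: UNHEALTHY > UNKNOWN > HEALTHY.
    Returning UNKNOWN for an empty input is an assumption of this sketch.
    """
    statuses = list(container_statuses)
    if not statuses:
        return "UNKNOWN"
    if "UNHEALTHY" in statuses:
        return "UNHEALTHY"
    if "UNKNOWN" in statuses:
        return "UNKNOWN"
    return "HEALTHY"


# The first and last cases from the list above.
print(task_health(["UNHEALTHY", "UNKNOWN", "UNKNOWN"]))
print(task_health(["HEALTHY", "HEALTHY", "HEALTHY"]))
```

Under this rule, the stuck task from the scenario (one UNHEALTHY container, the rest never started and therefore UNKNOWN) already evaluates to UNHEALTHY while it is still PENDING.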

Are you currently working around this issue? No. We wait for ECS to stop the task because it has been in PENDING status too long.

Additional context Anything else we should know? AWS Support confirmed that an ECS Task is stopped because of an UNHEALTHY container health check only when the task is RUNNING. For PENDING tasks no action is taken.

Attachments If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

aws-patrickc, Jun 19 '24 13:06