[ECS] [request]: Task state change events on EventBridge not respecting DescribeTasks API

Open henriquesantanati opened this issue 2 years ago • 1 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request

From this documentation: Task state change events are delivered in the following format. The detail section below resembles the **Task object that is returned from a DescribeTasks API operation** in the Amazon Elastic Container Service API Reference. If your containers are using an image hosted with Amazon ECR, the imageDigest field is returned.

However, the JSON output does not include the healthStatus field. Below is the event JSON for a stopped task, with no healthStatus in it:

{
    "version": "0",
    "id": "ffa02779-64c6-efb2-5a9d-405ba9df5d81",
    "detail-type": "ECS Task State Change",
    "source": "aws.ecs",
    "account": "ACCOUNT_ID",
    "time": "2021-09-28T16:32:40Z",
    "region": "eu-west-1",
    "resources": [
        "arn:aws:ecs:eu-west-1:ACCOUNT_ID:task/CLUSTER_NAME/caac32d0520c404a97a73c25945d1c40"
    ],
    "detail": {
        "attachments": [
            {
                "id": "aae56f84-9ced-4625-91c9-81501becdfda",
                "type": "eni",
                "status": "DELETED",
                "details": [
                    {
                        "name": "subnetId",
                        "value": "SUBNET_ID"
                    },
                    {
                        "name": "networkInterfaceId",
                        "value": "ENI_ID"
                    },
                    {
                        "name": "macAddress",
                        "value": "0a:50:51:65:97:91"
                    },
                    {
                        "name": "privateDnsName",
                        "value": "ip-172-31-38-3.eu-west-1.compute.internal"
                    },
                    {
                        "name": "privateIPv4Address",
                        "value": "172.31.38.3"
                    }
                ]
            }
        ],
        "availabilityZone": "eu-west-1b",
        "clusterArn": "arn:aws:ecs:eu-west-1:ACCOUNT_ID:cluster/CLUSTER_NAME",
        "connectivity": "CONNECTED",
        "connectivityAt": "2021-09-28T16:31:06.086Z",
        "containers": [
            {
                "containerArn": "arn:aws:ecs:eu-west-1:ACCOUNT_ID:container/CLUSTER_NAME/caac32d0520c404a97a73c25945d1c40/ccb75d03-5f00-4780-bf4c-023d49ad9aa5",
                "exitCode": 0,
                "lastStatus": "STOPPED",
                "name": "nginx-container0",
                "image": "nginx",
                "runtimeId": "caac32d0520c404a97a73c25945d1c40-1727258831",
                "taskArn": "arn:aws:ecs:eu-west-1:ACCOUNT_ID:task/CLUSTER_NAME/caac32d0520c404a97a73c25945d1c40",
                "networkInterfaces": [
                    {
                        "attachmentId": "aae56f84-9ced-4625-91c9-81501becdfda",
                        "privateIpv4Address": "172.31.38.3"
                    }
                ],
                "cpu": "0",
                "memoryReservation": "128"
            }
        ],
        "cpu": "256",
        "createdAt": "2021-09-28T16:30:53.529Z",
        "desiredStatus": "STOPPED",
        "enableExecuteCommand": false,
        "ephemeralStorage": {
            "sizeInGiB": 20
        },
        "executionStoppedAt": "2021-09-28T16:32:17.244Z",
        "group": "service:hctest",
        "launchType": "FARGATE",
        "lastStatus": "STOPPED",
        "memory": "512",
        "overrides": {
            "containerOverrides": [
                {
                    "name": "nginx-container0"
                }
            ]
        },
        "platformVersion": "1.4.0",
        "pullStartedAt": "2021-09-28T16:31:17.464Z",
        "pullStoppedAt": "2021-09-28T16:31:22.978Z",
        "startedAt": "2021-09-28T16:31:23.981Z",
        "startedBy": "ecs-svc/4597613971827055402",
        "stoppingAt": "2021-09-28T16:32:04.723Z",
        "stoppedAt": "2021-09-28T16:32:40.003Z",
        "stoppedReason": "Task failed container health checks",
        "stopCode": "ServiceSchedulerInitiated",
        "taskArn": "arn:aws:ecs:eu-west-1:ACCOUNT_ID:task/CLUSTER_NAME/caac32d0520c404a97a73c25945d1c40",
        "taskDefinitionArn": "arn:aws:ecs:eu-west-1:ACCOUNT_ID:task-definition/nginx-fargate:10",
        "updatedAt": "2021-09-28T16:32:40.003Z",
        "version": 6
    }
}
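One quick way to confirm the gap is sketched below. The helper `missing_task_fields` is a hypothetical name, not part of any AWS SDK, and the field list contains only the two names raised in this thread; it is not the full DescribeTasks Task schema. The sample is a trimmed copy of the event above.

```python
import json

# Fields discussed in this issue; NOT the complete DescribeTasks Task schema.
EXPECTED_FIELDS = ("healthStatus", "tags")


def missing_task_fields(event, expected=EXPECTED_FIELDS):
    """Return the expected field names that are absent from event["detail"]."""
    detail = event.get("detail", {})
    return [name for name in expected if name not in detail]


# Trimmed copy of the event above (only a few keys kept for brevity).
sample_event = json.loads("""
{
    "detail-type": "ECS Task State Change",
    "source": "aws.ecs",
    "detail": {
        "lastStatus": "STOPPED",
        "stoppedReason": "Task failed container health checks",
        "stopCode": "ServiceSchedulerInitiated"
    }
}
""")

print(missing_task_fields(sample_event))  # prints ['healthStatus', 'tags']
```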

Which service(s) is this request for?

ECS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

I'd like to receive event notifications whenever my task is stopped due to an UNHEALTHY status.

Are you currently working around this issue?

I'd like to use the EventBridge configuration below:

{
    "detail": {
        "clusterArn": [ "arn:aws:ecs:eu-west-1:ACCOUNT_ID:cluster/CLUSTER_NAME"],
        "group": ["service:SERVICE_NAME"],
        "healthStatus": ["UNHEALTHY"],
        "lastStatus": ["STOPPED"]
    },
    "detail-type": ["ECS Task State Change"],
    "source": ["aws.ecs"]
}

If I remove the healthStatus filter from the EventBridge configuration, I receive notifications for all stopped tasks, not only the UNHEALTHY ones.
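Until healthStatus appears in the event, one possible workaround is to keep the broader rule and filter in the target. The sketch below assumes a Lambda-style handler that receives the event shown earlier. It also assumes that tasks stopped for failing health checks carry the stoppedReason text seen in that sample ("Task failed container health checks"). That string comes from a single observed event, so other wordings may exist and should be verified. The function names are hypothetical.

```python
# Assumption: this prefix is taken from the single sample event in this issue.
HEALTH_CHECK_REASON = "Task failed container health checks"


def stopped_for_failed_health_check(event):
    """Return True if the event looks like a task stopped by failed health checks."""
    detail = event.get("detail", {})
    reason = detail.get("stoppedReason") or ""
    return (
        event.get("source") == "aws.ecs"
        and event.get("detail-type") == "ECS Task State Change"
        and detail.get("lastStatus") == "STOPPED"
        and reason.startswith(HEALTH_CHECK_REASON)
    )


def handler(event, context=None):
    """Lambda-style entry point: ignore every event except health-check stops."""
    if not stopped_for_failed_health_check(event):
        return None
    detail = event["detail"]
    # Replace this return value with the real notification logic (SNS, Slack, ...).
    return {"taskArn": detail.get("taskArn"), "reason": detail["stoppedReason"]}
```

Matching on free text is brittle compared with a structured field, which is part of why the request above asks for healthStatus in the event itself.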

Additional context

Either the documentation should be updated or the JSON event should respect the DescribeTasks API.

henriquesantanati, Sep 29 '21 08:09

I'm running into this too - this is very unintuitive. Please update the docs or change the JSON object.

prakashsanker, Jul 27 '22 10:07

There is another discrepancy: "tags" is also missing from the event object.

Is this going to be fixed, or is this a different schema?
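If the event schema stays as it is, the missing fields could in principle be fetched with a follow-up DescribeTasks call. Here is a rough sketch, assuming `ecs_client` is a boto3 ECS client (`boto3.client("ecs")`) and that the task is still describable when the call is made, since stopped tasks are only retained for a limited time. `enrich_task_event` is a hypothetical helper name.

```python
def enrich_task_event(event, ecs_client):
    """Look up healthStatus and tags for the task named in a state change event.

    ecs_client is assumed to be boto3.client("ecs"); it is passed in so the
    function can also be exercised with a stub.
    """
    detail = event["detail"]
    response = ecs_client.describe_tasks(
        cluster=detail["clusterArn"],
        tasks=[detail["taskArn"]],
        include=["TAGS"],  # tags are only returned when explicitly requested
    )
    tasks = response.get("tasks", [])
    if not tasks:
        # The task may no longer be describable (for example, stopped too long ago).
        return {"healthStatus": None, "tags": []}
    task = tasks[0]
    return {
        "healthStatus": task.get("healthStatus"),
        "tags": task.get("tags", []),
    }
```

This costs an extra API call per event and cannot be used in the EventBridge rule itself, so it only helps with filtering inside the target.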

ajgrowney, Jun 28 '23 18:06