Resolving cloudwatch endpoint for all regions
Summary
This PR makes the agent resolve the correct AWS Cloudwatch endpoint for all regions when tasks use awslogs as the log driver type. Previously, we ran into an issue where the docker version we currently rely on is unable to resolve the correct Cloudwatch endpoints for the new regions, and as an immediate remediation we had the agent resolve the endpoints for those regions. This behavior should be consistent across all regions.
Note: We will be relying on AWS SDK Go V1 to attempt to resolve the correct endpoint (similar to moby). This will need to be updated once we upgrade to AWS SDK Go V2.
Implementation details
- Removed the region-specific checks that gated obtaining the Cloudwatch endpoint
- Relying on awslogs-region to obtain the correct region of the Cloudwatch endpoint
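For reviewers, here is a minimal, dependency-free sketch of the idea behind the two points above. It is not the code in this PR: the PR relies on the AWS SDK Go V1 resolver, whereas dnsSuffixForRegion below is a hypothetical stand-in that only knows the standard and China DNS suffixes. The function names, and the assumption that the resolved URL is written into the awslogs-endpoint option, are illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Option keys understood by the docker awslogs log driver.
const (
	awslogsRegionKey   = "awslogs-region"
	awslogsEndpointKey = "awslogs-endpoint"
)

// dnsSuffixForRegion is a hypothetical, simplified stand-in for the SDK's
// partition lookup. It only covers the standard and China partitions.
func dnsSuffixForRegion(region string) string {
	if strings.HasPrefix(region, "cn-") {
		return "amazonaws.com.cn"
	}
	return "amazonaws.com"
}

// resolveLogsEndpoint builds a Cloudwatch Logs endpoint URL from a region,
// with no per-region allow list.
func resolveLogsEndpoint(region string) (string, error) {
	if region == "" {
		return "", errors.New("awslogs-region is not set")
	}
	return fmt.Sprintf("https://logs.%s.%s", region, dnsSuffixForRegion(region)), nil
}

// setAWSLogsEndpoint fills in awslogs-endpoint from awslogs-region unless
// an endpoint was already provided explicitly (assumed behavior).
func setAWSLogsEndpoint(opts map[string]string) error {
	if opts[awslogsEndpointKey] != "" {
		return nil
	}
	endpoint, err := resolveLogsEndpoint(opts[awslogsRegionKey])
	if err != nil {
		return err
	}
	opts[awslogsEndpointKey] = endpoint
	return nil
}

func main() {
	opts := map[string]string{
		"awslogs-group":  "test-log-group",
		awslogsRegionKey: "ca-central-1",
	}
	if err := setAWSLogsEndpoint(opts); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(opts[awslogsEndpointKey])
}
```

The point of the sketch is that the region comes from the task's log configuration rather than from the region the agent runs in, which is what allows the cross-region write exercised in the manual test.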
Testing
Manual testing
Used the following task definition:
{
  "family": "test",
  "containerDefinitions": [
    {
      "name": "awslogs-test",
      "image": "busybox",
      "cpu": 256,
      "memory": 64,
      "portMappings": [],
      "essential": true,
      "command": [
        "sh",
        "-c",
        "echo hello world"
      ],
      "environment": [],
      "mountPoints": [],
      "volumesFrom": [],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "test-log-group",
          "awslogs-region": "ca-central-1",
          "awslogs-stream-prefix": "cw-test"
        }
      },
      "systemControls": []
    }
  ]
}
Task transitioned to running
level=debug time=2024-05-23T18:38:03Z msg="Transitioned container" task="78959252ccc04cdd88aec25340bb824d" container="awslogs-test" runtimeID="528be3477391117131c42722d6089a0430852f5177fad30a2e949907e2bf51be" nextState="RUNNING" error=<nil>
level=debug time=2024-05-23T18:38:03Z msg="Received non-transition events" task="78959252ccc04cdd88aec25340bb824d"
level=debug time=2024-05-23T18:38:03Z msg="Updating task's known status" task="78959252ccc04cdd88aec25340bb824d"
level=debug time=2024-05-23T18:38:03Z msg="Found container with earliest known status" knownStatus=RUNNING desiredStatus=RUNNING task="78959252ccc04cdd88aec25340bb824d" container="awslogs-test"
level=debug time=2024-05-23T18:38:03Z msg="Updating task's desired status" taskFamily="test" taskVersion="1" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/78959252ccc04cdd88aec25340bb824d" taskKnownStatus="RUNNING" taskDesiredStatus="RUNNING" nContainers=1 nENIs=0
Task container exited gracefully
[ec2-user@ip-172-31-46-255 amazon-ecs-agent]$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
528be3477391 busybox "sh -c 'echo hello w…" 15 seconds ago Exited (0) 14 seconds ago ecs-test-1-awslogs-test-82d4d4a0989bc08b1400
Task running in us-west-2 was able to successfully write to the Cloudwatch log group in ca-central-1
2024-05-23T18:38:03.408Z hello world
New tests cover the changes: Yes
Description for the changelog
enhancement: Resolving Cloudwatch endpoint in all regions
Does this PR include breaking model changes? If so, have you added transformation functions?
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.