amazon-ecs-agent
Use resolved digest for image pulls
Summary
This PR updates the image pull logic to use a resolved image manifest digest when one is available. Image manifest digests are resolved during the container's transition to the MANIFEST_PULLED state. The change ensures that the pulled image is the one the resolved digest points to.
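To illustrate what pulling by a resolved digest means, here is a minimal sketch of turning an image reference plus a manifest digest into a digest-pinned reference. The helper name `canonicalRef` is hypothetical and this is not the agent's code: it uses only string handling from the standard library, and it assumes the tag is dropped in favor of the digest, which may differ from how the PR builds its canonical reference.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// canonicalRef is a hypothetical, simplified helper for illustration only.
// It pins imageRef to digest by replacing any tag or existing digest.
func canonicalRef(imageRef, digest string) (string, error) {
	if imageRef == "" {
		return "", errors.New("empty image reference")
	}
	// A digest is expected to look like "<algorithm>:<hex>".
	algo, hex, ok := strings.Cut(digest, ":")
	if !ok || algo == "" || hex == "" {
		return "", fmt.Errorf("invalid digest %q", digest)
	}
	// Drop any digest already present in the reference.
	name, _, _ := strings.Cut(imageRef, "@")
	// Drop the tag. A colon after the last slash separates name and tag;
	// a colon before the last slash belongs to a registry port.
	if i := strings.LastIndex(name, ":"); i > strings.LastIndex(name, "/") {
		name = name[:i]
	}
	return name + "@" + digest, nil
}

func main() {
	digest := "sha256:" + strings.Repeat("a", 64) // placeholder digest
	ref, err := canonicalRef("localhost:5000/app:latest", digest)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ref)
}
```

Pulling with a reference of this form lets the container runtime reject content that does not match the digest, which is the guarantee the summary describes.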
Implementation details
- Add a new method `TagImage` to the `DockerClient` interface and its implementation. The method tags an image on the host. The implementation performs retries using a new `ConstantBackoff` strategy that backs off for the same duration every time. A constant backoff retry strategy is fine in this case as there is no external service involved.
- Add a new `ConstantBackoff` backoff strategy under the `ecs-agent` module. The strategy returns the same backoff duration regardless of how many times its `Duration` method is called.
- Update the `*DockerTaskEngine.pullAndUpdateContainerReference` method, which is used for pulling container images, so that it uses the container's `ImageDigest` field to prepare a canonical reference to the image to be pulled. The method tags the pulled image with `Container.Image` if a different image reference was used to pull the image, so that image caching and image cleanup continue to work as before.
- Add a new method `GetCanonicalRef` to the `agent/utils/reference` package that returns a canonical image reference given an image reference and a manifest digest.
- Test updates and new tests.
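The constant backoff idea can be sketched as follows. This is an illustrative stand-in, not the code added in this PR: the type `constantBackoff`, the helper `retryWithBackoff`, and the error `errTagFailed` are hypothetical names, and the `Duration`/`Reset` method pair is an assumption about the shape of the backoff interface.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTagFailed is a hypothetical error used to simulate a failed tag attempt.
var errTagFailed = errors.New("tag failed")

// constantBackoff returns the same delay on every call to Duration.
type constantBackoff struct {
	delay time.Duration
}

// Duration returns the fixed delay, no matter how often it is called.
func (b constantBackoff) Duration() time.Duration { return b.delay }

// Reset is a no-op because there is no state to reset.
func (b constantBackoff) Reset() {}

// retryWithBackoff calls fn until it succeeds or attempts are exhausted,
// sleeping for b.Duration() between attempts. fn always runs at least once.
func retryWithBackoff(attempts int, b constantBackoff, fn func() error) error {
	for i := 0; ; i++ {
		err := fn()
		if err == nil || i >= attempts-1 {
			return err
		}
		time.Sleep(b.Duration())
	}
}

func main() {
	calls := 0
	err := retryWithBackoff(3, constantBackoff{delay: 10 * time.Millisecond}, func() error {
		calls++
		if calls < 3 {
			return errTagFailed // simulate two transient failures
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

A fixed delay keeps the retry loop simple. Growing delays mainly help when a remote service needs time to recover, which does not apply to a local tagging operation.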
Testing
- Added a new integration test named `TestPullContainerWithAndWithoutDigestInteg` to check that `*DockerTaskEngine.pullContainer` can pull images for containers with and without an `ImageDigest` set.
- Added a new integration test named `TestPullContainerWithAndWithoutDigestConsistency` to check that `*DockerTaskEngine.pullContainer` pulls the same image with or without a digest set and the image can be inspected with the `container.Image` field in both cases.
In addition to the integration tests above, performed the following manual tests.
- Ran a variety of tasks with Agent configured to use `always` image pull behavior. Checked that all tasks ran as expected. Images were pulled using digests and tagged with the image reference in the task definition. Images were cleaned up without any issues.
- Ran a variety of tasks with Agent configured to use `once` and then `prefer-cached` image pull behavior. Checked that all tasks ran as expected. Cached images were used in both cases when found. Image cleanup worked as expected with `once` image pull behavior. Image pull is disabled when `prefer-cached` image pull behavior is used.
- Ran a simple task multiple times with an Agent built with changes in this PR and again with an Agent built against master branch. Both Agents were configured to use `always` image pull behavior to force image pulls. Measured the average task start times (`startedAt` - `createdAt`) and task pull times (`pullStoppedAt` - `pullStartedAt`) for both cases. Changes to resolve image manifest digest in https://github.com/aws/amazon-ecs-agent/pull/4152 caused an additional delay in task start times that ranged from 300ms (ECR) to 900ms (Dockerhub). With this PR, however, the image pulls are now slightly faster. Pull time for an image that's already available on the host is reduced from ~700ms (Dockerhub), ~250ms (public ECR), and ~130ms (private ECR) to ~260ms (Dockerhub), ~100ms (public ECR), and ~50ms (private ECR). Combined with changes to resolve image manifest digests (#4152), the overall average increase in task start time in my test environment is ~500ms (Dockerhub) and ~150ms (ECR).
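For reference, the two averages described above can be computed as in the sketch below. The `taskTimes` struct, the `at` helper, and the sample timestamps are hypothetical and exist only to make the arithmetic concrete. They are not the measurements reported in this PR.

```go
package main

import (
	"fmt"
	"time"
)

// taskTimes is a hypothetical record of the four task timestamps used above.
type taskTimes struct {
	CreatedAt, StartedAt, PullStartedAt, PullStoppedAt time.Time
}

// at returns a timestamp ms milliseconds after a fixed base, for sample data.
func at(ms int) time.Time {
	return time.Unix(0, 0).Add(time.Duration(ms) * time.Millisecond)
}

// averages returns the mean start time (StartedAt - CreatedAt) and the mean
// pull time (PullStoppedAt - PullStartedAt) across tasks.
func averages(tasks []taskTimes) (start, pull time.Duration) {
	if len(tasks) == 0 {
		return 0, 0
	}
	for _, t := range tasks {
		start += t.StartedAt.Sub(t.CreatedAt)
		pull += t.PullStoppedAt.Sub(t.PullStartedAt)
	}
	n := time.Duration(len(tasks))
	return start / n, pull / n
}

func main() {
	// Made-up sample runs, not data from the PR.
	tasks := []taskTimes{
		{CreatedAt: at(0), StartedAt: at(1000), PullStartedAt: at(100), PullStoppedAt: at(400)},
		{CreatedAt: at(0), StartedAt: at(2000), PullStartedAt: at(200), PullStoppedAt: at(700)},
	}
	start, pull := averages(tasks)
	fmt.Println("average start time:", start)
	fmt.Println("average pull time:", pull)
}
```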
New tests cover the changes: yes
Description for the changelog
Does this PR include breaking model changes? If so, Have you added transformation functions?
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.