containers-roadmap
[ECR] [request]: pull through cache repositories for dockerhub and gcr
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Tell us about your request Please add support for dockerhub and gcr registries in this new feature: https://aws.amazon.com/de/blogs/aws/announcing-pull-through-cache-repositories-for-amazon-elastic-container-registry/
Which service(s) is this request for? ECR
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? Most images are stored in docker hub and users are getting rate limit errors all the time...
Are you currently working around this issue? Syncing these docker hub images to gcr and ecr
hi @runningman84 I assume this is just for public, unauthenticated access? I've created another issue here for authenticated access to registries. This includes registries that require authentication to receive higher pull limits https://github.com/aws/containers-roadmap/issues/1584
Yes I am talking about public unauthenticated access...
For any upstream registry that limits or throttles unauthenticated pulls per hour by IP address, the issue is that we (the ECR team) are making those pulls from our service account and not from the customer's VPC. So what would end up happening with such upstream registries is that the pulls from the upstream registry to each customer's private ECR registry would end up severely throttled for all customers within that region, effectively disabling the pull-through-cache capability. The better solution for such registries is for ECR to pull from the upstream registry on the customer's behalf using the customer's credentials with the upstream registry. While there will still be throttling in this case, the throttling will be based on the customer's credentials with the upstream registry and won't affect other customers using the same registry with their own separate credentials or account. See https://github.com/aws/containers-roadmap/issues/1584.
Or, y'know, Amazon could pay Docker for a Docker hub account commensurate with their volume of requests and ability to pay...
@coultn I can see your point. I've requested that ECR should add (authenticated) pull-through support for GitHub Container Registry (ghcr.io) over at #1584 but we also need pull-through from ghcr.io for ostensibly public images, such as KEDA releases.
What is the most effective way to request pull-through support for new container registries?
Thanks for the input @joebowbeer . We actually pay pretty close attention to the feedback we get here. We do have some things planned on our roadmap based on the feedback we've heard and we will share more plans here when we have a bit more certainty.
@coultn do you also consider pull-through for private ECR registries as an upstream? My use case: several deployments across multiple AWS regions, with the main ECR repo in us-west-2. I don't want to do a full replication into eu-central-1, but I would love to have a pull-through cache for requested images.
@coultn do you also consider other repositories as well? e.g. mcr.microsoft.com?
The idea is to have control over these images, e.g. to scan them for security issues in an automated way.
And also to only allow certain images to be downloaded in the internal build processes.
To work around this, we would have to build something to first fetch it from outside in a controlled way, and then store it to our private ecr.
There is a workaround you can apply until AWS adds the option for Docker Hub. Pull through caches currently only support ECR Public, Quay and Kubernetes. However, there are a lot of images that are only kept up to date on Docker Hub.
There are two solutions. If you find an image on ECR Public that is prefixed with docker/library, you can use it and it will be kept in sync with Docker Hub. This should be the case for a lot of popular images.
For example, Busybox is at 1.36 in both places:
- ECR: https://gallery.ecr.aws/docker/library/busybox
- Docker Hub: https://hub.docker.com/_/busybox
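The docker/library mapping above can be sketched as a small helper. This is only an illustration: the function name `ecr_public_mirror` is hypothetical, and it assumes the image really is mirrored on ECR Public under `public.ecr.aws/docker/library/`, which you should verify in the gallery before relying on it.

```python
def ecr_public_mirror(image: str, tag: str = "latest") -> str:
    """Build the ECR Public reference for a Docker Hub official image.

    Assumption: only official images (no namespace, e.g. "busybox")
    are mirrored under the docker/library prefix.
    """
    if "/" in image:
        # Namespaced images such as "cloudflare/cloudflared" are not
        # covered by the docker/library mirror.
        raise ValueError(f"{image!r} is not a Docker Hub official image")
    return f"public.ecr.aws/docker/library/{image}:{tag}"


print(ecr_public_mirror("busybox", "1.36"))
# prints: public.ecr.aws/docker/library/busybox:1.36
```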
You will run into problems with images that only exist on Docker Hub. For instance, Cloudflare Agent is on version 1449-569a7c3c9ed0 on Docker Hub:
- ECR: only has six images from unofficial users, so you cannot be sure what you are pulling, and they are all older versions
- Docker Hub: https://hub.docker.com/r/cloudflare/cloudflared
I made a Terraform module that periodically copies Docker Hub images to your own private ECR registry. It uses your own Docker Hub username and token because the rate limits for unauthenticated users are IP based and these IPs might be shared by different AWS services.
You can find more information about this Terraform module here.
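As a rough sketch of what such a copy amounts to (this is not the Terraform module's implementation), the mirror step is a pull, tag and push. The helper names, the account ID `123456789012` and the region are placeholders. It assumes `docker login` has already been run against both Docker Hub and your ECR registry, and that the target ECR repository exists.

```python
import subprocess


def mirror_commands(image: str, tag: str, account_id: str, region: str) -> list[list[str]]:
    """Return the docker CLI commands that copy one image into private ECR."""
    source = f"{image}:{tag}"
    target = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image}:{tag}"
    return [
        ["docker", "pull", source],          # fetch from Docker Hub
        ["docker", "tag", source, target],   # re-tag for the ECR registry
        ["docker", "push", target],          # upload to ECR
    ]


def mirror(image: str, tag: str, account_id: str, region: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    for cmd in mirror_commands(image, tag, account_id, region):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Dry run with placeholder values: only prints the three commands.
mirror("cloudflare/cloudflared", "latest", "123456789012", "us-west-2")
```

Running this on a schedule (cron, a CI job, or similar) gives the periodic sync described above.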
PS - Pulling images from Docker Hub is also slow because it goes over the internet (so most likely through your NAT gateway). Pulling images from ECR in AWS (through a VPC endpoint) is much faster than pulling from Docker Hub, so you might also want to consider it for performance reasons.
Is there any remote chance of this being available in 2023/24? This is the most problematic issue when trying to migrate from private docker-registry/harbor to Amazon managed ECR
This is released. https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-ecr-pull-through-cache-additional-upstream-registries/
Thanks @azN2! I mean for dockerhub
@nightswimmings
https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-ecr-pull-through-cache-additional-upstream-registries/
Amazon Elastic Container Registry (ECR) now includes Docker Hub, Azure Container Registry, and GitHub Container Registry as supported upstream registries for ECR’s pull through cache feature.
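For anyone updating image references after this release, here is a minimal sketch of how a Docker Hub image name maps to a pull through cache URI. It assumes the cache rule was created with the repository prefix `docker-hub` (the prefix is whatever you chose for your rule). The function name, account ID and region are placeholders.

```python
def pull_through_uri(image: str, tag: str, account_id: str, region: str,
                     prefix: str = "docker-hub") -> str:
    """Rewrite a Docker Hub image name to its ECR pull through cache URI.

    Assumption: `prefix` matches the repository prefix of your
    pull through cache rule for Docker Hub.
    """
    # Docker Hub official images live in the implicit "library" namespace,
    # which has to be spelled out when pulling through ECR.
    repo = image if "/" in image else f"library/{image}"
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{prefix}/{repo}:{tag}"


print(pull_through_uri("busybox", "1.36", "123456789012", "eu-central-1"))
# prints: 123456789012.dkr.ecr.eu-central-1.amazonaws.com/docker-hub/library/busybox:1.36
```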
Oh my oh my, what a coincidence... I must be the luckiest dev online. Sorry for not understanding you at first. It is unbelievable that the feature was deployed this week, right after my inquiry, given that the ticket is 2 years old. Amazing!
It is great to see my feature request implemented!
Support for unauthenticated GitHub image access would also be great. We store a few public images in GitHub and we might not want to use authentication here.
Awesome, this is great! Is there any ETA for when GCR will be supported?
Awesome feature! Please consider support for nvcr.io from Nvidia. NVCR images are used in popular charts such as https://github.com/NVIDIA/k8s-device-plugin, https://github.com/NVIDIA/dcgm-exporter and others.
I'd love to set up a pull through cache for mcr.microsoft.com aka Microsoft Artifact Registry.
Our CI pipelines could be faster and more reliable if we could pull these images from ECR:
- mcr.microsoft.com/dotnet/sdk
- mcr.microsoft.com/dotnet/aspnet
- mcr.microsoft.com/playwright
It would be great if this issue could be updated to include the Microsoft Artifact Registry, as the public ECR gallery is filled with images from unknown publishers and Bitnami. Bitnami is good, but they only cover the Debian-based releases, and there's no ability to cache the Windows containers.
Ironically, one of the Docker images that I am trying to pull and cache is used in the aws-ebs-csi-driver Helm chart (https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/charts/aws-ebs-csi-driver/values.yaml). Since this is an AWS Helm chart, perhaps the registry should be updated, or the image could be pulled from an AWS ECR repository?