
[ECR] [request]: pull through cache repositories for dockerhub and gcr

Open runningman84 opened this issue 2 years ago • 18 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
Please add support for dockerhub and gcr registries in this new feature: https://aws.amazon.com/de/blogs/aws/announcing-pull-through-cache-repositories-for-amazon-elastic-container-registry/

Which service(s) is this request for?
ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Most images are stored in docker hub and users are getting rate limit errors all the time...

Are you currently working around this issue?
Syncing these docker hub images to gcr and ecr

runningman84 avatar Nov 30 '21 09:11 runningman84

hi @runningman84 I assume this is just for public, unauthenticated access? I've created another issue here for authenticated access to registries. This includes registries that require authentication to receive higher pull limits https://github.com/aws/containers-roadmap/issues/1584

srrengar avatar Nov 30 '21 19:11 srrengar

Yes I am talking about public unauthenticated access...

runningman84 avatar Dec 01 '21 10:12 runningman84

For any upstream registry that limits or throttles unauthenticated pulls per hour by IP address, the issue is that we (the ECR team) are making those pulls from our service account and not from the customer's VPC. So what would end up happening with such upstream registries is that the pulls from the upstream registry to each customer's private ECR registry would end up severely throttled for all customers within that region, effectively disabling the pull-through-cache capability. The better solution for such registries is for ECR to pull from the upstream registry on the customer's behalf using the customer's credentials with the upstream registry. While there will still be throttling in this case, the throttling will be based on the customer's credentials with the upstream registry and won't affect other customers using the same registry with their own separate credentials or account. See https://github.com/aws/containers-roadmap/issues/1584.
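
To make the throttling argument concrete, here is a minimal sketch of a quota that is counted per identity. The quota of 3 pulls, the customer names, and the "ecr-service" key are all made up for illustration; real upstream limits and accounting differ.

```python
from collections import Counter

def simulate(pulls, limit, key_of):
    """Count successful pulls when the upstream allows `limit` pulls per key.

    pulls  : list of customer names, one entry per attempted pull
    limit  : hypothetical per-key quota for a single time window
    key_of : maps a customer to the identity the upstream throttles on
    """
    used = Counter()
    succeeded = Counter()
    for customer in pulls:
        key = key_of(customer)
        if used[key] < limit:
            used[key] += 1
            succeeded[customer] += 1
    return dict(succeeded)

pulls = ["a", "a", "a", "b", "b", "b"]

# All pulls appear to come from one shared identity (hypothetical name).
shared = simulate(pulls, limit=3, key_of=lambda c: "ecr-service")

# Each customer is throttled on their own upstream credentials.
per_customer = simulate(pulls, limit=3, key_of=lambda c: c)
```

With the shared identity, customer "a" uses up the whole quota and customer "b" gets nothing. With per-customer credentials, each customer only consumes their own quota.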

coultn avatar Dec 01 '21 17:12 coultn

Or, y'know, Amazon could pay Docker for a Docker hub account commensurate with their volume of requests and ability to pay...

paulgear avatar Dec 02 '21 23:12 paulgear

@coultn I can see your point. I've requested that ECR should add (authenticated) pull-through support for GitHub Container Registry (ghcr.io) over at #1584 but we also need pull-through from ghcr.io for ostensibly public images, such as KEDA releases.

What is the most effective way to request pull-through support for new container registries?

joebowbeer avatar Feb 22 '22 20:02 joebowbeer

Thanks for the input @joebowbeer . We actually pay pretty close attention to the feedback we get here. We do have some things planned on our roadmap based on the feedback we've heard and we will share more plans here when we have a bit more certainty.

coultn avatar Feb 22 '22 20:02 coultn

@coultn do you also consider pull-through for private ECR registries as an upstream? The use case I have in mind: several deployments across multiple AWS regions, with the main ECR repo in us-west-2. I don't want to do a full replication into eu-central-1, but I would love to have a pull-through cache for requested images.

mwos-sl avatar Mar 23 '22 22:03 mwos-sl

@coultn do you also consider other repositories as well? e.g. mcr.microsoft.com?
The idea is to have control over these images, e.g. scan for security issues in an automated way, and also to only allow certain images to be downloaded in the internal build processes. To work around this, we would have to build something to first fetch images from outside in a controlled way, and then store them in our private ECR.

zenzs avatar Oct 20 '22 07:10 zenzs

There is a workaround you can apply until AWS adds the option for Docker Hub. Pull through caches currently only support ECR Public, Quay and Kubernetes. However there are a lot of images that are only kept up to date on Docker Hub.

There are two solutions. If you find an image on ECR Public that is prefixed with docker/library, you can use it, and it will be kept in sync with Docker Hub. For a lot of popular images this should be the case.

For example Busybox, both are at 1.36:

  • ECR: https://gallery.ecr.aws/docker/library/busybox
  • Docker Hub: https://hub.docker.com/_/busybox
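
As a rough sketch of the first option, the helper below rewrites an official Docker Hub image reference to its ECR Public counterpart under public.ecr.aws/docker/library. The function name is hypothetical, and it assumes the image is actually mirrored there, which you should check in the gallery first.

```python
def ecr_public_mirror(image: str) -> str:
    """Map an official Docker Hub image to its ECR Public docker/library path.

    Hypothetical helper: it only rewrites the name and does not check
    that the mirror exists on ECR Public.
    """
    name = image
    # Strip the optional registry host and the implicit "library/" namespace.
    for prefix in ("docker.io/", "library/"):
        if name.startswith(prefix):
            name = name[len(prefix):]
    # Anything still containing a namespace is not an official image.
    if "/" in name.split(":")[0]:
        raise ValueError(f"not an official Docker Hub image: {image}")
    return f"public.ecr.aws/docker/library/{name}"

busybox = ecr_public_mirror("busybox:1.36")
```

Images published under a user or organization namespace are rejected, since the docker/library prefix only covers official images.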

You will run into problems when you have images that only exist on Docker Hub, for instance Cloudflare Agent, it is on version 1449-569a7c3c9ed0 on Docker Hub:

  • ECR only has six images from unofficial users, so you are not sure what you are pulling, also they are all older versions
  • Docker Hub: https://hub.docker.com/r/cloudflare/cloudflared

I made a Terraform module that periodically copies Docker Hub images to your own private ECR registry. It uses your own Docker Hub username and token because the rate limits for unauthenticated users are IP based and these IPs might be shared by different AWS services.

You can find more information about this Terraform module here.

PS - Pulling images from Docker Hub is also slow because it goes over the internet (so most likely through your NAT gateway). Pulling images from ECR in AWS (through a VPC endpoint) is much faster than pulling from Docker Hub, so you might also want to consider it for performance reasons.

alexjeen avatar Oct 05 '23 15:10 alexjeen

Is there any remote chance of this being available in 2023/24? This is the most problematic issue when trying to migrate from private docker-registry/harbor to Amazon managed ECR

nightswimmings avatar Nov 15 '23 10:11 nightswimmings

This is released. https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-ecr-pull-through-cache-additional-upstream-registries/

azN2 avatar Nov 18 '23 15:11 azN2

Thanks @azN2! I mean for dockerhub

nightswimmings avatar Nov 22 '23 14:11 nightswimmings

@nightswimmings

Thanks @azN2! I mean for dockerhub

https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-ecr-pull-through-cache-additional-upstream-registries/

Amazon Elastic Container Registry (ECR) now includes Docker Hub, Azure Container Registry, and GitHub Container Registry as supported upstream registries for ECR’s pull through cache feature.
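
For anyone wiring this up, here is a minimal sketch of what a Docker Hub rule and the resulting image URI could look like. The account ID, region, prefix, and secret ARN are placeholders, and the helper names are hypothetical. The dictionary keys follow the parameter names of the ECR `CreatePullThroughCacheRule` API as I understand them, so check the current AWS documentation before relying on this.

```python
def docker_hub_rule(prefix: str, secret_arn: str) -> dict:
    """Build keyword arguments for an ECR pull through cache rule.

    Assumption: Docker Hub as an upstream needs credentials stored in
    AWS Secrets Manager, referenced here by `secret_arn`.
    """
    return {
        "ecrRepositoryPrefix": prefix,
        "upstreamRegistryUrl": "registry-1.docker.io",
        "credentialArn": secret_arn,
    }

def cached_image_uri(account_id: str, region: str, prefix: str, image: str) -> str:
    """Return the private ECR URI to pull an upstream image through the cache."""
    repo = image
    # Official Docker Hub images live under the "library/" namespace.
    if "/" not in repo.split(":")[0]:
        repo = f"library/{repo}"
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{prefix}/{repo}"

# Placeholder values for illustration only.
rule = docker_hub_rule(
    "docker-hub",
    "arn:aws:secretsmanager:eu-central-1:123456789012:secret:ecr-pullthroughcache/example",
)
uri = cached_image_uri("123456789012", "eu-central-1", "docker-hub", "busybox:1.36")
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("ecr").create_pull_through_cache_rule(**rule)`. That call is not executed here.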

brokenthumbs avatar Nov 22 '23 14:11 brokenthumbs

Oh my oh my, what a coincidence... I must be the luckiest dev online. Sorry for not understanding you at first. It is unbelievable that the feature was deployed this week, right after my inquiry, given that the ticket is 2 years old. Amazing!

nightswimmings avatar Nov 22 '23 14:11 nightswimmings

It is great to see my feature request implemented!

Support for unauthenticated GitHub image access would also be great. We store a few public images in GitHub and we might not want to use authentication here.

runningman84 avatar Nov 22 '23 16:11 runningman84

Awesome, this is great! Is there any ETA for when GCR will be supported?

entscheidungsproblem avatar Jan 03 '24 17:01 entscheidungsproblem

Awesome feature! Please consider support for nvcr.io from Nvidia. NVCR images are used in popular charts such as https://github.com/NVIDIA/k8s-device-plugin, https://github.com/NVIDIA/dcgm-exporter and others.

vtatarin avatar Jan 23 '24 20:01 vtatarin