
[ecr] [request]: cache layers between repositories

Open jespersoderlund opened this issue 6 years ago • 21 comments

Tell us about your request
Currently, ECR doesn't cache image layers between repositories. Combined with ECR's model of creating a repo per image, this leads to quite poor performance, especially when many images are built on a common base image.

Each image is a separate upload of the image's full volume, so it takes significantly longer to push and replicate. This becomes an issue in a globally distributed architecture where hundreds of services built on a common base image need to be replicated to remote regions.

In most other Docker registries the model is different, with a single repository serving multiple images.

Which service(s) is this request for?
ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
We want quick push and replication across regions for microservice images built on a common base image (which becomes layers in the service-specific images).

Are you currently working around this issue?
We are trying to parallelise requests as much as possible, but this causes unnecessary cost and still won't be as quick as reusing the shared layers across repos.


jespersoderlund avatar Oct 11 '19 08:10 jespersoderlund

This is a real pain point for us. Our current workaround is to separate images using tags instead of repositories, but it's ugly! Docker Hub and Artifactory solve this without a problem and skip the unnecessary upload if the layer already exists.
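As a rough illustration of the tag-based workaround described above, the sketch below folds the service name into the tag so that all services live in one repository and can reuse its layers. The function name, the `services` repository, and the `<service>-<version>` tag scheme are hypothetical choices for illustration, not anything ECR prescribes.

```python
def shared_repo_reference(registry: str, shared_repo: str, service: str, version: str) -> str:
    """Build an image reference that encodes the service name in the tag.

    Hypothetical naming scheme: every service is pushed to one shared
    repository, and the tag carries both the service name and its version.
    """
    # Image tags may contain letters, digits, '_', '.' and '-',
    # so '-' is a safe separator between service and version.
    tag = f"{service}-{version}"
    return f"{registry}/{shared_repo}:{tag}"


# Example with a made-up account ID and region.
ref = shared_repo_reference(
    "123456789012.dkr.ecr.eu-west-1.amazonaws.com", "services", "billing", "1.4.2"
)
print(ref)  # 123456789012.dkr.ecr.eu-west-1.amazonaws.com/services:billing-1.4.2
```

The trade-off is the one noted above: per-repository settings such as permissions and lifecycle rules can no longer be set per service.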

jswetzen avatar Feb 20 '20 12:02 jswetzen

This would save a lot of bandwidth too, by not having to push/pull the same base layers every time.

richstokes avatar Sep 21 '20 20:09 richstokes

Any news on this? Would love to be able to use this.

slimshreydy avatar Jul 26 '21 20:07 slimshreydy

In my opinion, ECR is not usable without this feature.

tjb801 avatar Aug 11 '21 17:08 tjb801

Come on guys! GCP has this!

lkishalmi avatar Aug 10 '22 16:08 lkishalmi

Any updates on this?

hipstern avatar Aug 31 '22 22:08 hipstern

Seriously, how can ECR still be missing such an essential feature in 2022?

sergiy-kozak avatar Oct 15 '22 22:10 sergiy-kozak

> Seriously, how can ECR still be missing such an essential feature in 2022?

I'm trying not to be cynical about it, as AWS does benefit financially from this inefficiency, but when will this get accepted to the upcoming delivery schedule? We just passed this issue's 3rd anniversary and we're rapidly approaching ECR's 7th anniversary, and yet this basic level of layer caching is still missing.

Is some policy-based security requirement, such as complex inter- and intra-account cross-repository access, delaying this? Cross-region replication woes? Any of those types of problems would be understandable. Official information or direction about this would be very helpful at this point. Thanks!

jessefarinacci avatar Oct 16 '22 13:10 jessefarinacci

Hello from 2023 👋

Layers and caches are essential to containers to save storage, bandwidth, cost, and time. This has been a critical component of container architecture since the beginning.

four43 avatar Jan 17 '23 22:01 four43

So far I haven't been able to determine whether they're using actual disk space usage or just multiplying the number of images by their individual sizes (without taking layer re-use within individual repositories into account).

This is so long overdue. Don't charge us multiple times to store the same layers within one registry, especially for the base images.

matthenry87 avatar Apr 25 '23 21:04 matthenry87

Are we ignoring this, @aws?

hfawaz avatar Oct 26 '23 17:10 hfawaz

Just opened a Support ticket on this, and was directed here to +1. So, +1.

BroMattMiller avatar Dec 01 '23 13:12 BroMattMiller

> So far I haven't been able to determine that they're using actual disk space usage versus just multiplying the # of images by their individual sizes

You could inspect the response when you attempt to download the blobs (or invoke the GetDownloadUrlForLayer API directly). Either way you end up with a presigned S3 URL, and you could see whether it resolves to the same S3 object.
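A minimal sketch of that check, assuming boto3 and working AWS credentials: fetch the download URL for the same layer digest from two repositories and compare the URLs with the signature query string stripped. The helper names are made up for illustration. Treat the result as a hint only, since matching or differing object paths do not prove how storage is billed.

```python
from typing import Tuple
from urllib.parse import urlsplit


def s3_object_identity(presigned_url: str) -> Tuple[str, str]:
    """Return (host, path) of a presigned URL, ignoring the signed query string."""
    parts = urlsplit(presigned_url)
    return parts.netloc, parts.path


def same_backing_object(url_a: str, url_b: str) -> bool:
    """True when both presigned URLs point at the same host and object path."""
    return s3_object_identity(url_a) == s3_object_identity(url_b)


def layer_download_url(repository: str, layer_digest: str) -> str:
    """Ask ECR for the presigned download URL of one layer in one repository."""
    import boto3  # imported lazily so the helpers above work without AWS access

    ecr = boto3.client("ecr")
    response = ecr.get_download_url_for_layer(
        repositoryName=repository, layerDigest=layer_digest
    )
    return response["downloadUrl"]


# Usage (needs credentials; repository names and digest are placeholders):
#   digest = "sha256:<digest of the shared base layer>"
#   url_a = layer_download_url("service-a", digest)
#   url_b = layer_download_url("service-b", digest)
#   print(same_backing_object(url_a, url_b))
```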

blowfishpro avatar Dec 01 '23 18:12 blowfishpro

Because repos can be individually encrypted with different keys, I am 99% sure they really don't share layers. I believe this repo-level granularity is the major blocker to cross-repo layer sharing, btw.

systematicguy avatar Dec 01 '23 18:12 systematicguy

KMS keys are configured per-repository.

I believe many of us are hoping for sharing of layers between images within the same repository - a core feature of docker and seemingly a cheap money grab if not actually implemented that way.

four43 avatar Dec 13 '23 14:12 four43

Beware of the terminology. I was coming from the Artifactory world and was confused for a long time until I realized that repo and registry are not the same thing in ECR.

KMS keys can be set up per ECR repo. All the private ECR repos make up your single, per-account private ECR registry.

This is not the same as in, e.g., Artifactory, where you can have multiple registries and layer sharing is inherent.

In ECR you have to proactively configure each repo (tag immutability, KMS key, etc.), whereas in Artifactory you just push whatever image name you like to a registry.
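To make the registry/repo distinction concrete, here is a small sketch that splits a private ECR image reference of the form `<account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>` into its parts. The account and region identify the registry, and everything after the first slash is the repository. The parser is illustrative only and does not handle digest (`@sha256:...`) references.

```python
import re
from typing import NamedTuple, Optional


class EcrImageRef(NamedTuple):
    account_id: str   # account + region identify the private registry
    region: str
    repository: str   # configured individually (KMS key, tag immutability, ...)
    tag: Optional[str]


_ECR_REF = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repo>[a-z0-9._/-]+?)(?::(?P<tag>[\w.-]+))?$"
)


def parse_ecr_ref(ref: str) -> EcrImageRef:
    """Split a private ECR image reference into registry and repository parts."""
    match = _ECR_REF.match(ref)
    if match is None:
        raise ValueError(f"not a private ECR image reference: {ref!r}")
    return EcrImageRef(match["account"], match["region"], match["repo"], match["tag"])


# Two images in the same registry (made-up account ID) but different repositories.
a = parse_ecr_ref("123456789012.dkr.ecr.us-east-1.amazonaws.com/team/api:1.0")
b = parse_ecr_ref("123456789012.dkr.ecr.us-east-1.amazonaws.com/team/worker:1.0")
print((a.account_id, a.region) == (b.account_id, b.region))  # True: same registry
print(a.repository == b.repository)                          # False: separate repos
```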

systematicguy avatar Dec 13 '23 15:12 systematicguy

My point is: say you have 50 ECR repos where you push Ubuntu-based images. You will need to store the base image 50 times! Yes, within one repo the layer will be reused, but not across the others, because each repo can have its own configuration, IAM access, etc.
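A back-of-the-envelope sketch of that arithmetic. The 80 MiB base layer and 20 MiB of service-specific layers are made-up numbers, and the function only models the storage described above, not how ECR actually meters it.

```python
def stored_bytes(base_layer_bytes: int, app_layer_bytes: int,
                 repo_count: int, share_across_repos: bool) -> int:
    """Total bytes stored when every repo holds one image built on the same base."""
    # Without cross-repo sharing, each repo keeps its own copy of the base layer.
    base_copies = 1 if share_across_repos else repo_count
    return base_copies * base_layer_bytes + repo_count * app_layer_bytes


MIB = 1024 ** 2
base, app, repos = 80 * MIB, 20 * MIB, 50  # hypothetical sizes

per_repo = stored_bytes(base, app, repos, share_across_repos=False)
shared = stored_bytes(base, app, repos, share_across_repos=True)

print(per_repo // MIB)  # 5000
print(shared // MIB)    # 1080
```

With these assumed sizes, the base image accounts for 4000 of the 5000 MiB stored, and the same ratio applies to what has to be pushed or replicated per region.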

I'm not saying this is necessary or good, just sharing my understanding. I would also be happy to drop this border between repos in favor of multiple registries with layer reuse within each one.

systematicguy avatar Dec 13 '23 15:12 systematicguy

I understand what you're saying. I agree that your use case is even broader and may not be possible due to the limitations you explained. The core of this issue is layer re-use within a repository, which is a step before cross-repository layer re-use, IMO.

Terms used are AWS terms, since this is an AWS issue.

four43 avatar Dec 13 '23 15:12 four43

> Hello from 2023 👋
>
> Layers and caches are essential to containers to save storage, bandwidth, cost, and time. This has been a critical component of container architecture since the beginning.

Hello from 2024 👋 Time keeps passing, and AWS continues to make money from its own inefficiency.

dene14 avatar Apr 28 '24 13:04 dene14

How can I vote for this?

golosegor avatar Jul 03 '24 13:07 golosegor