action-docker-layer-caching

Avoid caching images that are retrieved with `pull`

Open axwalker opened this issue 5 years ago • 10 comments

Is your feature request related to a problem? Please describe.
The Post Run satackey/[email protected] step is quite slow for our builds because it uploads a bunch of images that we already `pull`.

Describe the solution you'd like
Since we pull them anyway, it would be good not to cache them at all. Is there a way to cache only those images that are not retrieved through `pull`?

Describe alternatives you've considered
The alternative at the moment is not to use the caching at all, because the cache upload takes longer than the original build normally takes.

axwalker avatar Aug 03 '20 17:08 axwalker

I'm not sure that it's possible to know whether an image was pulled or built locally - could this perhaps be implemented as an option where you can restrict which images are cached based on their name?

charleskorn avatar Aug 06 '20 01:08 charleskorn

Having something like an ignorePattern where you can give a regex for images to ignore would potentially solve our issue.
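To make the request concrete, here is a rough sketch of what such an option might look like in a workflow. `ignorePattern` is only the name floated in this comment, not an input the action is known to provide; the `<version>` placeholder, the `actions/checkout@v4` step, and the image names in the regex are assumptions for illustration.

```yaml
steps:
  - uses: actions/checkout@v4

  # <version> is a placeholder; keep whichever release you already pin.
  - uses: satackey/action-docker-layer-caching@<version>
    with:
      # HYPOTHETICAL input proposed in this thread, not an existing option.
      # Images whose name matches the regex would be skipped when saving.
      ignorePattern: '^(postgres|redis):'

  # Hypothetical build step; the tag is an example name.
  - run: docker build -t example/app:ci .
```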

axwalker avatar Aug 06 '20 08:08 axwalker

In its main run, the action saves the list of images that already exist at that point. Those images are excluded from the cache written in the post run. (Because of this, container images that are already cached on the hosted runner are not cached by the action.)

    steps:
    # not cached steps

    - uses: satackey/[email protected]

    # cached steps

I think pulling before `uses: satackey/[email protected]` would solve this problem, but are there any special situations?
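The exclusion described above can be approximated by hand to see why step order matters. This is only a sketch of the idea, not the action's actual implementation; it assumes a Linux runner with `sort` and `comm` available, and the image names and `/tmp` paths are made up for illustration.

```yaml
steps:
  # Snapshot taken at the point where the caching action's main run would be.
  - name: List images present before the build
    run: |
      docker image ls --format '{{.Repository}}:{{.Tag}}' | sort -u > /tmp/images-before.txt

  - name: Pull and build
    run: |
      docker pull example/test-deps:latest   # hypothetical image name
      docker build -t example/app:ci .       # hypothetical tag

  # Roughly what a post run would consider for caching: images that were
  # not in the first snapshot.
  - name: List images added during the job
    run: |
      docker image ls --format '{{.Repository}}:{{.Tag}}' | sort -u > /tmp/images-after.txt
      comm -13 /tmp/images-before.txt /tmp/images-after.txt
```

With this ordering the pulled image shows up in the final list, because it arrived after the snapshot. Moving the `docker pull` above the first step would put it in the snapshot and drop it from the difference.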

satackey avatar Aug 08 '20 00:08 satackey

It's also currently re-pushing all the layers that it fetched from the cache whenever any layer changes. I think this is the main reason it ends up being so slow even for builds that are 90% cache hits.

ForbesLindesay avatar Aug 12 '20 23:08 ForbesLindesay

> I think pulling before `uses: satackey/[email protected]` would solve this problem, but are there any special situations?

One downside is that it forces you to pull before building. That might not be ideal if you're pulling a large image that's only used to test the image you're building, and the build then fails. You'd wait for the slow pull to complete before finding out that the build failed, so the pulled image is no longer needed.

> It's also currently re-pushing all the layers that it fetched from the cache whenever any layer changes. I think this is the main reason it ends up being so slow even for builds that are 90% cache hits.

#98 is a suggestion for how to avoid this problem. It will avoid reuploading the layer content for cache hits, as well as allowing safe sharing of cached layers between workflows.

rcowsill avatar Dec 05 '20 00:12 rcowsill

Not everyone builds images. For example, we are pulling images from DockerHub because we use them. We are not building them at all.

So not caching pulled images makes this cache completely useless for anyone with that use case.

andy-maier avatar Feb 04 '21 06:02 andy-maier

What about the images that are already present on the runners? For instance, I'm working with a Windows runner. Is it possible to avoid caching pre-installed images?

image

Also, one of my Docker image builds takes about 5 min.

image

As you can see, the caching action's post run takes 15 min. After re-running the same workflow, it takes almost 13 min to download the cache, while the build itself is blazing fast at 2 sec. However, that is 13 min plus the post run action, which is still running at the moment and is at 7+ min. That's far more than the original 5 min without caching.

image

I'm testing it now with another workflow that takes much more time to see if I gain time.

Am I doing anything wrong? I used the action as the second step of the job, after the checkout.

EDIT: just noticed this is related to this

MostefaKamalLala avatar Jun 04 '21 04:06 MostefaKamalLala

@MostefaKamalLala The pre-installed images are automatically skipped when writing the cache; your first screenshot shows the action detecting those images before your docker build step.

I think something else is causing the cache "Post run" to be so slow, but don't know what it could be without seeing the debug logs for that part of the run. Can you share a link to your run logs?

rcowsill avatar Jun 04 '21 11:06 rcowsill

> @MostefaKamalLala The pre-installed images are automatically skipped when writing the cache; your first screenshot shows the action detecting those images before your docker build step.
>
> I think something else is causing the cache "Post run" to be so slow, but don't know what it could be without seeing the debug logs for that part of the run. Can you share a link to your run logs?

Yes of course, here is the log. logs_6468.zip

MostefaKamalLala avatar Jun 04 '21 13:06 MostefaKamalLala

Ok, your Dockerfile starts with `FROM mcr.microsoft.com/dotnet/framework/wcf:4.7.2-windowsservercore-ltsc2019`, and that image isn't pre-installed. That means it gets pulled on the first build, and also gets cached as a result. It looks like it's a pretty big image too; I pulled the closest version compatible with my machine and it's about 14 GB unpacked.

This can be avoided by pulling that image before the Run satackey/[email protected] step.
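A minimal sketch of that ordering, assuming a job laid out like the one described earlier (checkout first, then the caching action). The `<version>` placeholder, the `actions/checkout@v4` step, and the build tag are assumptions; only the base image name comes from the Dockerfile above.

```yaml
steps:
  - uses: actions/checkout@v4

  # Pulled before the caching action runs, so the image already exists when
  # the action records its list and should be left out of the saved cache.
  - name: Pull base image
    run: docker pull mcr.microsoft.com/dotnet/framework/wcf:4.7.2-windowsservercore-ltsc2019

  # <version> is a placeholder; keep whichever release you already use.
  - uses: satackey/action-docker-layer-caching@<version>

  - name: Build
    run: docker build -t example/app:ci .   # hypothetical tag
```

The trade-off is that the base image is downloaded on every run, so this only pays off when the pull is faster than restoring and re-saving the same layers through the cache.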

rcowsill avatar Jun 04 '21 15:06 rcowsill