micromamba-docker

Use Docker cache to prevent superfluous image builds

Open maresb opened this issue 2 years ago • 6 comments

The cache-from for the docker build step refers to mambaorg/micromamba:latest. Thus only the default base image, debian:bullseye-slim, has a usable cache. As a consequence, the non-default images are rebuilt from scratch on every commit to main, even when the commit is irrelevant to the image.

I think that cache-from should be set to something like ${{ steps.set_image_variables.outputs.tag }}, but I'm not sure.

maresb Mar 16 '23 19:03

Good catch. Fixed in #283, where I also pull the cache from ghcr in hopes that it is faster than Docker Hub.

wholtz Mar 20 '23 05:03

Thanks for fixing the cache-from tag!

Unfortunately the cache still isn't working. Since I wrote this issue as an XY problem, I'll take a step back...

Expected behavior: commits to main which don't modify image-relevant files (e.g. modifications to docs or GH workflows) should result in the cache being used, so the image should have the same SHA as the previous image. The Git-based tags should then be added to the existing images in the image registry.

Observed behavior: commits to main which don't modify image-relevant files result in new images.

I suspect that this is due to the multi-stage build. Since we push only the main stage, the build process has no cache for stage1.

As for a solution, perhaps we need to create a new and separate image for stage1 and push that for cache. Perhaps this could be broken out into a separate job inside the push_latest.yml workflow, and the matrix would be just the list of platforms. Finally, since cache-from is a list, we should be able to add both stage1 and ...outputs.tag as source images. And then if we're lucky, everything will just work. :laughing:

As for more context, the particular reason I'm noticing this problem is that in micromamba-devcontainer I have a workflow where I run this script as a cronjob. The script updates the base image with the new [tag]@[sha], but only if the sha has changed. Consequently, I receive a new PR each time the micromamba-docker image is rebuilt, which has been happening more often than I expected.
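To illustrate the "only if the sha has changed" check, here is a minimal sketch. It is not the actual script from micromamba-devcontainer; the function names `split_ref` and `updated_ref` and the example references are hypothetical, and the lookup of the latest digest from the registry is assumed to happen elsewhere.

```python
from typing import Optional, Tuple


def split_ref(ref: str) -> Tuple[str, Optional[str]]:
    """Split 'name:tag@sha256:...' into ('name:tag', 'sha256:...').

    The digest is None when the reference is not pinned.
    """
    name, sep, digest = ref.partition("@")
    return name, (digest if sep else None)


def updated_ref(current_ref: str, latest_digest: str) -> Optional[str]:
    """Return a newly pinned reference, or None when nothing changed.

    `latest_digest` is assumed to have been fetched from the registry
    by some other step (hypothetical, not shown here).
    """
    name, current_digest = split_ref(current_ref)
    if current_digest == latest_digest:
        return None  # same digest, so no update and no new PR
    return f"{name}@{latest_digest}"
```

With this logic, a rebuild that produces a new digest for unchanged image content still yields a new reference, which is why every superfluous rebuild turns into a PR.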

maresb Mar 20 '23 14:03

I have looked into this a little, but still need to spend more time on it. The one thought I had is that the debian and ubuntu base images get updated about once a month (sometimes more frequently if there are critical security updates). Those updates should break the cache. But I just looked through a bunch of the image digests and it does appear that every PR is resulting in a new digest.

wholtz Mar 29 '23 15:03

Ya, I think ideally we would explicitly specify the sha256 of the debian/ubuntu images in the Dockerfile. That would have the added benefit of making cache-breaking explicit. We might be able to repurpose some portion of my aforementioned Python script. (Feel free to copy from it if you'd like.)
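As a rough sketch of what pinning could look like, the function below rewrites plain `FROM image:tag` lines to `FROM image:tag@sha256:...`. It assumes the Dockerfile names the base image literally on the `FROM` line, which may not match how this repo's Dockerfile is written. The name `pin_base_images` and the digest values are hypothetical, and lines with flags such as `--platform` are left untouched.

```python
import re

# FROM <image>[@sha256:<hex>][ AS <stage> ...]
FROM_RE = re.compile(
    r"^(FROM\s+)(\S+?)(@sha256:[0-9a-f]+)?(\s.*)?$", re.IGNORECASE
)


def pin_base_images(dockerfile_text: str, digests: dict) -> str:
    """Pin every FROM line whose image appears in `digests`.

    `digests` maps 'name:tag' to 'sha256:<hex>'. Any existing digest on
    a matching line is replaced; all other lines are returned unchanged.
    """
    out = []
    for line in dockerfile_text.splitlines():
        m = FROM_RE.match(line)
        if m and m.group(2) in digests:
            prefix, image, _old_digest, rest = m.groups()
            line = f"{prefix}{image}@{digests[image]}{rest or ''}"
        out.append(line)
    result = "\n".join(out)
    # splitlines() drops a trailing newline, so restore it if present
    if dockerfile_text.endswith("\n"):
        result += "\n"
    return result
```

Because the digest then lives in the Dockerfile, a base image update shows up as an ordinary commit, which is the explicit cache-breaking described above.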

maresb Mar 29 '23 18:03

It appears that caching of the first stage is now working, but for some reason the second stage thinks the cache is completely invalid. Not sure why.

wholtz Apr 20 '23 22:04

https://github.com/moby/buildkit/issues/2822

wholtz Feb 26 '24 05:02