WORKDIR learned to cache its potential output layer

Open mzihlmann opened this issue 4 months ago • 0 comments

Fixes #3340

Description

When WORKDIR is called on a non-existent directory, kaniko is kind enough to create that directory for you, which adds a layer to the image. However, kaniko does not cache that layer, so every invocation emits a completely new image from that point onwards. Within a single stage this is not obvious: the caching mechanism still pulls cached layers for the commands that follow, so you see a 100% cache hit rate thereafter even though the image is completely new. In multistage builds, or in builds that depend on the newly emitted image, this is catastrophic: they consider the entire image's SHA when determining whether the cache is hit, so the entire cache is invalidated.
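To make the cascade concrete, here is a minimal Go sketch. It is not kaniko's code: `digest`, `imageDigest`, and `downstreamCacheKey` are hypothetical helpers, and it assumes that an uncached WORKDIR layer differs in content between runs (modelled here by a changing string).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// digest returns the hex SHA-256 over the given parts.
// Hypothetical stand-in for a layer, image, or cache-key digest.
func digest(parts ...string) string {
	h := sha256.New()
	for _, p := range parts {
		h.Write([]byte(p))
		h.Write([]byte{0}) // separator so ("ab","c") differs from ("a","bc")
	}
	return hex.EncodeToString(h.Sum(nil))
}

// imageDigest models an image as the digest over its ordered layer digests.
func imageDigest(layers []string) string {
	return digest(layers...)
}

// downstreamCacheKey models a later stage or dependent build that keys its
// cache on the base image digest plus the command being run (an assumption
// for illustration, not kaniko's exact key format).
func downstreamCacheKey(baseImage, command string) string {
	return digest(baseImage, command)
}
```

In this model, two builds that share every cached RUN layer but recreate the WORKDIR layer produce different image digests, so any key derived from the image digest changes as well.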

Until now, the workaround has been to ensure that the directory exists before calling WORKDIR, so that it is never created implicitly. This works because RUN statements can be cached:

RUN mkdir /app
WORKDIR /app

With this change, the layer potentially created by WORKDIR is cached too, in a similar vein to how RUN statements are cached.
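The general idea of caching a command's output layer can be sketched as follows. This is only an illustration under stated assumptions: `layerCache` and `workdirLayer` are hypothetical names, the cache is an in-memory map rather than a registry, and none of this mirrors kaniko's actual types.

```go
package main

// layerCache is a hypothetical in-memory stand-in for a remote layer cache,
// mapping a cache key to the bytes of a previously produced layer.
type layerCache map[string][]byte

// workdirLayer returns the layer for a WORKDIR command.
// On a cache hit the stored layer is reused unchanged, so the resulting
// image stays identical across builds. On a miss, create is called once
// and its result is stored under key for later builds.
func workdirLayer(cache layerCache, key string, create func() []byte) (layer []byte, hit bool) {
	if l, ok := cache[key]; ok {
		return l, true
	}
	l := create()
	cache[key] = l
	return l, false
}
```

The point of the design is that the second build never recreates the directory layer, so nothing downstream sees a changed image.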

There is some optimization potential left on the table here: we sometimes know a priori whether a layer needs to be created at all, and we always know which directory is affected. For now I copied the code from RUN to make it work, but this is suboptimal, because that code assumes no a priori knowledge. I'm open to suggestions.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you review them:

  • [ ] Includes unit tests
  • [ ] Adds integration tests if needed.

See the contribution guide for more details.

Reviewer Notes

  • [ ] The code flow looks good.
  • [ ] Unit tests and/or integration tests added.

Release Notes

  • kaniko learned to cache layers created by WORKDIR

mzihlmann · Oct 13 '24 04:10