Proposal: Allow setting cache expiration time for build steps
There are certain build steps that do not generate a very stable cache key. E.g. container runs are mainly cached by the command arguments. For commands that access the network, their meaning may completely change over time without the arguments changing. A simple example is installing a package and running something like `apt-get update`. The cache generated for this step never expires unless it is invalidated with a build-arg or `--no-cache[-filter]` is used.
Ideally, we would want to avoid these commands and define package installs more precisely so BuildKit can track the dependencies and their expiration correctly. This can be done with custom frontends but is hard in something like Dockerfile.
A possible solution would be to allow a build step to define the expiration time for the cache that it generates.
The simplest way to implement it would be to save the creation and expiration time when the cache record is created (the creation time is already saved) and then delete the records when they expire. There are multiple reasons why this doesn't work with the current cache model, though.
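For concreteness, the naive version would be roughly the following sketch (the types and the TTL field are illustrative only, not BuildKit's actual storage); the reasons it falls short follow below.

```go
package cache

import "time"

// naiveRecord is an illustrative stand-in for a cache record; BuildKit's real
// records only track the creation time today.
type naiveRecord struct {
	ID        string
	CreatedAt time.Time
	TTL       time.Duration // hypothetical per-record expiration
}

// expired reports whether the record should be deleted at time now.
func (r naiveRecord) expired(now time.Time) bool {
	return r.TTL > 0 && now.After(r.CreatedAt.Add(r.TTL))
}

// pruneExpired implements the "delete the records when they expire" idea.
func pruneExpired(records []naiveRecord, now time.Time) []naiveRecord {
	kept := records[:0]
	for _, r := range records {
		if !r.expired(now) {
			kept = append(kept, r)
		}
	}
	return kept
}
```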
First of all, BuildKit cache chains do not require cache records to be available for intermediate steps (this is how BuildKit cache can step over stages in multi-stage builds, for example). This means that it is unlikely that the cache for `apt-get update` will ever be pulled, and the question is more about pulling the cache for any step that depended on the `apt-get update` result later on in the build. But the cache records for these steps could be created much later.
Secondly, if `apt-get update` does run again (either by a no-cache match or because it had expired), its result is saved under the same cache key as the previous run, meaning that the cache for the next steps would automatically be usable on top of it.
Thirdly, removing content doesn't work with remote cache backends, where we can't do any data modifications after the build.
An alternative approach would be to make changes to how the cache chains are stored in https://github.com/moby/buildkit/blob/master/solver/cachestorage.go#L16 .
Currently, when a cache record is saved, it calls the `AddResult()` function to register a snapshot with a specific cache location. When a process from the same cache location runs multiple times, these snapshots are added under the same record. On snapshot load, the client can query all records and usually picks the newest one.
For the expiration to work, snapshots from different execution times can't be saved under the same cache location but need to create a separate branch point in the cache graph. First of all, we would need to distinguish that the cache key is for an LLB Exec operation, as expiration only makes sense for these. Then, for such keys, instead of calling `AddResult()` with the previous ID value, a completely new ID needs to be generated, and a creation time needs to be associated with this key. The creation time for this key can never change in the future, so no new records are allowed to be added under the same key. The same creation time is also added to `AddLink` https://github.com/moby/buildkit/blob/master/solver/cachestorage.go#L26 to add extra time constraints to cache key links.
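Roughly, as a sketch of the storage-side change (the type and method names below are illustrative assumptions layered on top of the interface in solver/cachestorage.go, not its actual definition):

```go
package cache

import (
	"time"

	digest "github.com/opencontainers/go-digest"
)

// result is an illustrative stand-in for solver.CacheResult.
type result struct {
	ID        string
	CreatedAt time.Time
}

// link loosely mirrors solver.CacheInfoLink; CreatedAt is the extra field
// this proposal would record so that links carry a time constraint.
type link struct {
	Input, Output int
	Digest        digest.Digest
	Selector      digest.Digest
	CreatedAt     time.Time
}

// execCacheStorage sketches only the proposed additions; the existing
// CacheKeyStorage methods are left out, and the method names are assumptions.
type execCacheStorage interface {
	// For LLB Exec keys, register the result under a freshly generated key ID
	// together with an immutable creation time, instead of appending another
	// snapshot to the existing record.
	AddExecResult(newKeyID string, createdAt time.Time, res result) error

	// AddLink stores the link together with its creation time (carried in the
	// link struct above) so later cache queries can apply time constraints.
	AddLink(keyID string, l link, targetID string) error
}
```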
The `WalkLinks` method https://github.com/moby/buildkit/blob/master/solver/cachestorage.go#L27 gets called when a cache match query is happening. This method needs to be extended with a `maxAge time.Duration` property. Every key associated with a creation time (from `AddLink`) is compared against this `maxAge`, and links are only returned if they are new enough. This makes sure that any future matches for other steps also need to be newer than this record, and even if they are newer, they can't be based on any older cache in their parent chain from that point.
This method should also work for remote cache backends as long as the creation times for cache locations are exported.
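A minimal sketch of that filtering step (the stored-link layout and function below are illustrative, not the actual implementation):

```go
package cache

import "time"

// storedLink pairs a link target with the creation time recorded by AddLink.
type storedLink struct {
	TargetID  string
	CreatedAt time.Time
}

// walkLinks visits only link targets that are new enough. maxAge == 0 means
// "no constraint", so callers without a cache limit keep today's behavior.
func walkLinks(links []storedLink, maxAge time.Duration, fn func(targetID string) error) error {
	now := time.Now()
	for _, l := range links {
		if maxAge > 0 && now.Sub(l.CreatedAt) > maxAge {
			continue // too old for this query; don't offer it as a match
		}
		if err := fn(l.TargetID); err != nil {
			return err
		}
	}
	return nil
}
```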
In a Dockerfile setting, cache expiration could look something like:
```dockerfile
FROM debian
# cache=12h
RUN apt-get update
RUN go build ...
```
It could also make sense to think about some categorization of the cache records. This proposal is already separating the cache for LLB Exec from other cache that guarantees better uniqueness. Another idea could be to distinguish steps with `--network=none` (or if we could detect that a step did not use the network when it ran) from steps that make network requests and are therefore less trustworthy. A similar separation could be imagined for cache from CI vs. local cache (or, in the future, between signed and unsigned cache).
```dockerfile
# cache=12h
RUN apt-get update
RUN go build ...
```
Does the cache expiry in this case only apply to the first `RUN` instruction?
@dvdksn It only applies to the `apt-get` command, but in this example, if `apt-get` runs again, it would mean that `go build` would also run again, just because the dependencies of `go build` have changed (the same is not true for multi-stage build links).
Not commenting on the proposal yet, but I think a syntax of `RUN --cache=12h` is probably more appropriate considering some of the other syntax.
Not commenting on the proposal yet, but I think a syntax of `RUN --cache=12h` is probably more appropriate considering some of the other syntax.
Agree with @jsternberg but would like a more meaningful flag like `--cache-ttl`.
```dockerfile
FROM debian
# cache=12h
RUN apt-get update
RUN go build ...
```
If we want to use comments in Dockerfile, I think we should have a prefix like:
```dockerfile
FROM debian
# org.mobyproject.buildkit.cache-ttl=12h
RUN apt-get update
RUN go build ...
```
I like this idea :smile:
Secondly, if `apt-get update` does run again (either by a no-cache match or because it had expired), its result is saved under the same cache key as the previous run, meaning that the cache for the next steps would automatically be usable on top of it.
I'm not super familiar; why is this the case? Is there a legitimate reason for the behavior? I can't quite work out why an `apt-get update` result would get saved but not update the cache key. If you have any code links, I would be interested in reading more.
Every key associated with a creation time (from `AddLink`) is compared against this `maxAge`, and links are only returned if they are new enough
So the control of expiry is down to the consumer then, as opposed to the creator of the cache record? I think that makes sense, but it does mean that if user A creates a cache for an `apt update`, user B can use it - even if it's older than how long user A intended it to be around for.
I don't think a `--cache-ttl` flag would be right then - users would associate that with DNS-like semantics, essentially how long the cache lives, which isn't quite right. It's about how long the cache can be consumed for.
I'm not super familiar; why is this the case? Is there a legitimate reason for the behavior? I can't quite work out why an `apt-get update` result would get saved but not update the cache key. If you have any code links, I would be interested in reading more.
The digest computed in `CacheMap()` will always be the same for `apt-get update` (https://github.com/moby/buildkit/blob/b9fcd7688a5d7d4461f0c1706216b6d4b3e584e6/solver/llbsolver/ops/exec.go#L153). If you run it 3 times you will get 3 snapshots, but they all have the same cache checksum. The condition for matching cache is to find the chain of checksums from past executions that matches the current execution.
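As a simplified illustration (this is not the actual `CacheMap()` code, which digests the marshalled operation definition), the cache checksum depends only on the step's definition, so repeated runs of the same command hash identically:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// execCacheKey is a simplified illustration: the key depends only on the
// command definition, never on when or how often the command ran.
func execCacheKey(args []string) string {
	h := sha256.New()
	for _, a := range args {
		h.Write([]byte(a))
		h.Write([]byte{0}) // separator so ["ab","c"] != ["a","bc"]
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	// Running "apt-get update" three times produces three snapshots, but the
	// cache checksum is the same every time.
	k1 := execCacheKey([]string{"apt-get", "update"})
	k2 := execCacheKey([]string{"apt-get", "update"})
	fmt.Println(k1 == k2) // true
}
```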
So the control of expiry is down to the consumer then, as opposed to the creator of the cache record?
Yes, correct. Under this proposal, `# cache=12h` on a specific `RUN` statement does not do anything if there isn't a cache match. The creation time is saved for any `RUN`, and the cache expiry condition is only evaluated when deciding whether the previous cache should be loaded or not ("is there any cache record available that is not older than 12h?").
If user A creates a cache for `apt-get update` and had a `# cache=12h` condition in the Dockerfile, and user B now has another Dockerfile with `apt-get update` without any cache restriction, B will still get the cache after 12h has passed.
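As a toy sketch of this consumer-side check (the record layout and helper name are made up for illustration), the same stored records answer differently depending on the maxAge the consumer queries with:

```go
package cache

import "time"

// record is an illustrative cache record; only its creation time matters here.
type record struct {
	ID        string
	CreatedAt time.Time
}

// newestWithin returns the newest record not older than maxAge, or false if
// none qualifies. maxAge == 0 means the consumer set no restriction.
func newestWithin(records []record, maxAge time.Duration, now time.Time) (record, bool) {
	var best record
	found := false
	for _, r := range records {
		if maxAge > 0 && now.Sub(r.CreatedAt) > maxAge {
			continue // too old for this consumer's "# cache=12h"
		}
		if !found || r.CreatedAt.After(best.CreatedAt) {
			best, found = r, true
		}
	}
	return best, found
}

// User A (maxAge = 12h) gets no match from a 20h-old record and re-runs the
// step; user B (maxAge = 0, no restriction) still loads the same old record.
```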
This one is probably very related; it's worth going through that thread to see if there are use-cases there that can help with the design:
- https://github.com/moby/moby/issues/1996
If you run it 3 times you will get 3 snapshots, but they all have the same cache checksum. The condition for matching cache is to find the chain of checksums from past executions that matches the current execution.
Ahhh right, yes, thanks for the clarification! I was getting confused, thinking that maybe `ComputeDigestFunc` would be used here - but that's only used to calculate the slow cache path; it doesn't invalidate the existing fast cache path.
Also interested in @tianon and @yosifkit's input on this proposal (I know @tianon had feedback / ideas on https://github.com/moby/moby/issues/1996, and both have dealt with many Dockerfiles for the official images, so may have requirements / thoughts on UX / feature)
/cc @dgageot as well; perhaps you can add your use-case (`ADD` from remote and slow network connections).
Sorry for the delay in getting to looking at this! :bow:
I'm actually struggling to understand real-world use cases here. :see_no_evil:
In the case of `apt-get update`, it's really almost always only necessary if you're going to `apt-get install`, for which you always need fresh package lists, hence `apt-get update`, so I really don't see a compelling reason to have `RUN apt-get update` all by itself in the first place (even running `apt-get update` explicitly at all is only because APT doesn't have something like `apk --no-cache`), so I'd love to hear more use cases. :sweat_smile:
@tianon The example above meant that any command depending on the new files from `apt update` would need to run again as well, e.g. install. There are no special rules for one `RUN` step skipping cache and then the one after it picking up cache again. Yes, in this case the ideal version would be if the `apt update` result could figure out whether there was anything in that updated run that actually impacted the next `apt install`. This can theoretically be done with some multi-stage construction where, after `apt update`, a file is created that defines uniqueness and is then copied to another stage. Then the cache reuse for commands in that other stage would be defined based on the checksum of that "uniqueness" file. This will probably be quite complex without some code reuse, though, and a custom frontend like https://github.com/tonistiigi/buildkit-alpine would work better.
Ideally, we would want to avoid these commands and define package installs more precisely so BuildKit can track the dependencies and their expiration correctly. This can be done with custom frontends but is hard in something like Dockerfile.
Sorry for the delay in getting to looking at this! 🙇
I'm actually struggling to understand real-world use cases here. 🙈
In the case of `apt-get update`, it's really almost always only necessary if you're going to `apt-get install`, for which you always need fresh package lists, hence `apt-get update`, so I really don't see a compelling reason to have `RUN apt-get update` all by itself in the first place (even running `apt-get update` explicitly at all is only because APT doesn't have something like `apk --no-cache`), so I'd love to hear more use cases. 😅
@tianon, @tonistiigi referred me here after a conversation in Slack, so I just wanted to weigh in: it's very important for a few steps in our pipeline to get rerun every day -- apt is one, because we want e.g. security updates, and there are also internal packages that we consume as unversioned dependencies. At the same time, those installs are expensive and we build these images on every commit, so I don't want to throw out caching for them entirely. The ideal configuration would be a cache that we could configure to expire e.g. every 24h.
UPDATE: I ultimately got the behavior I wanted by
- adding a built-at annotation to all my images
- pulling the build time of the most recent deps build by checking the annotation with the registry API
- injecting a `RUN echo "{BUILD_TIME_OF_DEPS}" > /deps-built-at` into whatever Dockerfile my script is building
It works, but it is clunky and it took time -- a cache expiry feature would have totally eliminated the need for this.