Caching

Open mechtaev opened this issue 3 years ago • 2 comments

Caching is an important part of Docker and any build system. Docker currently does not provide sufficient tools to control caching. Here are some things we need to think about:

Caching layers

When we execute something like run(f"git clone ${URL}"), the layer is reused as long as ${URL} does not change, even if the git repository has been updated. In principle, we could add something like run(f"git clone ${URL}")::no_cache, but then the cache would be invalidated on every build, even when the repository has not changed. BuildKit provides ad-hoc support for git, but this does not extend to other VCSs or to other scenarios. So it would be nice to have more fine-grained control over the caching of layers.
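One way to picture finer-grained control is to let a layer's cache key depend on a cheap "probe" of the external state, not only on the command text. The sketch below is an illustration only, not something Modus or BuildKit implements: the names probe_git_head and layer_cache_key are hypothetical, and the only external tool assumed is the standard git ls-remote command.

```python
import hashlib
import subprocess


def probe_git_head(url: str) -> str:
    """Return the commit hash that HEAD points to in the remote repository.

    Hypothetical helper: it shells out to `git ls-remote <url> HEAD`,
    which prints lines of the form "<sha>\tHEAD".
    """
    out = subprocess.run(
        ["git", "ls-remote", url, "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    fields = out.split()
    if not fields:
        raise ValueError(f"no HEAD reference reported for {url}")
    return fields[0]


def layer_cache_key(parent_key: str, command: str, probe_value: str = "") -> str:
    """Derive a cache key from the parent layer, the command, and a probe value.

    With an empty probe value this mimics the usual behaviour (the key
    depends only on the parent and the command text). With a probe value
    such as a remote commit hash, the key changes exactly when the probed
    state changes.
    """
    h = hashlib.sha256()
    for part in (parent_key, command, probe_value):
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so that ("ab", "c") != ("a", "bc")
    return h.hexdigest()
```

A build step would then call layer_cache_key(parent, "git clone " + url, probe_git_head(url)): the layer is reused while the remote HEAD stays the same and rebuilt once it moves. For other VCSs or other scenarios, only the probe would need to change.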

Caching images

In Docker, the registry of images and the build cache are separate things. Thus, it is unclear what our "minimal build tree" means w.r.t. caching, i.e. do we compute the minimal tree w.r.t. the images in the database, or w.r.t. the cached data?

Ideally, we should pick a principled and intuitive approach that is also not too far from what Docker and BuildKit do, so that it is realistic to implement and maintain.

@barr, @maowtm, @thevirtuoso1973, any thoughts?

mechtaev, Nov 20 '21 19:11

I think we could define some common interface between the local build cache and container registry. The registry would follow this spec. There is also probably some OCI spec on how to query the local build cache.

Although, even before that, I'm a bit unclear on how the solver should behave given this "abstract cache of available images". For example, perhaps after it has built the SLD tree, it will not use the (sub)tree that contains from("myregistry.domain.com/app:1.1-dev") if that tag is not available.
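To make the idea concrete, here is a minimal sketch of such a common interface and of pruning proofs whose base image is unavailable. Everything in it is an assumption for illustration: ImageStore, InMemoryStore, Proof and usable_proofs are hypothetical names, and a real implementation would back has_image with the local build cache or a registry query.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Protocol, Set


class ImageStore(Protocol):
    """Hypothetical common interface for the local cache and a registry."""

    def has_image(self, ref: str) -> bool:
        ...


@dataclass
class InMemoryStore:
    """Stand-in store used here in place of a real cache or registry."""

    refs: Set[str] = field(default_factory=set)

    def has_image(self, ref: str) -> bool:
        return ref in self.refs


@dataclass
class Proof:
    """A candidate (sub)tree, reduced to the images its from(...) leaves need."""

    name: str
    base_images: List[str]


def usable_proofs(proofs: Iterable[Proof], stores: Iterable[ImageStore]) -> List[Proof]:
    """Keep only proofs whose every base image is available in some store."""
    store_list = list(stores)
    return [
        p
        for p in proofs
        if all(any(s.has_image(ref) for s in store_list) for ref in p.base_images)
    ]


local_cache = InMemoryStore({"alpine:3.14"})
registry = InMemoryStore({"myregistry.domain.com/app:1.0"})
candidates = [
    Proof("dev", ["myregistry.domain.com/app:1.1-dev"]),
    Proof("stable", ["myregistry.domain.com/app:1.0", "alpine:3.14"]),
]
remaining = usable_proofs(candidates, [local_cache, registry])
```

Here the "dev" proof is dropped because the 1.1-dev tag is in neither store, while "stable" survives because each of its images is found in at least one of them.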

thevirtuoso1973, Nov 27 '21 13:11

> it will not use the (sub)tree that contains from("myregistry.domain.com/app:1.1-dev") if that tag is not available

So, the question is whether any information from the cache or registry should be used as part of the cost function for finding proofs. Maybe we can provide several cost functions that the user can choose from when executing a build.
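A rough sketch of what selectable cost functions might look like, assuming each candidate proof is reduced to a list of layer identifiers. The names Candidate, COST_FUNCTIONS and pick_proof are hypothetical and only illustrate the choice between a cache-agnostic and a cache-aware cost.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """A candidate proof, reduced to the layers it would produce."""

    name: str
    layers: Tuple[str, ...]


CostFn = Callable[[Candidate, FrozenSet[str]], int]


def cost_total_layers(c: Candidate, cached: FrozenSet[str]) -> int:
    """Cache-agnostic cost: the size of the build tree."""
    return len(c.layers)


def cost_uncached_layers(c: Candidate, cached: FrozenSet[str]) -> int:
    """Cache-aware cost: only layers that would actually have to be built."""
    return sum(1 for layer in c.layers if layer not in cached)


# Hypothetical registry of cost functions a user could select by name.
COST_FUNCTIONS: Dict[str, CostFn] = {
    "total": cost_total_layers,
    "uncached": cost_uncached_layers,
}


def pick_proof(
    candidates: Sequence[Candidate], cached: FrozenSet[str], cost: str = "total"
) -> Candidate:
    """Return the candidate with the lowest cost under the chosen function."""
    fn = COST_FUNCTIONS[cost]
    return min(candidates, key=lambda c: fn(c, cached))


cached_layers = frozenset({"base", "deps", "build"})
options = [
    Candidate("short", ("fresh-base", "fresh-build")),
    Candidate("long", ("base", "deps", "build")),
]
```

With these inputs, pick_proof(options, cached_layers, "total") selects "short" (two layers against three), while pick_proof(options, cached_layers, "uncached") selects "long" because all of its layers are already cached. The two answers differ, which is exactly the ambiguity in "minimal build tree" raised above.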

mechtaev, Nov 28 '21 22:11