Question: can we extract KOCACHE instead of saving it to disk, so that we can use it between stateless builds?

Open developer-guy opened this issue 1 year ago • 5 comments

AFAIK, KOCACHE only accepts a directory as its value, so the cache can't be reused between stateless builds. For example, each workflow run executes in a fresh VM in GitHub Actions.

It'd be nice if we could use OCI registries to save the cache for ko builds.

developer-guy avatar Sep 01 '22 15:09 developer-guy

cc @jonjohnsonjr

imjasonh avatar Sep 06 '22 13:09 imjasonh

kindly ping @jonjohnsonjr

developer-guy avatar Sep 08 '22 13:09 developer-guy

I think we can store this information in the annotations of the image manifest (which might be verbose) or in the labels of the image config, as BuildKit does, and read it from there instead of from disk.

AFAIK, ko keeps a mapping from buildIDs to diffIDs and from diffIDs to descriptors, so we could use the diffID as the key and the descriptor JSON as the value in the annotations or labels.
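
A minimal sketch of that key/value idea, using only the Go standard library. The `Descriptor` type is a simplified stand-in for an OCI descriptor, and the `dev.ko.cache/` annotation prefix and both function names are hypothetical, not something ko defines today:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Descriptor is a simplified stand-in for an OCI content descriptor.
type Descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

// annotationPrefix is a hypothetical namespace for cache entries.
const annotationPrefix = "dev.ko.cache/"

// PutDescriptor stores desc as JSON under a key derived from diffID.
func PutDescriptor(annotations map[string]string, diffID string, desc Descriptor) error {
	raw, err := json.Marshal(desc)
	if err != nil {
		return fmt.Errorf("marshal descriptor for %s: %w", diffID, err)
	}
	annotations[annotationPrefix+diffID] = string(raw)
	return nil
}

// GetDescriptor looks up diffID; the bool result is false on a cache miss.
func GetDescriptor(annotations map[string]string, diffID string) (Descriptor, bool, error) {
	raw, found := annotations[annotationPrefix+diffID]
	if !found {
		return Descriptor{}, false, nil
	}
	var desc Descriptor
	if err := json.Unmarshal([]byte(raw), &desc); err != nil {
		return Descriptor{}, false, fmt.Errorf("unmarshal descriptor for %s: %w", diffID, err)
	}
	return desc, true, nil
}
```

The same map could back either manifest annotations or config labels, since both are plain string-to-string maps.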

WDYT?

developer-guy avatar Oct 06 '22 10:10 developer-guy

Hey sorry, I was on leave for a while 😅

Yes, this is definitely something we could do! I jotted down some notes a while back, let me just dump them here:

KOCACHE_REPO=gcr.io/jonjohnson-test/kocache

Tag = BuildId

Repo acts as map[BuildId]CacheManifest

CacheManifest {
  InlinedConfig
  Layer[0] = OriginalDescriptorBlob
  Layer[1..n] = ForeignDesc # Optional...
  Annotations
}

InlinedConfig gives access to diffids.

Layer[0] preserves the descriptor of the layer we built so it keeps the media type, annotations, etc. (Need some way to invalidate that if preserveMediaType is flipped? Annotation that is a checksum?) Inlined for speed via data field.

Layer[1..n] tracks where we've pushed this layer before, maybe we keep it in the kocacherepo as well. Annotations for staleness? Use foreign layer to reference other repos/registries in urls.

Annotations contain ko and go version metadata for determining staleness.
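
To make the notes above a bit more concrete, here is a rough Go sketch of the data model, with an in-memory map standing in for the cache repo. The field names, the annotation keys, and the `IsStale` and `Lookup` helpers are all assumptions for illustration; the notes do not pin any of them down:

```go
package main

// Descriptor is a simplified stand-in for an OCI descriptor.
type Descriptor struct {
	MediaType   string
	Digest      string
	Size        int64
	URLs        []string // set on foreign layers that point at other registries
	Annotations map[string]string
}

// CacheManifest mirrors the notes: one manifest per build ID (the tag).
type CacheManifest struct {
	DiffIDs     []string          // taken from the inlined config
	Built       Descriptor        // Layer[0]: descriptor of the layer we built
	Pushed      []Descriptor      // Layer[1..n]: optional, where it was pushed before
	Annotations map[string]string // toolchain metadata for staleness checks
}

// Hypothetical annotation keys; the notes do not name them.
const (
	goVersionKey = "dev.ko.cache/go-version"
	koVersionKey = "dev.ko.cache/ko-version"
)

// IsStale reports whether m was produced by a different go or ko version.
func IsStale(m CacheManifest, goVersion, koVersion string) bool {
	return m.Annotations[goVersionKey] != goVersion ||
		m.Annotations[koVersionKey] != koVersion
}

// Lookup treats the repo as map[BuildID]CacheManifest and hides stale entries.
func Lookup(repo map[string]CacheManifest, buildID, goVersion, koVersion string) (CacheManifest, bool) {
	m, ok := repo[buildID]
	if !ok || IsStale(m, goVersion, koVersion) {
		return CacheManifest{}, false
	}
	return m, true
}
```

In a real implementation the map lookup would be a manifest fetch by tag from KOCACHE_REPO; the staleness check would stay the same.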

Cache repo can be used to store all built images as well so we can easily mount into target registry if possible but just copy if not.

New `ko cache` command for managing this repo (GC, sync, etc) and the dir. Directory is local so much faster to try first and we can cache through to it if both are set. May need a way to shard cacherepo for large number of tags?

I think what you're describing is very similar to what I had in mind here?

jonjohnsonjr avatar Oct 06 '22 16:10 jonjohnsonjr

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Keep fresh with the 'lifecycle/frozen' label.

github-actions[bot] avatar Jan 05 '23 01:01 github-actions[bot]