
fix: prevent cache collisions when URLs return duplicate ETags

Open javacruft opened this issue 1 month ago • 4 comments

Fixes #1944, where multi-arch builds failed because Alpine Linux returns the same ETag for different signing keys. The cache used only the ETag as its key, so different files collided.

Changes:

  • Add URL hash to cache key: etag + sha256(url)[:4]
  • Each unique URL gets a unique cache entry regardless of ETag
  • Add integration test simulating Alpine Linux duplicate ETags
  • Add concurrency test for multi-arch build scenario
  • Add backward compatibility test documenting cache format change
  • Simplify and remove low-value unit tests
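
A minimal sketch of the key scheme described in the first bullet, for illustration only. The function name `cacheKey`, the `-` separator, and reading `[:4]` as the first 4 bytes of the digest (8 hex characters) are assumptions, not taken from the PR diff; the URLs and ETag value are made up.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey combines the server-supplied ETag with a short hash of the URL.
// Assumptions (not from the PR): a "-" separator, and the first 4 bytes
// of the SHA-256 digest rendered as 8 hex characters.
func cacheKey(etag, url string) string {
	sum := sha256.Sum256([]byte(url))
	return etag + "-" + hex.EncodeToString(sum[:4])
}

func main() {
	// Two hypothetical key URLs that return the same ETag.
	etag := "same-etag"
	a := cacheKey(etag, "https://example.test/keys/x86_64/signing.rsa.pub")
	b := cacheKey(etag, "https://example.test/keys/aarch64/signing.rsa.pub")

	fmt.Println(a)
	fmt.Println(b)
	fmt.Println("distinct:", a != b)
}
```

With an ETag-only key both URLs would map to the same entry; with the URL hash appended, the two keys differ unless the truncated hashes happen to collide.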

Test results:

  • All existing tests pass
  • Integration test verifies end-to-end cache behavior
  • Concurrency test validates multi-arch scenario with goroutines
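
The multi-arch scenario the concurrency test targets can be pictured roughly as below. This is not the test from the PR: `memCache`, `keyFor`, and `fetchAll` are hypothetical stand-ins, the cache is an in-memory map rather than apko's on-disk cache, and the key format is the same assumed one (`-` separator, 4 digest bytes).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// memCache is a hypothetical in-memory stand-in for the real cache.
type memCache struct {
	mu      sync.Mutex
	entries map[string]string
}

func (c *memCache) put(key, body string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = body
}

func (c *memCache) get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	body, ok := c.entries[key]
	return body, ok
}

// keyFor builds the assumed "etag-<short url hash>" key.
func keyFor(etag, url string) string {
	sum := sha256.Sum256([]byte(url))
	return etag + "-" + hex.EncodeToString(sum[:4])
}

// fetchAll stores one entry per URL from its own goroutine, all sharing
// the same ETag, and returns the filled cache.
func fetchAll(etag string, bodies map[string]string) *memCache {
	c := &memCache{entries: make(map[string]string)}
	var wg sync.WaitGroup
	for url, body := range bodies {
		wg.Add(1)
		go func(url, body string) {
			defer wg.Done()
			c.put(keyFor(etag, url), body)
		}(url, body)
	}
	wg.Wait()
	return c
}

func main() {
	// Hypothetical per-architecture key URLs with distinct contents.
	bodies := map[string]string{
		"https://example.test/keys/x86_64/signing.rsa.pub":  "key-for-x86_64",
		"https://example.test/keys/aarch64/signing.rsa.pub": "key-for-aarch64",
	}
	c := fetchAll("same-etag", bodies)
	for url, want := range bodies {
		got, ok := c.get(keyFor("same-etag", url))
		fmt.Println(url, "ok:", ok && got == want)
	}
}
```

Each goroutine plays the role of one architecture's build; the check at the end confirms that every URL reads back its own body even though all responses carried the same ETag.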

javacruft, Nov 21 '25 15:11

We write the keys out inside the built image, so we might as well store them by filename, since we already have a bug where two different keys with the same filename but a different full URL / prefix cannot be used together.

Unless we want to parse each key and store it by something like its RSA public key properties (or a full sha256?), which would also fix the bug of distinct keys sharing a name.
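
One way to read the "full sha256" alternative is a content-addressed key: the cache entry is named after the digest of the key bytes themselves, so two distinct keys never collide regardless of filename or URL. A rough sketch, assuming a hypothetical `contentKey` helper and placeholder key bodies:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentKey names a cache entry after the full SHA-256 of its bytes.
// Hypothetical helper, not part of apko.
func contentKey(body []byte) string {
	sum := sha256.Sum256(body)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Two placeholder keys that share a filename but differ in content.
	first := []byte("-----BEGIN PUBLIC KEY----- (placeholder A)")
	second := []byte("-----BEGIN PUBLIC KEY----- (placeholder B)")

	fmt.Println("same name, different content, distinct entries:",
		contentKey(first) != contentKey(second))
	fmt.Println("identical content, same entry:",
		contentKey(first) == contentKey(first))
}
```

The trade-off is that the entry name can only be computed after the file has been downloaded, so a lookup from URL to digest would still be needed to avoid refetching.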

xnox, Nov 22 '25 23:11

> We write the keys out inside the built image, so we might as well store them by filename, since we already have a bug where two different keys with the same filename but a different full URL / prefix cannot be used together.
>
> Unless we want to parse each key and store it by something like its RSA public key properties (or a full sha256?), which would also fix the bug of distinct keys sharing a name.

The short hash is calculated from the full URL, so I think this should avoid same-filename collisions.

javacruft, Nov 25 '25 11:11

If we already structure the cache by URL and want to use URL subdirectories, could we simply use the full URL as the path, instead of having the ETag as the leaf masquerading as the filename?

That is, I think it is more user-friendly to simply use the filename from the URL than a truncated sha256 of the very same filename.

xnox, Dec 06 '25 19:12

I like the ETag collision tests, in case Alpine stops having the key collision.

I don't like the sha256 URL suffix. It looks odd.

xnox, Dec 06 '25 19:12