fix: prevent cache collisions when URLs return duplicate ETags
Fixes #1944 where multi-arch builds failed because Alpine Linux returns the same ETag for different signing keys. The cache used only ETags as keys, causing different files to collide.
Changes:
- Add URL hash to cache key: etag + sha256(url)[:4]
- Each unique URL gets a unique cache entry, regardless of ETag
- Add integration test simulating Alpine Linux duplicate ETags
- Add concurrency test for multi-arch build scenario
- Add backward compatibility test documenting cache format change
- Simplify and remove low-value unit tests
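
For reference, the key derivation in the first bullet can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the helper name `cacheKey`, the `-` separator, and the reading of `[:4]` as the first 4 bytes of the digest (8 hex characters) are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// cacheKey is a hypothetical helper that combines an ETag with a short
// hash of the full URL, so two URLs that report the same ETag still map
// to different cache entries.
func cacheKey(etag, url string) string {
	sum := sha256.Sum256([]byte(url))
	// Assumption: "[:4]" means the first 4 bytes of the SHA-256 digest,
	// which hex-encode to 8 characters. The "-" separator is also assumed.
	return etag + "-" + hex.EncodeToString(sum[:4])
}
```

With this shape, the same ETag fetched from an `x86_64` URL and an `aarch64` URL yields two distinct keys, while repeated fetches of one URL keep hitting the same entry.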
Test results:
- All existing tests pass
- Integration test verifies end-to-end cache behavior
- Concurrency test validates multi-arch scenario with goroutines
We write the keys into the built image anyway, so we might as well store them by filename. We already have a bug where two different keys with the same filename but a different full URL / prefix cannot be used together.
Unless we want to parse the key and store it by its RSA public key properties (or its full sha256?), which would also solve the bug of distinct keys under the same name.
The short hash is calculated from the full URL, so I think this should avoid collisions between keys that share a filename.
If the cache is already structured by URL and we want to use URL subdirectories, shouldn't we simply use the full URL, instead of an ETag as the leaf masquerading as the filename?
As in, I think it is more user-friendly to simply use the filename from the URL than a truncated sha256 of the very same filename.
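
To make the suggestion concrete, here is a rough sketch of a URL-derived layout. It is only an illustration of the idea: `cachePathFromURL` and the `root/host/path` layout are hypothetical and do not describe how the cache is organized today.

```go
package main

import (
	"net/url"
	"path"
	"path/filepath"
)

// cachePathFromURL is a hypothetical helper that maps a URL to
// root/<host>/<url path>, so the leaf is the real filename from the URL
// and keys that share a filename land in different directories.
func cachePathFromURL(root, rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	// Cleaning a rooted path resolves any ".." segments, so the result
	// cannot escape the cache root.
	clean := path.Clean("/" + u.Path)
	return filepath.Join(root, u.Host, filepath.FromSlash(clean)), nil
}
```

Under this layout the ETag would no longer be part of the path, so it would have to be tracked some other way if it is still needed for revalidation.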
I like the ETag collision tests, in case Alpine stops having the key collision.
I don't like the sha256 URL suffix. Looks odd.