
Make the cache more Git aware

Open mumrah opened this issue 11 months ago • 0 comments

Today, we have the following cache entry resolution:

  1. An exact match on OS, job id, workflow name, matrix, and Git SHA
  2. The most recent entry saved for the same OS, job id, workflow name, and matrix values
  3. The most recent entry saved for the same OS and job id
  4. The most recent entry saved for the same OS
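To make the fallback order concrete, here is a rough in-memory model of the four cases. It is only a sketch: `CacheEntry`, its field names, `saved_at`, and `resolve` are made up for illustration and are not how setup-gradle or the GH cache API actually store or look up entries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CacheEntry:
    # Hypothetical record; field names are illustrative only.
    os: str
    job_id: str
    workflow: str
    matrix: str
    sha: str
    saved_at: int  # larger value means saved more recently


# Each case compares a progressively shorter set of fields.
CASES = [
    ("os", "job_id", "workflow", "matrix", "sha"),  # case 1: exact match
    ("os", "job_id", "workflow", "matrix"),         # case 2
    ("os", "job_id"),                               # case 3
    ("os",),                                        # case 4
]


def resolve(entries: list[CacheEntry], want: CacheEntry) -> Optional[CacheEntry]:
    """Return the newest entry matching the first case that has any match."""
    for fields in CASES:
        matches = [
            e for e in entries
            if all(getattr(e, f) == getattr(want, f) for f in fields)
        ]
        if matches:
            return max(matches, key=lambda e: e.saved_at)
    return None
```

Note that in this model, any request whose SHA has no entry falls straight to case 2 and gets whichever entry was saved last, regardless of Git history.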

It would be useful to include an additional SHA-based match case between cases 1 and 2. This would match based on the most recent commit in the PR that had a cache entry.


Scenario: Three commits A, B, and C exist on main. Cache entries exist for each commit.

C (HEAD) - B - A

Now, consider a PR that has only merged commits up to B. It will miss case 1 (exact SHA) and fall through to case 2 (job + matrix). This would likely load the cache produced by C, which could be sub-optimal for the PR build. If setup-gradle could load the cache from commit B instead, it could lead to more FROM-CACHE hits in the PR build.

As a human observer, it's easy to see that commit B would be a better cache to load, since commit B is in the PR's history. Similarly, C would be a bad cache to load, since commit C is not in the PR. I'm sure automating this is not so easy.
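As a starting point, the selection itself could look something like the sketch below. It assumes two inputs that would have to come from somewhere: the PR head's commit history ordered newest first (the order `git rev-list HEAD` prints), and the set of SHAs that currently have a cache entry. `nearest_cached_ancestor` is a hypothetical helper, not an existing setup-gradle function, and getting the set of cached SHAs out of the GH cache API is the part this sketch does not address.

```python
from typing import Iterable, Optional


def nearest_cached_ancestor(
    history: Iterable[str], cached_shas: set[str]
) -> Optional[str]:
    """Return the first SHA in `history` that has a cache entry.

    `history` is assumed to be ordered newest first, so the first hit is
    the most recent commit reachable from the PR head with a saved cache.
    """
    for sha in history:
        if sha in cached_shas:
            return sha
    return None  # no commit in the PR's history has a cache entry


# Scenario from above: A, B, and C are cached on main; the PR is based on B.
pr_history = ["P1", "B", "A"]  # "P1" stands in for the PR's own commit
print(nearest_cached_ancestor(pr_history, {"A", "B", "C"}))
```

Because C never appears in the PR's history, it is never considered, and B is selected. If nothing matches, the lookup would fall through to the existing case 2.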

This might be challenging due to the GH cache API, but it would be good to explore this.

mumrah avatar Feb 18 '25 20:02 mumrah