[BUG] Too many caches in GitHub Actions CI
At present we have a number of different caches used across several GitHub Actions workflows. In terms of cache size, the most significant one is that for caching Maven dependencies (~/.m2/repository).
The caching of Maven repositories is performed by these workflows:
- ci-deploy.yml
  - Target: The master and develop branches, and any PRs opened against them.
  - Jobs:
    - Build and Test Images
  - Cache key: ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}
  - Restore key: ${{ runner.os }}-m2
- ci-test.yml
  - Target: All branches, and all PRs (ignoring the Code Coverage step, which only executes for the develop branch).
  - Jobs:
    - License check
    - Dependency checks
    - Javadoc
    - Test
  - Cache key: setup-java-${{ platform }}-maven-${{ hashFiles('**/pom.xml') }}
  - Restore key: setup-java-${{ platform }}-maven-${{ hashFiles('**/pom.xml') }}
- ci-xqts.yml
  - Target: All branches, and all PRs.
  - Jobs:
    - W3C XQuery Test Suite
  - Cache key: setup-java-${{ platform }}-maven-${{ hashFiles('**/pom.xml') }}
  - Restore key: setup-java-${{ platform }}-maven-${{ hashFiles('**/pom.xml') }}
- sonarcloud.yml
  - Target: The master and develop branches, and all PRs.
  - Jobs:
    - SonarCloud Analysis
  - Cache key: ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}
  - Restore key: ${{ runner.os }}-m2
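For orientation, the two key formats listed above correspond roughly to the two caching styles sketched here. This is a reconstruction from the keys, not a copy of the actual workflow files: the use of actions/cache (and its v3 version) for the `${{ runner.os }}-m2` keys is an assumption, and the `distribution` and `java-version` values are placeholders.

```yaml
# Assumed style behind the "${{ runner.os }}-m2-..." keys
# (ci-deploy.yml, sonarcloud.yml): an explicit cache step.
- name: Cache Maven repository
  uses: actions/cache@v3          # assumption: action and version not stated in this issue
  with:
    path: ~/.m2/repository
    key: ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}
    restore-keys: |
      ${{ runner.os }}-m2

# Assumed style behind the "setup-java-...-maven-..." keys
# (ci-test.yml, ci-xqts.yml): the built-in caching of actions/setup-java.
- uses: actions/setup-java@v3
  with:
    distribution: temurin         # placeholder
    java-version: '17'            # placeholder
    cache: maven                  # the key is generated by the action itself
```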
Problems with this approach
- The ci-deploy.yml and sonarcloud.yml workflows use the same cache key, which means that a PR that changes anything in any pom.xml will cause that cache to be replaced by both workflows. This in turn leads to a race condition: whichever workflow runs last will win (i.e. have its cache persisted). The problem is that each workflow produces a different set of Maven dependencies (i.e. a different set of cacheable dependencies), as they execute different parts of the build. Therefore each time the cache is retrieved after a pom.xml file was changed, the cache may be invalid for both workflows and thus pointless; it may actually slow down the build!
- The ci-test.yml and ci-xqts.yml workflows also share a cache key, but will generate different caches. They have the same problem as described in the first item above.
- We have many PRs that each change a single dependency in a single pom.xml. These are autogenerated by @dependabot-bot. Using @dependabot-bot is great, but due to how we have configured our Caches for Maven (i.e. the actions/setup-java@v3 GitHub Action with cache: 'maven' in ci-test.yml), each @dependabot-bot PR will create a new Cache.
- Each of our caches for Maven dependencies is ~360MB at present. GitHub limits the amount of cache space for a project to a generous 10GB, but we frequently hit this ceiling.

Insights
- PRs generated by @dependabot-bot will each create a Cache that can only be used by that PR and then, if merged, by subsequent builds of the develop branch and any later PRs to develop (that don't change any pom.xml). Each of these Caches is expensive in terms of disk space (~360MB at present).
- The space available for Caches is limited to 10GB; each cache joins a queue for eviction when the 10GB of disk space is exceeded. This queue is FIFO based and not an LRU. This means that if a great deal of the total 10GB of available cache space is taken up by @dependabot-bot caches (that may only be used twice), interspersed with caches that are used more frequently, then these @dependabot-bot caches may cause more frequently used caches to be evicted sooner. A future workflow that would have used that "hotter" cache will have to re-download these dependencies and create a new Cache, thus slowing down that workflow.
- The cache keys used for the Maven Caches include a ${{ platform }} component. However, as this is Java, there is very little in the way of platform-specific dependencies, and so this component could be removed, which would reduce the number of caches by 3x (our platforms: Linux, macOS, Windows).
- The majority of changes to the pom.xml files come from Maven dependency updates courtesy of @dependabot-bot PRs.
- The majority of non-dependabot PRs do not change the pom.xml files.
- The majority of PRs against a branch make either zero or one change (e.g. @dependabot-bot) to the dependencies in a pom.xml file. If such a PR could reuse a cache generated for the base-branch, then that cache would already contain > 99% of the required dependencies.
- The cache for a branch need only be replaced after that branch has had a change (PR) merged that adds, updates, or otherwise changes a dependency (i.e. a pom.xml file).
- Instead of using the cache: maven property of the GitHub Action actions/setup-java@v3, which will cache and match on exact cache keys, we can directly use the key and restore-keys properties of the actions/cache@v3 GitHub Action to potentially read and write from different caches.
- The actions/cache@v3 GitHub Action provides a granularity of configuration that allows us to have some workflow jobs that only read from the cache, whilst others may write to the cache, or both.
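As a minimal sketch of the last two insights, the steps below replace the cache: maven property with an explicit actions/cache@v3 step whose key has no platform component. The maven-cache prefix, the `distribution` and `java-version` values, and the `mvn -B verify` command are placeholders chosen for illustration, not values taken from our workflows.

```yaml
steps:
  - uses: actions/checkout@v3
  - uses: actions/setup-java@v3
    with:
      distribution: temurin       # placeholder
      java-version: '17'          # placeholder
      # no "cache: maven" here; caching is handled by the explicit step below
  - name: Cache Maven repository
    uses: actions/cache@v3
    with:
      path: ~/.m2/repository
      # exact match first ...
      key: maven-cache-${{ hashFiles('**/pom.xml') }}
      # ... then fall back to the most recent cache sharing this prefix
      restore-keys: |
        maven-cache-
  - run: mvn -B verify            # placeholder build command
```

For jobs that should only read or only write, the action also ships the sub-actions actions/cache/restore@v3 and actions/cache/save@v3. Note that, as far as the action's documentation describes it, Windows runners only share caches with other platforms when the enableCrossOsArchive input is set, so dropping the platform from the key may not be sufficient on its own.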
Proposed Solution
By considering:
- The lifetime of a Cache of Maven Dependencies
- The clients (workflows) that need to work with them
- The isolation of dependencies between those clients
- Whether each client needs read and/or write access to the cache
I propose the following policy for our GitHub CI Maven caches:
- One Maven Cache per Git branch.
  Due to our development model of "fork and send a PR", we really only have 5 active base-branches: master, develop, develop-6.x.x, develop-5.x.x, develop-4.x.x. It is likely that the number of active branches will decrease in the near future as we stop providing bugfixes to older versions of our product.
- PRs do not create Caches.
  A PR is taken from a base-branch. When the GitHub Actions workflows execute for a PR, they can read the Cache for the base-branch. That cache will either be 100% complete already or, if the PR changes a few dependencies, very close to 100% complete. A GitHub workflow for a PR never writes a Cache for Maven dependencies.
- Maven Caches created by Merge of PR.
  A GitHub Actions Workflow that runs after a PR has been merged to a base-branch may create a new Maven Cache. The Cache key must be named after the base-branch, so that it may be used by subsequent workflows on that branch and by new PRs against that branch. For example:
  - Cache key: maven-cache-${{github.base_ref}}-${{ hashFiles('**/pom.xml') }}
  - Restore keys:
    - maven-cache-${{github.base_ref}}-
    - maven-cache-
NOTE: The Restore keys are only used as a fallback if the cache cannot be restored by matching exactly on the Cache key first.
NOTE: It is tempting to think about removing the ${{github.base_ref}} from the Cache Key. The issue with doing that is that each of our active branches represents major differences in the product (i.e. huge dependency changes). If we were to cache and restore against a key that omits the branch name, we would have very little chance of taking the cache that is most correct for the major product version.
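The policy above could be sketched with the following two steps, one for PR workflows and one for workflows that run after a merge. This is an illustration under assumptions rather than a tested configuration: it relies on the actions/cache/restore@v3 sub-action for read-only access, and it uses ${{ github.ref_name }} in the post-merge step because ${{ github.base_ref }} is only populated for pull request events.

```yaml
# Step for workflows triggered by pull_request: read-only access to the
# base-branch cache. Nothing is saved at the end of the job.
- name: Restore Maven cache of the base-branch
  uses: actions/cache/restore@v3
  with:
    path: ~/.m2/repository
    key: maven-cache-${{ github.base_ref }}-${{ hashFiles('**/pom.xml') }}
    restore-keys: |
      maven-cache-${{ github.base_ref }}-
      maven-cache-

# Step for workflows triggered by push to a base-branch (i.e. after a merge):
# restores as above, and saves a new cache if the exact key did not exist.
- name: Restore or create Maven cache for this branch
  uses: actions/cache@v3
  with:
    path: ~/.m2/repository
    key: maven-cache-${{ github.ref_name }}-${{ hashFiles('**/pom.xml') }}
    restore-keys: |
      maven-cache-${{ github.ref_name }}-
      maven-cache-
```

Because both steps produce the same key string for a given branch and set of pom.xml files, a PR against develop would match the cache written by the last push build of develop.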
Addendum
We are also making use of the OWASP dependency-check-maven plugin in our Dependency checks job of our ci-test.yml workflow.
This plugin frequently downloads a CVE database and stores it in the folder ~/.m2/repository/org/owasp/dependency-check-data/. As our ~/.m2/repository is the basis of each of our Maven Caches, this presents a problem: the Maven Cache is not invalidated and replaced when the CVE database changes (it is only invalidated by changes to pom.xml files). Therefore on each workflow run, the same more recent CVE database needs to be downloaded again and again.
Therefore we should configure the dependency-check-maven plugin via its dataDirectory configuration property to store its data outside of the ~/.m2 repository, and we should create a separate GitHub Actions Cache just for this purpose. That cache can be safely read/written by each job.
NOTE: At this time it is not clear how we can detect a change of the CVE databases to use in the cache key. Perhaps we have to use a date/time-stamp in the cache key which is based on the cveValidForHours configuration value of the dependency-check-maven plugin?
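One possible shape for such a date-stamped cache is sketched below. The daily granularity is an assumption that would need to be aligned with cveValidForHours, and the path ~/.cache/dependency-check-data is a hypothetical location that would have to match whatever dataDirectory is configured for the plugin.

```yaml
- name: Compute date stamp for the CVE data cache key
  id: cve-date
  shell: bash
  # Assumption: one cache per UTC day is fresh enough for the CVE data.
  run: echo "stamp=$(date -u +%Y-%m-%d)" >> "$GITHUB_OUTPUT"

- name: Cache OWASP dependency-check data
  uses: actions/cache@v3
  with:
    # Hypothetical path; must match the plugin's dataDirectory setting.
    path: ~/.cache/dependency-check-data
    key: dependency-check-data-${{ steps.cve-date.outputs.stamp }}
    # Fall back to the most recent earlier data so the plugin only has to
    # download updates rather than the whole database.
    restore-keys: |
      dependency-check-data-
```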