How to optimise checkout of a monorepo?

For a monorepo of size significantly greater than 1GB, cloning or fetching the repository can be time intensive and degrade the performance of a job.

In a self-hosted runner, one way to remedy the situation is to have a pre-provisioned repository in the runner. That way, improvement can be gained...

Using `git clone ...`

There are two methods for optimising `git clone ...`:

1. `--reference[-if-able] <repository>`. Git will use the local cache first if available, then reach out to the upstream remote for newer objects that are not in the cache.
2. `git config --global url."file:///local/repository".insteadOf https://github.com/remote/repository.git`. This can be used for cloning, but not for fetch or push.
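To make the two methods concrete, here is a minimal sketch that runs entirely offline. All paths are hypothetical: a throwaway bare repository stands in for the upstream remote, and a second bare clone stands in for the pre-provisioned repository on the runner. For the second method it uses a one-off `git -c` setting rather than `--global`, purely so that the sketch does not touch your global config.

```shell
#!/bin/sh
set -eu

# Hypothetical layout in a temp dir, so nothing here needs network access.
work=$(mktemp -d)
upstream="$work/upstream.git"   # stands in for the remote repository
cache="$work/cache.git"         # stands in for the pre-provisioned repo on the runner

# Build a tiny "upstream" containing a single commit.
git init --quiet "$work/seed"
git -C "$work/seed" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial commit"
git clone --quiet --bare "$work/seed" "$upstream"

# Pre-provision the local cache (on a real runner this would be done ahead of the job).
git clone --quiet --bare "file://$upstream" "$cache"

# Method 1: borrow objects from the cache; only objects missing from it
# are transferred from upstream.
git clone --quiet --reference-if-able "$cache" "file://$upstream" "$work/job1"

# The new clone records the cache's object store as an alternate.
cat "$work/job1/.git/objects/info/alternates"

# Method 2: rewrite the remote URL to the local cache for this one command.
# The https URL is the placeholder from the post; it is never contacted here.
git -c url."file://$cache".insteadOf=https://github.com/remote/repository.git \
    clone --quiet https://github.com/remote/repository.git "$work/job2"
```

Note that a clone made with `--reference[-if-able]` keeps depending on the cache directory for its objects, so the cache must stay in place for as long as the working clone is used.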

Question❓

It is currently difficult or impossible to leverage the above measures with `actions/checkout`, for the following reasons:

1. Rather than `git clone ...`, `git init` is used. Hence, the `--reference[-if-able] <repository>` option is not supported by `actions/checkout`. Is there any reason why the `git init` flow was preferred over `git clone ...`❓ Or can we switch to `git clone ...` and then support the `--reference[-if-able] <repository>` option?
2. Would it help if I pointed `actions/checkout` to an existing git repository?

Nov 27 '23 08:11 igwejk

On the second question: I suppose it won't help to point to a directory with an existing repository, due to the following aspect of the current implementation.

Nov 27 '23 08:11 igwejk

Same for my 7 GB repo 😢 The only option is to simply use CLI command steps instead of the checkout action. I wonder if there will be support for using an existing pre-cached repo on the runner, instead of a fresh new init and clone? 🙏

Jan 02 '24 21:01 Dmitry1987
