
How to optimise checkout of a monorepo?

Open igwejk opened this issue 2 years ago • 2 comments

For a monorepo of size significantly greater than 1GB, cloning or fetching the repository can be time intensive and degrade the performance of a job.

In a self-hosted runner, one way to remedy the situation is to have a pre-provisioned repository in the runner. That way, improvement can be gained...

Using git clone ...

There are two methods for optimising git clone ...

  • --reference[-if-able] <repository>. git will use the local cache first if available and then reach out to upstream remote for newer objects that are not there.
  • git config --global url."file:///local/repository".insteadOf https://github.com/remote/repository.git. This can be used for cloning, but not for fetch or push.
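
As a rough illustration of the two options above, here is a minimal, offline sketch. Everything in it is a stand-in: `upstream.git` plays the remote, `cache.git` plays the pre-provisioned repository on the runner, and the `example.invalid` URL is a hypothetical placeholder. The URL rewrite is passed with `git -c` for a single command instead of `--global`, purely so the sketch does not touch the global configuration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical layout in a throwaway directory so the sketch runs offline.
WORK="$(mktemp -d)"
UPSTREAM="$WORK/upstream.git"   # stands in for the remote repository
CACHE="$WORK/cache.git"         # stands in for the pre-provisioned copy
FAKE_URL="https://example.invalid/remote/repository.git"  # placeholder URL

# Build a tiny stand-in upstream with one commit.
git init --quiet "$WORK/seed"
git -C "$WORK/seed" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial commit"
git clone --quiet --bare "$WORK/seed" "$UPSTREAM"

# Pre-provision the runner-side cache (done once, outside the job).
git clone --quiet --mirror "file://$UPSTREAM" "$CACHE"

# Option 1: borrow objects from the cache. With --reference-if-able, git
# warns and clones normally if the cache directory is missing.
git clone --quiet --reference-if-able "$CACHE" \
    "file://$UPSTREAM" "$WORK/clone-reference"

# The borrowed object store is recorded in the alternates file.
cat "$WORK/clone-reference/.git/objects/info/alternates"

# Option 2: rewrite the remote URL to the local cache for this one command.
git -c url."file://$CACHE".insteadOf="$FAKE_URL" \
    clone --quiet "$FAKE_URL" "$WORK/clone-rewritten"
```

Note that a clone made with `--reference` keeps depending on the cache's object store, so the cache must stay in place for the lifetime of the clone.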

Question❓

It is currently difficult or impossible to leverage the above measures with actions/checkout, for the following reasons:

  1. Rather than git clone ..., git init is used. Hence, the --reference[-if-able] <repository> option is not supported by actions/checkout. Is there any reason why the git init flow was preferred over git clone ...❓ Or can we switch to git clone ... and then support the --reference[-if-able] <repository> option?
  2. Would it help if I pointed actions/checkout to an existing git repository?
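
For context on the first point, here is a hedged sketch of how a git init flow could, in principle, still borrow objects from a local repository by writing git's alternates file by hand. This is not what actions/checkout does; it is only an offline illustration under assumed names (`upstream.git`, `cache.git`, and `job-repo` are all hypothetical stand-ins).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical layout in a throwaway directory so the sketch runs offline.
WORK="$(mktemp -d)"
UPSTREAM="$WORK/upstream.git"   # stands in for the remote repository
CACHE="$WORK/cache.git"         # stands in for the pre-provisioned copy
REPO="$WORK/job-repo"           # the job's working repository

# Build a tiny stand-in upstream with one commit containing one file.
git init --quiet "$WORK/seed"
echo "hello" > "$WORK/seed/README"
git -C "$WORK/seed" add README
git -C "$WORK/seed" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet -m "initial commit"
git clone --quiet --bare "$WORK/seed" "$UPSTREAM"

# Pre-provision the runner-side cache (done once, outside the job).
git clone --quiet --mirror "file://$UPSTREAM" "$CACHE"

# init-based flow: create an empty repository and add the remote.
git init --quiet "$REPO"
git -C "$REPO" remote add origin "file://$UPSTREAM"

# Manual equivalent of --reference: list the cache's object directory
# in the alternates file so its objects are visible to this repository.
mkdir -p "$REPO/.git/objects/info"
echo "$CACHE/objects" >> "$REPO/.git/objects/info/alternates"

# Fetch the remote HEAD and check it out in detached state.
git -C "$REPO" fetch --quiet origin HEAD
git -C "$REPO" checkout --quiet --detach FETCH_HEAD
```

The same caveat as with `--reference` applies: the job repository depends on the cache's object store for as long as the alternates entry exists.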

igwejk avatar Nov 27 '23 08:11 igwejk

On the second question: I suppose it won't help to point to a directory with an existing repository, due to the following aspect of the current implementation.

Image

igwejk avatar Nov 27 '23 08:11 igwejk

Same for my 7 GB repo 😢 The only option is to use plain CLI command steps instead of the checkout action. I wonder if there will be support for using an existing pre-cached repo on the runner, instead of a fresh init and clone? 🙏

Dmitry1987 avatar Jan 02 '24 21:01 Dmitry1987