build-image
Make a "shallow clone" of the repo.
Is your feature request related to a problem? Please describe.
The full git clone can be time consuming and the full history isn't required to make a build of a site.
Describe the solution you'd like
The request is for the build command to make only a "shallow clone" by adding the --depth 1 option to the command.
This might also include explicitly cloning submodules in the same way:
git clone --depth 1 --recursive --shallow-submodules
Describe alternatives you've considered
Leaving this as it is.
Additional context
The shallow clone can dramatically reduce both the time and bandwidth required for the clone. With the limits on build time, reducing potential waste in the initial cloning would be a nice improvement.
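To illustrate the difference in fetched history, here is a minimal sketch that builds a throwaway repository and clones it both ways. The repository contents, directory names, and commit count are made up for the demonstration; it says nothing about how the Netlify build image actually clones.

```shell
#!/usr/bin/env bash
# Sketch: compare the history fetched by a full clone and a --depth 1 clone.
# Everything runs in a temporary directory; no network access is needed.
set -euo pipefail

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT

# Build a small source repository with five commits (hypothetical content).
git init --quiet "$work/origin"
for i in 1 2 3 4 5; do
  echo "revision $i" > "$work/origin/page.txt"
  git -C "$work/origin" add page.txt
  git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet -m "commit $i"
done

# A file:// URL is used because git ignores --depth for plain local paths.
git clone --quiet "file://$work/origin" "$work/full"
git clone --quiet --depth 1 "file://$work/origin" "$work/shallow"

# Count the commits reachable from HEAD in each clone.
full_count="$(git -C "$work/full" rev-list --count HEAD)"
shallow_count="$(git -C "$work/shallow" rev-list --count HEAD)"

echo "full clone commits:    $full_count"
echo "shallow clone commits: $shallow_count"
```

The shallow clone carries only the tip commit, which is where the time and bandwidth savings on a repository with a long history would come from.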
+1 to adding the --recursive option. See https://github.com/netlify/build-image/issues/316#issuecomment-511553688 for an example of why this would be necessary.
+1 to recursive: missing this is a pretty big downside to using Netlify if you are in a situation that needs nested submodules.
@overlordofmu any news about this feature? Without it, for Hugo builds one has to copy the theme folder into the repo, which adds volume to the repo for no reason.
No, you don't:
command = "git submodule update --init --recursive --depth=1 && hugo --minify"
in your netlify.toml should do the trick.
@orf Thank you, it worked like a charm.
Going back to the original request, is there a way to specify a shallow clone of the repo?
It is not possible, @mikejurka.
If there was a workaround we normally would mention it in the feature request but no one is perfect and it never hurts to ask. In this case, though, the answer is "no". There is no workaround.
Now, that being said, it does appear for certain workflows there can be workarounds. See @orf's comment above. For the general case of "always clone the primary repo this way" there is no workaround.
@mikejurka @overlordofmu the workaround from @orf has been working for more than one year in two of our production setups: https://reva.link and https://sciencemesh.io
@overlordofmu I think this "workaround" is worth documenting.
It’s bonkers to me that this feature isn’t trivial to add. A short comment with some technical details why would be much appreciated, as this significantly reduces build times for projects with large submodules.
@labkode Unfortunately the workaround isn't for shallow cloning of the repo, only for recursive cloning of submodules. I don't have submodules, and I'm primarily interested in a shallow clone of my main repo to make deploys faster.
I'm still having this issue - I have a repo that grows by a few commits every 20-30 minutes, and deploys. They're getting progressively slower, and it's not cost effective - I will have to move off Netlify if there is no solution.
Repo is ~600mb, and 65k commits.
@ukd1, I would normally expect successive builds to start from the build image's cached version of the repo, which would prevent the steady growth of the build time. On the other hand, if the repo and site are growing and more and more pages/assets are being deployed, then it would make sense for the build time to increase linearly with the increased page/asset count.
Now, I do believe it could just be an increase caused by the full clone as stated, but I would like to make sure there isn't something else occurring. I tried to find the site in question but I couldn't connect the dots from your GitHub user.
Would you please email [email protected] to open a support ticket about this and tell us which site this is happening for? If so, we'll take a closer look to confirm the root cause.
@overlordofmu it's pre-built, the problem is just that @netlify is pulling the whole repo, not a shallow clone. This ended up costing me ~$30 a month of "build time" which was 99.9% cloning. My last build log:
11:37:16 AM: Build ready to start
11:37:18 AM: build-image version: 081db65c3e4ce8423fedb40e7689a87de6f84667
11:37:18 AM: build-image tag: v4.3.1
11:37:18 AM: buildbot version: b39181dd1c0d3adcb7b5f8e99f3059512848225c
11:37:19 AM: Fetching cached dependencies
11:37:19 AM: Starting to download cache of 413.2MB
11:37:22 AM: Finished downloading cache in 3.59015073s
11:37:22 AM: Starting to extract cache
11:37:29 AM: Finished extracting cache in 7.228423874s
11:37:29 AM: Finished fetching cache in 10.928202078s
11:37:29 AM: Starting to prepare the repo for build
11:37:30 AM: Preparing Git Reference refs/heads/master
11:38:20 AM: Parsing package.json dependencies
11:38:21 AM: No build steps found, continuing to publishing
11:38:21 AM: Starting to deploy site from 'public'
11:38:21 AM: Creating deploy tree
11:38:21 AM: Creating deploy upload records
11:38:21 AM: 25 new files to upload
11:38:21 AM: 0 new functions to upload
11:38:22 AM: Starting post processing
11:38:22 AM: Post processing - HTML
11:38:29 AM: Post processing - header rules
11:38:29 AM: Post processing - redirect rules
11:38:29 AM: Post processing done
11:38:29 AM: Site is live ✨
11:39:32 AM: Finished processing build request in 2m13.547153636s
Between:
11:37:29 AM: Starting to prepare the repo for build
...
11:38:21 AM: No build steps found, continuing to publishing
is nearly a full minute of "git stuff" just hanging there, and that specifically is costing me >$40 a month. Here is a partial month:
Obviously y'all make money from this, but it's also slowing my builds down. When I build on Circle, or Vercel - I can shallow clone if I want and it's way faster.
Either way, it's moot for me now: a) this has been open so long I don't think y'all will fix it; b) I've churned and moved to @vercel, which shallow clones (it takes ~3-5 seconds); c) the interesting thing for me is that, as this is a side project, I want it to be cheap. However, it's made me check out other tech, which I may now use for business as well.
If y'all fix it, lmk.
cc @dscandalios
I do encourage people to +1 in the comments here if you want to see this changed, as people showing interest directly impacts the likelihood of the feature request becoming a reality.
Sorry to learn you have moved to a competitor and thank you for explaining your workflow, @ukd1. I only wish you had given us a chance to assist you. If you had explained your workflow earlier we could have recommended a zero cost solution.
If you are not running a site build command in the build image, then the recommended workflow would have been to do a manual deploy using the Netlify CLI instead. You can integrate the Netlify CLI into GitHub Actions to manually deploy directly from GitHub.
This skips the build process entirely and there are no build minutes calculated for manual deploys. Also, the CLI tool uses our API and only uploads changed files. So, in the example build above, this means that only the twenty-five changed files would have been transferred from the GitHub action to Netlify. I think we can all agree that this would undoubtedly be more efficient than even the shallow clone would be. We would have suggested this sooner if we had this information before today.
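As a rough sketch of the manual deploy described above, the step a CI job might run could look like the following. The function name `deploy_prebuilt`, the `DRY_RUN` switch, and the `public` directory are assumptions made for illustration; the sketch also assumes the Netlify CLI is already installed in the job and that `NETLIFY_AUTH_TOKEN` and `NETLIFY_SITE_ID` are provided as CI secrets.

```shell
#!/usr/bin/env bash
# Sketch: deploy an already-built directory with the Netlify CLI instead of
# letting the build image clone and build the repo.
set -euo pipefail

# deploy_prebuilt is a hypothetical helper, not part of the Netlify CLI.
deploy_prebuilt() {
  local dir="${1:-public}"
  local cmd=(netlify deploy --prod --dir "$dir")

  # DRY_RUN=1 only prints the command, so the sketch can be tried offline.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "${cmd[*]}"
    return 0
  fi

  # Fail early with a clear message if the credentials are missing.
  : "${NETLIFY_AUTH_TOKEN:?set NETLIFY_AUTH_TOKEN}"
  : "${NETLIFY_SITE_ID:?set NETLIFY_SITE_ID}"
  "${cmd[@]}"
}

# Show what would be run without contacting Netlify.
DRY_RUN=1 deploy_prebuilt public
```

In a real job the last line would be called without `DRY_RUN`, after whatever step produces the `public` directory.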
About the reasons for the delay on this feature request - the primary factors delaying this, in my view from within Netlify, are:
- relatively low levels of interest in this feature request
- technical debt
So, first, there are relatively few requests for this compared to other features. People are not beating down our door to get this fixed.
Yes, there are interested parties +1-ing here but, relative to other feature requests, the interest in this one does not make it to the top of the list. It's open. It's tracked. It's in the backlog. It just doesn't appear to be a top priority for the vast majority of people using Netlify today.
The second factor is technical debt. The best example I can give is the build ignore command.
The default build.ignore command used if someone doesn't modify it is this:
git diff --quiet "${COMMIT_REF}" "${CACHED_COMMIT_REF}" .
And guess what doesn't work at all with shallow clones? If you guessed "git diff doesn't work with shallow clones" then you answered correctly.
So, before we can add a shallow clone feature, it needs to be optional because many people depend on the build ignore command and this shallow clone feature would break an existing feature. Also, all other dependencies (dependencies like the build ignore command) also need new exception handling added to deal with the new edge cases that are introduced by optionally making shallow clones instead.
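The interaction between the default ignore command and a shallow clone can be reproduced locally. The sketch below is only an illustration under assumed conditions: the two commits, the `commit_page` helper, and the `decision` variable are hypothetical, and the guard at the end is one possible kind of exception handling, not what Netlify's buildbot does.

```shell
#!/usr/bin/env bash
# Sketch: why the default ignore command breaks in a --depth 1 clone, and a
# guard that checks for the cached commit before diffing.
set -euo pipefail

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT

# Hypothetical helper: write a new revision of one file and commit it.
commit_page() {
  echo "$1" > "$work/origin/page.txt"
  git -C "$work/origin" add page.txt
  git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet -m "$1"
}

git init --quiet "$work/origin"
commit_page "first"
CACHED_COMMIT_REF="$(git -C "$work/origin" rev-parse HEAD)"
commit_page "second"
COMMIT_REF="$(git -C "$work/origin" rev-parse HEAD)"

# The shallow clone only contains COMMIT_REF; CACHED_COMMIT_REF is absent.
git clone --quiet --depth 1 "file://$work/origin" "$work/shallow"

# The default ignore command fails here because the older commit is missing.
# The exit status is an error, not the 0 (same) or 1 (different) of a diff.
set +e
git -C "$work/shallow" diff --quiet "$COMMIT_REF" "$CACHED_COMMIT_REF" . 2>/dev/null
naive_status=$?
set -e
echo "naive diff exit status: $naive_status"

# Guarded variant: only diff when the cached commit exists locally.
if git -C "$work/shallow" cat-file -e "${CACHED_COMMIT_REF}^{commit}" 2>/dev/null; then
  if git -C "$work/shallow" diff --quiet "$COMMIT_REF" "$CACHED_COMMIT_REF" .; then
    decision="unchanged"
  else
    decision="changed"
  fi
else
  decision="cached commit missing"
fi
echo "guarded decision: $decision"
```

Any tool that compares against an older commit would need a similar check, or would need to fetch the missing history first, which is part of why the feature has to be optional.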
If we were supporting you and just your one web app, sure, then making this change is "trivial to add" (to quote someone above). Making a hypothetical change in a vacuum is trivial.
However, making a real world change to an existing ecosystem with hundreds of thousands of developers, millions of websites, and millions of deploys, while simultaneously being certain that the change will cause zero impact to all those existing workflows, is far from trivial. There are many "moving parts" to consider, and making this change without impacting anyone else is more complex than it might appear at first.
Again, please keep adding +1 comments as this will raise the priority of the feature request.
+1'd the feature request and signaling support for an optional shallow clone feature. Similar to ukd1 we like Netlify but are having long builds (and have similarly seen much better performance on other services).
@overlordofmu I get it - we have many features in the same boat, and have to prioritize. It also sucks to lose a customer.
Also, yes I wish I'd known about your idea/fix sooner; I didn't know that was even an option. I'm not super deep in vercel, but the amount of time to switch was minimal. I'll check it out at some point.
Best of luck.