libgit2
Cloning large repo in libgit2 slower than `git clone`
I didn't see this when I briefly checked the list of issues, so sorry if it's already been reported. Both libgit2 and `git clone` spend roughly the same amount of time downloading (around 30 seconds or so), but the "Resolving deltas" stage is much slower using libgit2.
Reproduction steps
I just cloned git://sourceware.org/git/glibc.git using the example code in the repository.
Expected behavior
Both libgit2 and `git clone` take roughly the same amount of time.
Actual behavior
libgit2 takes around 4 minutes, whereas `git clone` takes about 1.5 minutes.
Version of libgit2 (release number or SHA1)
0.27.0 (I've also tested whatever the Rust crate `git2` downloads by default, and the speed was about the same).
Operating system(s) tested
Linux alex-linux 4.16.13-2-ARCH #1 SMP PREEMPT Fri Jun 1 18:46:11 UTC 2018 x86_64 GNU/Linux
+1 on this
+1. `clone` is very, very slow compared to the official `git` binary. After the CLI `git` client has received all objects, its "Resolving deltas" and "Checkout" phases are relatively quick. libgit2's transfer progress callback shows that indexed objects and deltas take about 1 second per 20.
What I found helps a little is setting `opts.checkout_opts.disable_filters = 1`, but it is still magnitudes slower than a `git clone`.
Disabling `GIT_OPT_ENABLE_STRICT_HASH_VERIFICATION` also helped a bit, but the clone time is still a long way from fast.
What also helps is choosing the appropriate SHA1 backend when building. I found this rather undocumented flag, `SHA1_BACKEND`, in the CMake files.
If I compile with `-DSHA1_BACKEND=CommonCrypto` (on macOS/iOS), it's already quite a bit faster than the default SHA1 C implementation, so maybe set this to a value appropriate for your system.
@eaigner This will disable the only SHAttered-enabled hashing backend we have (AFAIK CommonCrypto doesn't detect it), hence YMMV.
So how else can I improve clone performance? Like I said, with default settings it's not really usable: it is more than 10x or 100x slower than a `git clone` in resolving deltas.
More data points:
cgit2 clone git 24.70s user 0.98s system 79% cpu 32.178 total
git clone git 18.82s user 1.75s system 124% cpu 16.577 total
cgit2 clone linux 491.47s user 15.81s system 88% cpu 9:30.34 total
git clone linux 343.69s user 36.13s system 151% cpu 4:09.93 total
So it is about half the performance (on my machine). Note that I was testing 0.27.7 (the version on Debian Buster).
I'm assuming this has to do with the fact that native git uses multiple threads to resolve deltas while I can't find any info on whether libgit2 does the same, which leads me to believe it doesn't.
Judging from the reported CPU-usage (>100% for git, <100% for cgit2), this is likely one contributing factor.
However, the "user" time (which I assume is the sum of the times spent in user space over all threads) is also higher for cgit2, implying some potential for speed-ups. That is, if the measurement covers all the processing done by git (e.g. does not omit some child process).
Agreed, there is probably some lower-hanging fruit here before going multithreaded. I've run git clones with threading disabled and they don't take as long as libgit2.
+1 this issue. I am also experiencing slowdowns due to this issue.
Could it be that the `git clone` command supports parallel jobs whereas libgit2 does not? On a big repo I am seeing a delay of way more than 1 minute, sometimes over 10 minutes slower using libgit2.
I am also experiencing this issue.
Yep, apologies but I've been making incremental progress on this in a branch.