
very slow and memory taxing clone of a big repo

Open ghost opened this issue 7 years ago • 4 comments

This is on 9front, go 1.11. Cloning small/medium size repos works fine; cloning big(ger) repos is impossible to see through to completion due to the time and space limitations of my digital computer:

; dgit clone https://github.com/golang/go
...
Indexing objects: 20% (71561/356126)
; ps | grep dgit
glenda         8739    0:12   0:39  1437232K Pread    dgit

It took half an hour to get to the above state, after which I killed the process to avoid running out of memory.

For comparison, same network, similar machine, running OpenBSD, with standard git:

$ time git clone https://github.com/golang/go
...
    1m38.04s real     0m32.31s user     0m09.08s system

and with dgit:

$ time dgit clone https://github.com/golang/go
...
Indexing objects: 23% (81037/356126)
fatal error: runtime: out of memory
... [ stack dump ] ...
    2m48.82s real    0m51.92s user    0m41.69s system

ghost avatar Sep 22 '18 14:09 ghost

How much RAM are you working with? dgit indexing is slow for me, but I've never had it run out of memory.

I started refactoring things yesterday to add support for the git v2 protocol, which should help with bigger repos once it's done, but if you're getting to the indexing stage then you're already past the part where it would have helped.

I think there are two performance issues with the indexing that need to be tackled for this: one is that git's implementation is multithreaded while dgit's isn't, and two is that dgit keeps an in-memory cache of objects that it's found to help resolve deltas. (The latter is probably the main culprit.)
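
To illustrate the second point, here is a rough sketch of one way to keep such a cache from growing without bound: cap it at a byte budget and evict the least recently used objects first. This is not dgit's actual code; `baseCache`, `entry`, and `newBaseCache` are made-up names, and it assumes that on a cache miss the caller can re-read the object from the packfile.

```go
package main

import "container/list"

// entry is one cached object: its hash and its inflated contents.
// (Hypothetical type, for illustration only.)
type entry struct {
	hash string
	data []byte
}

// baseCache is a byte-bounded LRU cache of objects used as delta bases.
type baseCache struct {
	maxBytes int
	curBytes int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // hash -> element in order
}

func newBaseCache(maxBytes int) *baseCache {
	return &baseCache{
		maxBytes: maxBytes,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the cached contents for hash and marks it recently used.
func (c *baseCache) Get(hash string) ([]byte, bool) {
	el, ok := c.items[hash]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

// Put stores data under hash, then evicts least recently used entries
// until the cache fits its budget again. An object larger than the
// whole budget is evicted immediately.
func (c *baseCache) Put(hash string, data []byte) {
	if el, ok := c.items[hash]; ok {
		e := el.Value.(*entry)
		c.curBytes += len(data) - len(e.data)
		e.data = data
		c.order.MoveToFront(el)
	} else {
		c.items[hash] = c.order.PushFront(&entry{hash: hash, data: data})
		c.curBytes += len(data)
	}
	for c.curBytes > c.maxBytes && c.order.Len() > 0 {
		oldest := c.order.Back()
		e := oldest.Value.(*entry)
		c.order.Remove(oldest)
		delete(c.items, e.hash)
		c.curBytes -= len(e.data)
	}
}
```

The trade-off is that an evicted base has to be read and inflated again if a later delta needs it, so memory is bounded at the cost of some extra I/O.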

driusan avatar Sep 23 '18 12:09 driusan

The commit message said "partially resolves", not "resolves".

driusan avatar Oct 11 '18 00:10 driusan

I was able to clone it after #160, #161 and #163. (It took about 45 minutes, but it ran to completion)

driusan avatar Oct 15 '18 22:10 driusan

The speed should be further improved by #267. (It's still more memory intensive than it should be and not at the speed of git/git, but it's getting better.)

driusan avatar May 17 '20 12:05 driusan