git-resource
Clone submodules in parallel
There is no reason this needs to be done serially. You could add something like this:
git submodule status | awk '{print $2}' | xargs -P5 -n1 git submodule update --init $depthflag --recursive
here:
https://github.com/concourse/git-resource/blob/d3fc95733f28b86fd39016573aa47347e44e0238/assets/in#L59-L66
I'll do a PR if you want.
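To illustrate the first half of that pipeline, here is a minimal sketch of what the awk step pulls out of `git submodule status`. The sample output is hypothetical (made-up SHAs and paths), and `sample_status` and `submodule_paths` are illustrative names, not part of the resource.

```shell
# Hypothetical `git submodule status` output. Each line is a one-character
# state prefix ("-" = not initialized, " " = in sync, "+" = a different
# commit is checked out), the commit SHA-1, the submodule path, and, for
# initialized submodules, a description in parentheses.
sample_status() {
  cat <<'EOF'
-1111111111111111111111111111111111111111 src/alpha
 2222222222222222222222222222222222222222 src/beta (v1.0.0)
+3333333333333333333333333333333333333333 src/gamma (heads/master)
EOF
}

# Same extraction as in the one-liner above: the second field is the path.
# awk ignores leading whitespace, so the " " prefix does not shift fields.
submodule_paths() {
  awk '{print $2}'
}

sample_status | submodule_paths
```

Because awk splits on whitespace, a submodule path containing spaces would be truncated by this extraction.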
Hi there!
We use Pivotal Tracker to provide visibility into what our team is working on. A story for this issue has been automatically created.
The current status is as follows:
- [ ] #126416867 Clone submodules in parallel
This comment, as well as the labels on the issue, will be automatically updated as the status in Tracker changes.
This would be neat.
Unfortunately the xargs in the busybox image doesn't come with the -P flag. :(
BusyBox v1.23.1 (2015-07-25 17:45:28 UTC) multi-call binary.
Usage: xargs [OPTIONS] [PROG ARGS]
Run PROG on every item given by stdin
-r Don't run command if input is empty
-0 Input is separated by NUL characters
-t Print the command on stderr before execution
-e[STR] STR stops input processing
-n N Pass no more than N args to PROG
-s N Pass command line of no more than N bytes
-I STR Replace STR within PROG ARGS with input line
-x Exit if size is exceeded
This might require a larger change to see if we can get a better version of xargs in that image.
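If the image's xargs stays without -P, a plain shell loop with background jobs could stand in for it. The following is only a sketch under that assumption and was not proposed in the thread; `run_in_batches` is a hypothetical helper name. It reads one item per line from stdin, starts the given command for each item in the background, and waits after every batch of N.

```shell
# Run "$@ <item>" once per line of stdin, at most $1 at a time.
# Returns non-zero if any of the commands failed.
run_in_batches() {
  max_jobs=$1
  shift
  pids=""
  count=0
  failed=0
  while IFS= read -r item; do
    "$@" "$item" &
    pids="$pids $!"
    count=$((count + 1))
    if [ "$count" -ge "$max_jobs" ]; then
      # Wait on each PID so that individual exit codes are seen;
      # a bare `wait` would always report success.
      for pid in $pids; do
        wait "$pid" || failed=1
      done
      pids=""
      count=0
    fi
  done
  # Collect whatever is left from the final, partial batch.
  for pid in $pids; do
    wait "$pid" || failed=1
  done
  return "$failed"
}

# Intended use (not run here):
#   git submodule status | awk '{print $2}' |
#     run_in_batches 5 git submodule update --init --recursive
```

Unlike xargs -P, this waits for a whole batch to finish before starting the next one, so one slow submodule holds up its batch. It has not been tried against the resource image, and whether concurrent `git submodule update --init` runs are safe within one repository is a separate question.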
It looks like with git version 2.8, the git command supports -j to run jobs in parallel. This can be applied when recursively fetching submodules, too.
We just need to bump git. :-)
As of git version 2.9, git submodule update supports a --jobs=N flag. This will allow submodules to be downloaded in parallel.
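Since the flag only exists from git 2.9 on, a script that may run against older images could guard it. This is a rough sketch, not what the in script does; `version_at_least` and `jobs_flag` are hypothetical helper names, and the version string is assumed to look like "2.9.3".

```shell
# Succeeds if version "$1" (e.g. 2.9.3) is at least major "$2", minor "$3".
version_at_least() {
  version=$1 want_major=$2 want_minor=$3
  major=${version%%.*}
  rest=${version#*.}
  minor=${rest%%.*}
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Prints --jobs=N when the given git version supports it, nothing otherwise.
jobs_flag() {
  version=$1 jobs=$2
  if version_at_least "$version" 2 9; then
    echo "--jobs=$jobs"
  fi
}

# Intended use (not run here); `git --version` prints "git version X.Y.Z":
#   flag=$(jobs_flag "$(git --version | awk '{print $3}')" 4)
#   git submodule update --init --recursive $flag
```

Leaving `$flag` unquoted in the last line is deliberate, so that an empty value adds no argument at all.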
I did a little non-scientific experiment with the submodules in cf-release:
- without --jobs flag: 8m59.262s
- with --jobs=4 flag: 3m2.291s
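For a sense of scale, the two timings above can be turned into a ratio. This small sketch assumes the `XmY.YYYs` format printed by the shell's `time`; `to_seconds` and `speedup` are hypothetical helper names.

```shell
# Convert a duration like 8m59.262s into seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times faster the second duration is than the first.
speedup() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.2f\n", a / b }'
}

to_seconds 8m59.262s          # 539.262
to_seconds 3m2.291s           # 182.291
speedup 8m59.262s 3m2.291s    # 2.96
```

So with --jobs=4 the run took roughly a third of the time, from a single measurement of each.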
This seems like a huge improvement, especially since it would be a single-line update in the in script. I would submit a PR myself, but I have no idea how to get a newer version of git installed in the image. I tried looking at the buildroot stuff for git, but just got really lost.
The base image now uses Git 2.9, and adding --jobs=X works as expected.
I've no idea how to test this though, as there's no observable output from adding this. We can't really have a test based on timing, can we?
As an aside, specifying
git_config:
- name: submodule.fetchJobs
  value: 10
...should work, although I've not had a chance to time it yet.
@vito What would you find acceptable in terms of testing?
this is one case where the risk of tech debt in the tests themselves outweighs the risk in the production code, so I don't think we should add any.
(In reply to Daniel Jones's comment above, Tue, Sep 20, 2016, 7:58 AM.)