Concurrent requests causing a slowdown in processing clone requests
Hi,
Our team is in a similar setup as this post https://github.com/jonasmalacofilho/git-cache-http-server/issues/5#issuecomment-336875780
We run our CI with GitLab Runner, and we need to shave down the time it takes to clone our somewhat large repository. Every job in our CI currently clones the same repo, and we were hoping git-cache-http-server could help cut that clone time down. One issue we noticed, though: when multiple jobs hit the server at the same time requesting a clone of the same repo, the time it takes to complete the download grows roughly linearly with the number of concurrent requests.
For example:

- A job that submits a single clone request completed in ~45s.
- A job that submits 6 clone requests at once: all clones completed in ~2m 40s.
- A job that submits 9 clone requests at once: all clones completed in ~3m 40s.
Would you happen to have any ideas on what I can look into to resolve this issue?
Maybe look into what happens and where exactly the time is being spent in git-cache-http-server in each scenario (though by now you're probably already doing that).
If it helps, I have a feeling that what might be missing is a proper queue for incoming requests to the same repository, accompanied by a clear policy of when to (not) update the cache.
Implementing that, and more generally making this project more robust and easier to maintain, are all things I've been wanting to do for a while, but other OSS projects have had higher priority for me.
That said, on the off chance you (or some other team using git-cache-http-server) have the budget for it, I would be interested in at least discussing a sponsorship to rewrite some of it, fix pending issues, and implement a few new features and improvements.