git-cinnabar

Incremental clone progress in case of transient network failure?

Open riastradh opened this issue 2 years ago • 8 comments

$ git clone hg::https://anonhg.NetBSD.org/src
Cloning into 'src'...
Getting clone bundle from https://cdn.netbsd.org/_bundles/src-public/6342479a75dfd41f7870674cd444bc697e675aaf.zstd-22.hg
Reading 359330 changesets
Reading and importing 359225 manifests
Reading and importing 2622719 revisions of 475056 files
Importing 359330 changesets
fatal: called `Result::unwrap()` on an `Err` value: "unable to access 'https://anonhg.netbsd.org/src': The requested URL returned error: 500"
Run the command again with `git -c cinnabar.check=traceback <command>` to see the full traceback.
error: git-remote-hg died of signal 6
fatal: could not read ref refs/cinnabar/refs/heads/branches/ADO/tip
$ ls
git-remote-hg.core

I assume that I can't just do something like

mkdir src
cd src
git init
git remote add origin hg::https://anonhg.NetBSD.org/src

and then run git fetch --depth=N in a loop with increasing N, because git-cinnabar has to start from the root of the DAG to create git hashes all the way to the leaves.

Is there a way to make git-cinnabar save its incremental progress in the event of a transient network failure so it can pick up where it left off?

riastradh, Oct 06 '23 08:10

Perhaps there's something else going on here too -- twice in a row (once with git clone, once with git fetch --depth=10000), git-cinnabar failed with fatal: could not read ref refs/cinnabar/refs/heads/branches/ADO/tip. So maybe it's not just a transient network error.

Is there a way to dump exactly the request that failed so I can match it up to logs on the server side?

riastradh, Oct 06 '23 10:10

You can set GIT_CURL_VERBOSE=1
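
A minimal sketch of how that could be used to keep a trace for comparison with the server logs. GIT_CURL_VERBOSE makes git's HTTP layer print request and response headers to stderr; the helper name with_curl_log and the log file name are made up for illustration, and the example assumes a remote called origin already exists.

```sh
# Run a command with curl tracing enabled and save its stderr to a log file.
# Usage: with_curl_log LOGFILE COMMAND [ARGS...]
with_curl_log() {
    log=$1
    shift
    # The variable is set only for this one command invocation.
    GIT_CURL_VERBOSE=1 "$@" 2>"$log"
}

# Example (hypothetical log name):
#   with_curl_log curl-trace.log git fetch origin
#   grep -n 'HTTP/' curl-trace.log
```

Note that progress output also goes to stderr, so the log will contain both the trace and the usual progress lines.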

glandium, Oct 06 '23 20:10

As for the original question, it's unfortunately not possible. What happens is that git asks git-cinnabar to grab a bunch of refs corresponding to the repo tips, and git-cinnabar has to create those refs. If it doesn't, git will make the clone fail (which is what happens in your case).

What you can do, however, is clone the bundle and update with the real repo.

mkdir src
cd src
git init
git remote add origin hg::https://anonhg.NetBSD.org/src
git cinnabar unbundle --clonebundle hg::https://anonhg.NetBSD.org/src
git remote update

I guess there could be a git cinnabar clone subcommand that would provide a more resilient cloning experience.
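
In the meantime, a generic retry wrapper around the final update step is one way to ride out transient failures. This is only a sketch: retry is a hypothetical helper, not something git-cinnabar provides, and it assumes the bundle import has already completed so that each attempt only has to fetch what the bundle lacks.

```sh
# Retry a command up to MAX times, sleeping DELAY seconds between attempts.
# Usage: retry MAX DELAY COMMAND [ARGS...]
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Example, run inside the repository set up above:
#   git cinnabar unbundle --clonebundle hg::https://anonhg.NetBSD.org/src
#   retry 5 30 git remote update
```

Each attempt restarts the command from scratch, so this helps with short outages but does not save partial progress within a single step.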

glandium, Oct 06 '23 20:10

With this patch the clone can go a little further but now fails with an HTTP error 503 on the getbundle command.

--- a/src/hg_connect.rs
+++ b/src/hg_connect.rs
@@ -634,7 +634,7 @@ fn take_sample<R: rand::Rng + ?Sized, T, const SIZE: usize>(
     }
 }
 
-const SAMPLE_SIZE: usize = 100;
+const SAMPLE_SIZE: usize = 50;
 
 #[derive(Default, Debug)]
 struct FindCommonInfo {

glandium, Oct 06 '23 20:10

Mercurial hits the same kind of problem if you use a version older than 3.8, or if you patch a current version to not support the httppostargs capability.

So essentially, git-cinnabar is lacking support for arguments in HTTP POST requests.
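
To check whether a given server advertises that capability, one could ask for its capability list through the capabilities command of Mercurial's HTTP wire protocol and look for httppostargs. A rough sketch; the helper name has_capability is made up for illustration, and the result only says what the server offers, not what the client ends up using.

```sh
# Succeeds if the space-separated capability list CAPS contains NAME,
# either bare ("httppostargs") or with a value ("httpheader=1024").
# Usage: has_capability "CAPS" NAME
has_capability() {
    for cap in $1; do
        case $cap in
            "$2"|"$2"=*) return 0 ;;
        esac
    done
    return 1
}

# Example (requires network access):
#   caps=$(curl -fsS 'https://anonhg.NetBSD.org/src?cmd=capabilities')
#   has_capability "$caps" httppostargs && echo "server accepts POST args"
```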

glandium, Oct 06 '23 21:10

I need to do more testing, but I have a proof-of-concept implementation of HTTP POST requests on the try branch. I'm able to initiate the part of the clone that failed with this, although I haven't made it go through to the end yet (If it doesn't go through, that would be a separate issue).

I also haven't decided yet whether I'll land this in a 0.6.x release (master branch) or 0.7.0 (next branch).

glandium, Oct 07 '23 09:10

> What you can do, however, is clone the bundle and update with the real repo.
>
> mkdir src
> cd src
> git init
> git remote add origin hg::https://anonhg.NetBSD.org/src
> git cinnabar unbundle --clonebundle hg::https://anonhg.NetBSD.org/src
> git remote update
>
> I guess there could be a git cinnabar clone subcommand that would provide a more resilient cloning experience.

I tried interrupting it partway through, during Reading and importing 175685 manifests, and the only progress it appeared to make was a hundred-megabyte .git/objects/pack/tmp_pack_* file. Can it actually take advantage of that to pick up where I left off? It looks the same as what happens when I interrupt git fetch --depth=N, and it didn't appear to speed anything up.

riastradh, Oct 07 '23 23:10

> I need to do more testing, but I have a proof-of-concept implementation of HTTP POST requests on the try branch. I'm able to initiate the part of the clone that failed with this, although I haven't made it go through to the end yet (If it doesn't go through, that would be a separate issue).

On the try branch at revision 230cce330ea7a566e6ae3c00ae8832917733dec4, this is now able to clone hg::https://anonhg.NetBSD.org/src, thanks! Or at least, both git fetch && git checkout -b branches/trunk/tip origin/branches/trunk/tip and git cinnabar unbundle --clonebundle ... && git remote update && git checkout -b branches/trunk/tip origin/branches/trunk/tip work. I didn't try git clone exactly, and both of these took a while.

That said, the issue here about incremental progress for clone (or fetch or unbundle) remains -- it looks like there's no amount of partial progress that can be saved in the event of interruption in any of these paths?

riastradh, Oct 08 '23 01:10