gittorrent doesn't build a "diff" packfile

Open rakoo opened this issue 10 years ago • 6 comments

A peer can ask another peer for a given revision (here), but it cannot say "I already have xxxxxx" to allow efficient packfile generation (and, indeed, the daemon builds the pack from scratch). It would be nice for a peer to be able to send its current revision so that the remote can build an efficient diff packfile instead of a full one.

If I understand correctly, this means changing ut_gittorrent.
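
To illustrate what a "diff" packfile means on the sending side, here is a minimal sketch, assuming the sender already knows the requested revision and the revisions the peer has. It feeds `git pack-objects --revs` a rev-list on stdin, where a leading `^` excludes everything reachable from that commit. The function names `rev_list_input` and `build_diff_pack` are hypothetical and are not part of GitTorrent.

```python
import subprocess


def rev_list_input(want: str, haves: list[str]) -> bytes:
    """Build the stdin for `git pack-objects --revs`.

    The wanted revision is listed as-is; each revision the peer already
    has is prefixed with '^' so objects reachable from it are left out.
    """
    lines = [want] + ["^" + have for have in haves]
    return ("\n".join(lines) + "\n").encode("ascii")


def build_diff_pack(repo_dir: str, want: str, haves: list[str]) -> bytes:
    """Return pack bytes holding only what the peer is missing (sketch)."""
    result = subprocess.run(
        ["git", "pack-objects", "--revs", "--stdout"],
        cwd=repo_dir,
        input=rev_list_input(want, haves),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=True,
    )
    return result.stdout
```

With an empty `haves` list this degenerates to the full pack that the daemon builds today.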

rakoo avatar May 30 '15 09:05 rakoo

Thanks for filing this! Yes, ut_gittorrent should hook up git-fetch-pack to git-upload-pack over the ut_gittorrent transfer stream. I've been working on this but it's not ready yet.

Normally git-upload-pack conducts the negotiation with git-fetch-pack, then sends "PACK", then the pack itself. We'd want it to:

  • conduct the negotiation
  • at PACK, stop sending data over the stream and save it to disk instead
  • create a .torrent for that captured pack
  • tell the client what the infoHash of that torrent is
  • start sending the torrent after that
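
The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: it assumes the upload-pack response is not side-band multiplexed, so the negotiation arrives as pkt-lines and the raw pack follows directly. Since `PACK` is not a valid 4-digit hex length, it marks where the pkt-lines end. The infoHash is computed the usual BitTorrent way, as the SHA-1 of the bencoded `info` dictionary for a single-file torrent. All function names and the 16 KiB piece length are made up for the example.

```python
import hashlib


def split_at_pack(stream: bytes) -> tuple[list[bytes], bytes]:
    """Split a response into negotiation pkt-line payloads and the raw pack."""
    pos = 0
    lines = []
    while pos + 4 <= len(stream):
        header = stream[pos:pos + 4]
        if header == b"PACK":
            # Pack signature reached: everything from here is pack data.
            return lines, stream[pos:]
        length = int(header.decode("ascii"), 16)
        if length == 0:
            pos += 4  # flush-pkt carries no payload
            continue
        if length < 4:
            raise ValueError("invalid pkt-line length")
        lines.append(stream[pos + 4:pos + length])
        pos += length
    return lines, b""


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes, and dicts with bytes keys."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, dict):
        body = b"".join(bencode(k) + bencode(value[k]) for k in sorted(value))
        return b"d" + body + b"e"
    raise TypeError("unsupported type for bencode")


def info_hash(pack: bytes, name: bytes, piece_length: int = 16384) -> str:
    """Compute the infoHash of a single-file torrent wrapping the pack."""
    pieces = b"".join(
        hashlib.sha1(pack[i:i + piece_length]).digest()
        for i in range(0, len(pack), piece_length)
    )
    info = {
        b"length": len(pack),
        b"name": name,
        b"piece length": piece_length,
        b"pieces": pieces,
    }
    return hashlib.sha1(bencode(info)).hexdigest()
```

The sender would save the second element returned by `split_at_pack` to disk, pass it to `info_hash`, and report the result to the client before seeding.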

cjb avatar May 30 '15 12:05 cjb

At first I only thought about extending ut_gittorrent so that a client could send its want along with its list of haves, but you're right, there should be a way to let git-fetch-pack and git-upload-pack negotiate the exact content instead of re-implementing it.

rakoo avatar May 30 '15 13:05 rakoo

Counter-argument regarding determinism, as mentioned in the HN thread: having a fetcher send the same want and haves to everyone, with each sender computing a pack from those, should be more deterministic than having the fetcher and each sender negotiate every time.

rakoo avatar May 30 '15 13:05 rakoo

@rakoo Sorry, I'm not sure I understood that last point. Could you elaborate?

cjb avatar May 30 '15 14:05 cjb

If a fetcher sends its want and its haves to all senders (as I was thinking in the first place), the request is going to be the same for everyone, and the only variable then is how git-pack-objects is implemented. If, on the other hand, a fetcher negotiates pack content with every sender, I feel there is more chance that the actual pack content changes, because there are more variables. But maybe it's just an illusion.
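
One way to picture the "same for everyone" property is a sketch like the one below, which assumes the fetcher canonicalizes its want and haves before broadcasting them. Every sender then sees a byte-identical request, which can be keyed by a hash. The name `request_key` is hypothetical, and nothing here guarantees that two git versions produce the same pack bytes for that request.

```python
import hashlib


def request_key(want: str, haves: list[str]) -> str:
    """Hash a canonical form of the request: want first, haves sorted and deduplicated."""
    canonical = "\n".join([want] + sorted(set(haves)))
    return hashlib.sha1(canonical.encode("ascii")).hexdigest()
```

Two senders that receive the same key were asked for the same objects, so any difference in their packs comes from pack generation itself rather than from the request.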

rakoo avatar May 30 '15 14:05 rakoo

Yeah, I don't think there's actually a difference between those two in the resulting packfile, but I might be wrong. The only input the sender has to the negotiation is what commit they're at, which isn't changing.

cjb avatar May 30 '15 14:05 cjb