git-imerge Autofilling is slow

In https://github.com/jyasskin/chromium, I'm trying to rebase-with-history service-worker-apps (e49557a06ee349daee3edbac71b565ff315509cd) onto lkgr^ (ac5e723c9b830fecd3d13c65b66b0e0fd36e51f9). This gave the following output:

$ git imerge start --first-parent --goal=rebase-with-history --name service-worker-apps lkgr^
Attempting automerge of 1-1...success.
Attempting automerge of 1-1789...success.
Attempting automerge of 1-2683...failure.
Attempting automerge of 1-2236...failure.
Attempting automerge of 1-2013...failure.
Attempting automerge of 1-1901...success.
Attempting automerge of 1-1957...failure.
Attempting automerge of 1-1929...success.
Attempting automerge of 1-1943...success.
Attempting automerge of 1-1950...failure.
Attempting automerge of 1-1947...success.
Attempting automerge of 1-1949...failure.
Attempting automerge of 1-1948...success.
Autofilling 1-1...success.
Autofilling 1-2...success.
...

Each of the Autofilling lines takes about a second to display, so this rebase is going to take about an hour, which is too long to do interactively. Is this an intrinsic part of the imerge process, or is it optimizable?

May 29 '14 20:05 jyasskin

The "autofilling" steps correspond to merging single commits into a checked-out version of the working tree. The merges themselves should be pretty trivial, assuming your project doesn't change the world in each commit. So if these "autofills" take a second each, then it is probably because your source tree is so large that even trivial operations like "git status" take a long time. Am I guessing correctly?

It is possible to do such merges directly in the index, which I would expect would speed them up considerably. It hasn't been a pain point for me yet so I haven't implemented it, but I think it should be pretty straightforward. I think it would involve the commands git read-tree -i -m --aggressive, git merge-index git-merge-one-file -a, git write-tree, and git commit-tree, but the trick is getting the details right of creating a temporary index (maybe not needed?), handling temporary files, and restoring everything to a sane state when done.

It would be helpful to a lot of people if you want to work on this.

May 30 '14 16:05 mhagger

Yes, time git status says about 0.5s of wall time.

I don't expect to get a chance to learn git's internals well enough to fix this in the near future, but thanks for pointing out the path forward.

May 31 '14 20:05 jyasskin

A somewhat useless +1 here -- I gave this a shot to rebase a large patchset on Gecko that was several months out of date. It worked well, rebasing the branch properly with about 11 conflicts. I ran all the steps with timing, e.g. time git imerge continue, and totaled the result at 7 hours 28 minutes of run time for the 13 commands. This was for a branch that was quite diverged:

Your branch and 'central' have diverged, and have 8 and 19901 different commits each, respectively.

This was with a tree in tmpfs, and the process was pegged at 100% CPU, so it doesn't seem IO bound. git status takes +/- 300ms

Aug 11 '14 22:08 Nephyrin

I've tried to implement this, and i've pushed my WIP at https://github.com/MrHacky/git-imerge/tree/wip-notree-automerge

One of the problems i ran into is that merge-index/git-merge-one-file has no option to not touch the worktree. I've created a git-imerge-one-file helper for this, but i'm not sure i'm handling all the needed cases. (also you currently have to install git-imerge-one-file manually in a place where git-merge-index can find it)

I'm not sure when i have time to work on this again, but the next step was going to be redoing one of the large merges in this topic and see if it is any faster and/or has the same result. If someone else wants to try please post your results back here.

Aug 15 '14 07:08 MrHacky

I've tried to implement this, and i've pushed my WIP at https://github.com/MrHacky/git-imerge/tree/wip-notree-automerge

Sorry for the delayed response -- I tested this and it does seem to make a large difference, nearly 2x on the gecko repository with an SSD:

Stock

real    23m56.903s
user    18m40.026s
sys     5m1.533s

MrHacky wip-notree-automerge

real    12m26.360s
user    10m37.810s
sys     1m38.653s

To reproduce the merge I used to test, clone and checkout: https://github.com/Nephyrin/mozilla-dev/tree/imerge-test-rebase-on-a980328 and then time git imerge rebase a980328. The time above is just the time until it presents the first conflict, which involves a few hundred autofills.

With both versions it seems most of the process is CPU bound, at least on linux with an SSD. I wonder how difficult parallelizing autofilling with multiple indexes would be

Sep 03 '14 19:09 Nephyrin

Parallelising would be a huge win, if possible. Doing merges from LLVM (time git status takes about 0.2 seconds of real time), there are 4000 upstream revisions that I'm trying to merge and at about one second per autofill it takes a very long time (several minutes for each one of my revisions), yet only uses one core.

Oct 27 '15 11:10 davidchisnall

I doubt that parallelizing the autofilling of a single block would bring more than a factor of two in speed, and even that would usually not be attained. Why?

Autofilling (or in general the actions that happen when you type git imerge continue) involves two things:

A series of binary searches to find the frontier. A single binary search is not very parallelizable, because you don't know which side to bisect next until you know the result of the current bisection. I suppose you could bisect both sides, basically doing three tests at a time (or more generally n tests at a time) to divide the range into n+1 subranges. But that costs n times as much work and only saves you lg((n+1)/2) bisection steps. So for example it would take 3x as much work at each step to half the bisection time, or 7x as much work to quarter it. So the benefit doesn't scale very well.

In fact, we are really doing a kind of 2D bisection, and it might be possible to gain another factor of at most 2 by working on both ends of the merge frontier simultaneously. (See this blog post for a description of the algorithm.)

But I don't think the bisection step is usually the bottleneck anyway, because it is typically a few times O(lg max(M,N))
A series of merges to fill in the commits along the frontier. Each edge kindof has to be done one commit at a time, because each commit has its predecessor as parent. You could do the two edges of a block simultaneously then check that the apex commits agree, would gain you a factor of (M + N) / max(M,N), where M and N are the lengths of the two edges of the block. In the best case, this would be a factor of two. But more typically, the blocks are long and skinny, so (M + N) is not much larger than max(M,N).

You could compute an arbitrary number of edge merges simultaneously if you assume that merging two distant commits gives the same result as merging incrementally. You would actually compute the trees in parallel, then at the end create commits from the trees, stringing them together in the correct parent-child relationships.

But the assumption in the previous paragraph makes me a little bit nervous. It should usually be true (after all, it is the basis of the bisection algorithm that we use). But in practice it could easily be that there are differences. We currently check for these differences by doing the incremental merges along each edge then comparing the trees of the apex commits to make sure they are identical. But if we used this parallelization approach we wouldn't have nearly as strong a consistency check that the merges had mutually-consistent results.

Another place that could be parallelized would be when there are multiple blocks that need autofilling at the same time. Usually when there is a conflict, there are two blocks that need autofilling. There might be another factor (of at most two) to be gained by filling the two blocks at the same time. But this is also a bit awkward, because one of the blocks sits on top of the other, like this:

**********
*..|.....|
*..|-----+
*..|#?????
*..|??????
*..|??????
*--+??????

So the block on the upper-right can't be autofilled until at least the first two commits on the vertical edge of the block on the left are computed. Also, if there is a conflict at the apex of the block on the left, then those two commits are called into question (though this is hopefully a rare occurrence).

So all in all I don't think autofilling can be parallized very well (I guess typically by a factor of less than two), unless you are willing to rely for correctness on the assumption written in italics above.

Oct 30 '15 09:10 mhagger

Another possible form of parallelism would be to continue autofilling in the background while the user is manually resolving conflicts. In the super slow scenarios described above, how long do the manual merges typically take in comparison to the autofills?

Oct 30 '15 09:10 mhagger

Doing more speculative fills in the background would definitely help if possible, but my understanding is that the process currently only stops because there's a conflict.

That said, my last merge took around 4 days, with git-imerge using one core of a quad-core i7 solidly for most of this time and only about two hours being spent doing anything that involved human interaction. The final step, which autofilled from the last conflict to the merge target, was an overnight job.

I wonder if another option is to lazily fill some of the autofilled parts. Are they all actually used? Can the required ones be created on demand?

Oct 30 '15 09:10 davidchisnall

For the particular case when the rectangle is very long and narrow, I think we could get far better gains by changing filling strategy. For example, in the diagram below suppose that we are trying to rebase the vertical branch on top of the horizontal branch, and we find the conflict marked with #:

***************************************************************************
*..........................................................................
*..........................................................................
*................#.........................................................
*..........................................................................
*..........................................................................
*..........................................................................

We currently autofill two blocks by computing all of the commits shown as x here:

***************************************************************************
*...............x.........................................................x
*...............xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
*...............x#.........................................................
*...............x..........................................................
*...............x..........................................................
*xxxxxxxxxxxxxxxx..........................................................

Note that it is the two long, horizontal edges that are expensive. The vertical edges are relatively cheap because the vertical dimension is so much shorter than the horizontal.

What if, instead, we computed the following:

***************************************************************************
*...............xx.........................................................
*...............xx.........................................................
*...............x#.........................................................
*...............x..........................................................
*...............x..........................................................
*...............x..........................................................

This is enough to let the user resolve the conflict at #. Then we could continue autofilling that column in the vertical direction:

***************************************************************************
*...............xx.........................................................
*...............xx.........................................................
*...............x#.........................................................
*...............xx.........................................................
*...............xx.........................................................
*...............xx.........................................................

We have effectively rebased the vertical branch over to the right, thereby moving past a commit on the horizontal branch that conflicted with it. We could continue from there by looking for the next conflict.

The advantage of this approach is that we only have to compute O(min(M,N)) commits, rather than O(M + N). If M and N have very different sizes, this is a dramatic improvement. A disadvantage is that we lose the apex consistency check that I mentioned earlier, though we could still do a consistency check between the incrementally computed apex commit and the apex commit computed directly.

Of course this only works if our goal is to merge or to rebase the vertical branch on top of the horizontal branch. If our goal is to rebase the horizontal branch on top of the vertical branch, then there is no way around computing the merges on the bottom edge of the diagram. But even in this case, we could avoid computing additional horizontal lines in the diagram.

This approach seems very promising.

Oct 30 '15 09:10 mhagger

This approach seems very promising.

It does sounds promising! I'm using git-imerge for the first time, and the above filling strategy sounds like it would be helpful for my case too. I don't know git internals, nor git-imerge's implementation, how much learning curve is involved in attempting to implement the above strategy?

Jan 22 '16 23:01 kastiglione

@kastiglione: the code that defines/implements the filling strategy is a bit scattered around, but there is a precedent for variations in strategy influenced by the --goal=GOAL and --manual options. Let me do a brain dump of code that is related to the strategy:

Block.auto_expand_frontier currently is the main point for deciding between strategies for auto-filling and auto-expanding the frontier.
MergeFrontier.map_known_frontier only considers a block to be within the frontier if its right and bottom edges are both known. This requirement would have to be relaxed in the case of the new strategy.
MergeFrontier.auto_expand probably doesn't care much about strategy itself, though it recurses into Block.auto_expand_frontier.
Block.auto_outline fills in the right and bottom edges of the block.
Block.auto_outline_frontier implements the usual default strategy.
MergeState.auto_complete_frontier probably also doesn't need to care about strategy.
MergeState.simplify_to_rebase has a consistency check (the if not force block) that I think would have to be skipped in the case of the new strategy.

Ideally, as part of this change, the choice of strategy would be more decoupled from the rest of the classes. But that might be overkill.

Let me know if you have any other questions.

Jan 27 '16 10:01 mhagger

Is there any progress on this? git-imerge has taken just over a week of CPU time for me in my most recent merge, and that's getting to the point where it's not a productivity win. Do all of the NxM merges have to be computed ahead of time, can't we compute them lazily? Once the bisect has shown that revision A of mine conflicts with B from upstream, we already (via the binary search) know that A-1 merges cleanly with B and B-1 merges cleanly with A, so can't we just do the B-1, A-1 merge, then merge B and A into that and then try to merge those two, rather than computing all of the intermediate steps that are never used?

Nov 29 '16 13:11 davidchisnall

@davidchisnall: The complication with your idea is that any skipped merge that supposedly should be conflict-free can actually turn out to conflict (for example, if a commit was reverted on one of the branches). In this case some backtracking is necessary. The backtracking can end up requiring the user to do a manual merge that should really be an ancestor of (A, B). It's not impossible, but the bookkeeping becomes pretty arduous, and it would sometimes require manual merges to be done out of order, which I think would be very confusing to users. Moreover, the details of how the user resolves the "later" conflict then later resolves the "earlier" conflict might differ in detail, resulting in more confusion.

If we were willing to lean more on these assumptions and accept some potentially very awkward effects when the assumptions are violated, then I suppose we could avoid a lot of intermediate commits. The problem would become "given that we know the results of some set of (automatic and manual) merges, what additional (automatic and manual) merges do we have to do to make more progress toward the end goal?" The requirements would be:

A manual merge should only be requested after its two direct ancestors are available.
Automatic merges should always be done with the benefit of knowledge of any manual conflict resolutions that would be their ancestors. :point_left: This is the requirement that is impossible to fulfill in all backtracking cases without requiring the user to re-do merges that they have already done.

Less ambitiously, it might make sense to omit some of the "internal" lines when outlining the merge frontier; e.g., instead of computing

*************
*.....|.....|
*.....|.....|
*.....|.....|
*.....|.....|
*.....|-----+
*.....|#?????
*.....|??????
*.....|??????
*-----+??????

one could save a few explicit merges by computing

*************
*...........|
*...........|
*...........|
*...........|
*.....+-----+
*.....|#?????
*.....|??????
*.....|??????
*-----+??????

On other fronts: I did some heavy refactoring over the last couple of days to prepare the ground for the rebase-style strategy described here; see branch polymorphic-frontiers. The new strategy isn't implemented yet but should be pretty easy now. As usual, the biggest hurdle is the lack of automated tests that could give confidence that the big refactoring hasn't broken anything. After the refactoring, it should be easier to play with other merge strategies, like those discussed above—you basically just need to implement a new MergeFrontier class.

Feb 13 '17 08:02 mhagger

Just tried imerge out today since I had an old branch that needed a merge from master and can confirm this issue of very slow autofilling. It's been running for a few hours and still going.

Could we add a --verbose flag which shows all executed git commands to see what commands are slow?

May 25 '17 05:05 BrendanAnnable

@BrendanAnnable: I don't have time to implement a --verbose option in the near future. Patches are welcome if you'd like to work on this yourself. Let me know if you need any pointers to get started.

If you are using Linux, you could run imerge under strace to see the Git commands that it is invoking and the timestamps when they were called:

strace -tt -e execve git imerge [...]

Under the assumption that git is consuming most of the time (which I think is almost certain), this could give you a good overview of where the time is going.

May 25 '17 07:05 mhagger

I know this issue is old but I've also run into this. One of the more attractive uses of git-imerge is to rebase some local changes (can number in the hundreds or thousands) on top of a new release upstream (which can be tens of thousands of commits since the last release). The autofill is definitely the overwhelming majority of the time, even accounting for manually resolving merge conflicts. Thanks to git-imerge those conflicts are usually pretty simple to resolve. But it's a drag that the autofills end up taking so long. On the plus side I can just let it run in the background and do other things, so it's still a productivity win (much less time spent resolving complex conflicts).

Even a 50% improvement would be a lot because it's multiplied over tens of thousands of merges. This is definitely worth optimizing as much as possible. That said, I don't think I'd want things optimized to the point of losing helpful verifications. Anything that results in more complex manual work to figure out what happened is a bad tradeoff IMO.

Maybe it would be possible to run aggressively and then (optionally) backtrack to the slower algorithm if something goes wrong? Then people could choose their point on the speed/complexity continuum.

Jun 23 '22 22:06 david-greene-cb

I've also just run in to this exact issue when doing a 352x33 merge. The initial autofill took 5 hours.

@mhagger I see that you merged the above mentioned polymorphic-frontiers branch in 2020. Have you tried creating the new MergeFrontier class yet?

Sep 30 '22 14:09 jrosenberger

@jrosenberger Long ago I started hacking on it but never finished it. I don't even remember how far I got. In case you're curious, I just pushed those WIP commits to branch wip-columnwise-merge-frontier.

Oct 05 '22 13:10 mhagger

same issue for me. Great tool, just wish its faster

Aug 08 '23 08:08 tu-nv