Add simplification strategy for merge with keeping commits for manual merges

Open eddyp opened this issue 12 years ago • 11 comments

Since git-imerge can track which merges were made automatically and which manually due to conflicts, and because manual merges can themselves be done incorrectly by the user, it could be useful to add a new simplification strategy (haven't found a good name) that will do the following transformation:

From a complete imerge diagram like this:

 *******
 *.m....
 *......

Where:
  * - original commits on the two branches
  . - automatic merge
  m - manual merge

The resulting commits will be:

 *******
 * m   .
 * .   .

The convention being that the unrepresented commits are not stored at all, and only the represented ones are finally kept. The parents of each of the kept merges are the closest kept commits to the left, and, respectively, up.

This way the problematic commits are highlighted in the history and can be reviewed individually later or tested.
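A minimal sketch of one reading of this rule, assuming the diagram is given as a list of strings with the original commits in row 0 and column 0. The function names `simplify_to_manual` and `render` are hypothetical and are not part of git-imerge. The sketch keeps every merge lying on a row or column that contains a manual merge (or on the last row or column) and takes the closest kept commits to the left and above as parents:

```python
def simplify_to_manual(grid):
    """Map each kept merge (row, col) to its (left_parent, up_parent).

    Assumption: row 0 and column 0 hold the original commits, 'm' marks a
    manual merge, and '.' marks an automatic merge.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    # Keep every row/column containing a manual merge, plus the last ones.
    keep_rows = {r for r in range(1, n_rows) if "m" in grid[r][1:]}
    keep_cols = {c for c in range(1, n_cols)
                 if any(grid[r][c] == "m" for r in range(1, n_rows))}
    rows = [0] + sorted(keep_rows | {n_rows - 1})
    cols = [0] + sorted(keep_cols | {n_cols - 1})
    kept = {}
    for i in range(1, len(rows)):
        for j in range(1, len(cols)):
            r, c = rows[i], cols[j]
            # Parents: closest kept commit to the left, closest kept one above.
            kept[(r, c)] = ((r, cols[j - 1]), (rows[i - 1], c))
    return kept


def render(grid, kept):
    """Redraw the diagram, blanking out merges that are not kept."""
    return [
        "".join(ch if r == 0 or c == 0 or (r, c) in kept else " "
                for c, ch in enumerate(line))
        for r, line in enumerate(grid)
    ]


grid = ["*******", "*.m....", "*......"]
kept = simplify_to_manual(grid)
print("\n".join(render(grid, kept)))
```

For the three-row diagram above, this keeps four merges: the manual one, the one directly below it, and the two in the last column.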

eddyp avatar Jul 25 '13 09:07 eddyp

This is an interesting idea and I've also thought about something like this. I got stuck because I don't see an elegant way to generalize it to multiple manual merges; it seems to me that if there are N manual merges then the number of merges that would need to be retained goes something like N². (Though maybe that is acceptable and this type of simplification would be useful anyway.)
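To make the N² estimate concrete, here is a rough sketch under one assumption: that every merge lying on both a "kept" row and a "kept" column is retained, where a row or column is kept if it contains a manual merge or is the last one. The helper name `retained_merges` is hypothetical. In the worst case (every manual merge in its own row and column) this gives (N + 1)² retained merges:

```python
def retained_merges(manual, last_row, last_col):
    """Count merges retained if all kept-row/kept-column crossings are kept.

    `manual` is a list of (row, col) positions of manual merges.
    """
    rows = {r for r, _ in manual} | {last_row}
    cols = {c for _, c in manual} | {last_col}
    return len(rows) * len(cols)


# Worst case: N manual merges on a diagonal, none on the last row or column.
for n in (1, 2, 4, 8):
    diagonal = [(i, i) for i in range(1, n + 1)]
    print(n, retained_merges(diagonal, n + 1, n + 1))
```

Manual merges that share a row or column reduce the count, so (N + 1)² is an upper bound under this assumption.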

For concreteness, please draw a bigger diagram, with three or four manual merges scattered about, and suggest how you think the simplification should be done in that case.

mhagger avatar Jul 26 '13 12:07 mhagger

I see your point about the quadratic growth in the number of commits. I will think about this.

The reason I didn't seriously consider this case was that in practice I only had very few commits on the master branch and a lot more on the feature branch, so the actual diagram was 'thin' and 'tall'. I didn't even encounter a case as complex as the one I drew.

Talking about the diagram I drew, what would you think about retaining from the intermediate merges only the manual one and the final one, with the final one having 3 parents? (The manual merge, the rightmost commit on the destination branch, and the bottommost commit on the source branch.)

I know it is a little complicated for a human to follow more than 2 parents, but I want to get your opinion on this.

eddyp avatar Jul 27 '13 05:07 eddyp

What is the point of storing more intermediate merges? Is it so that git has enough history that it can use to do subsequent merges? Is it so that the whole incremental merge could in principle be reproduced? Is it so that a human looking at the history can understand it? Without understanding the purpose, it is hard to reason about what should be retained.

mhagger avatar Jul 27 '13 11:07 mhagger

The reason for me was to be able to later test the correctness of the manual merge. But with more conflicts I am unsure what should be retained. I have an example of a fictional merge on my computer. I'll post it when I get home, so we can discuss.

eddyp avatar Jul 28 '13 09:07 eddyp

Here is the simplified example merge diagram I was talking about.

* - * - * - * - * - * - * - * - * - * - *
|           |       |       |           |
*           |       |       |           |
|           |       |       |           |
* --------- m ----- a ----- a --------- b
|           |       |       |           |
* --------- a ----- 1 ----- n --------- b
|           |       |       |           |
*           |       |       |           |
|           |       |       |           |
* --------- a ----- o ----- 2 --------- b
|           |       |       |           |
* --------- b ----- b ----- b --------- F

The commits marked m, n, and o are the manual merges. The ones marked a or b are automatic merges that seem to be the ones that must be kept if my initial proposal is followed. F is the final tree state after all the merges are done. The commits marked with numbers (automatic) are the ones I am not sure should be kept.

The 'a' merges should be kept because they are the ones generated by the m manual merge. The 'b' merges seem to be necessary because they are outputs of manual merges and are inputs for F (directly or indirectly).

What is unclear to me is whether the automatic merges marked 1 and 2 should be kept as inputs for the manual merges and the final merge, respectively. Should 1 be an input for the n and o manual merges? It does contain information resulting from the m merge, but which makes more sense: keeping 1 as a parent of n and o, or dropping 1 and having only the a nodes as parents of n and o?

Node 2 is similar to 1, but it shouldn't be problematic since it is automatic.

I wonder if keeping only the m, n, o commits and their * parents (or previous manual merges as parents) makes more sense, while having the F commit multi-parent, with all the manual merges as parents (except the manual merges which have another manual merge as child).
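One way to read this multi-parent alternative is sketched below. It rests on an assumption that is not stated in the thread: a manual merge counts as having "another manual merge as child" when some other manual merge lies at or below its row and at or to the right of its column. The function name `final_parents` and the (row, col) coordinates read off the diagram are hypothetical:

```python
def final_parents(manual):
    """Return the manual merges that no other manual merge builds on.

    `manual` is a list of (row, col) grid positions. A merge `later` is
    assumed to build on `earlier` if it is not above it and not to its left.
    """
    def builds_on(later, earlier):
        return (later != earlier
                and later[0] >= earlier[0]
                and later[1] >= earlier[1])

    return sorted(p for p in manual
                  if not any(builds_on(q, p) for q in manual))


# Positions of m, n, o in the diagram above, counting rows and columns from 0.
m, n, o = (2, 3), (3, 7), (5, 5)
print(final_parents([m, n, o]))
```

Under this reading, F would get n and o as parents (besides the branch tips), while m would be reachable only through n and o.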

What do you think?

eddyp avatar Jul 28 '13 21:07 eddyp

I got stuck in approximately the same place when I thought about this possible feature. My feeling is that all of the commits that you have indicated in your drawing should usually be preserved. It wouldn't be that hard to implement. The question is how useful it would be in practice.

mhagger avatar Oct 03 '13 04:10 mhagger

@eddyp: I just implemented this and pushed the result to branch simplify-to-manual. Please test it and see if it does what you want; if so, I will merge it to master.

mhagger avatar Oct 03 '13 06:10 mhagger

@eddyp, I just pushed a rebased version of this feature to the branch "simplify-to-manual". It would be great if you would try it out and let me know if it works for you.

mhagger avatar Oct 30 '13 08:10 mhagger

@mhagger : sorry for not replying earlier, I've been busy with other stuff. I'm unsure when I'll have the time to test this, but I'll send a reply with my results when I have the time for it. (Currently I don't remember in which project I used git-imerge, so I'll have to run into a repo that needs a merge, that's why I can't test right away.)

eddyp avatar Nov 03 '13 13:11 eddyp

@mhagger : I found a repo where I could try imerge and using the simplify-to-manual branch version I got this weird result:

(much output before this)

*...|
*...|
*...|
*...|
*...|
*...|
*...|
**..|
*--*+

Key:
  |,-,+ = rectangles forming current merge frontier
  * = merge done manually
  . = merge done automatically
  # = conflict that is currently blocking progress
  @ = merge was blocked but has been resolved
  ? = no merge recorded

0 eddy@heidi ~/usr/src/make/make-profiler $ git imerge simplify
Cannot simplify to "manual" because merge 2-8 is not yet done

I'm a little confused. The matrix shows a few '?' at the top, but I don't know why, since I didn't skip any of the manual merges.

eddyp avatar Nov 03 '13 15:11 eddyp

It seems to me that you don't necessarily need to preserve 2 dependents for each manual merge (both downward and rightward), which is what this seems to propose. It's just important that each manual merge is included, somehow, in the result. So maybe we want to compute the minimal graph that keeps each of the manual merges connected, or something.
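A rough sketch of what such a minimal graph could look like, under assumptions that go beyond the comment: merges are identified by (row, col) grid positions, one merge can depend on another only if it lies at or below its row and at or to the right of its column, and "minimal" means dropping every edge implied by a chain of other edges. The name `minimal_edges` and the coordinates are hypothetical, and the sketch covers only the connectivity among manual merges and the final merge, not the parents on the original branches:

```python
def minimal_edges(manual, final):
    """Return direct (earlier, later) links among manual merges and the final merge.

    An edge a -> b is kept only if no third node c sits between them,
    i.e. the result is the covering relation of the grid ordering.
    """
    nodes = sorted(set(manual) | {final})

    def before(a, b):
        return a != b and a[0] <= b[0] and a[1] <= b[1]

    edges = []
    for a in nodes:
        for b in nodes:
            if before(a, b) and not any(
                before(a, c) and before(c, b) for c in nodes
            ):
                edges.append((a, b))
    return edges


# m, n, o and F from the larger diagram earlier in the thread (0-based positions).
for edge in minimal_edges([(2, 3), (3, 7), (5, 5)], (6, 10)):
    print(edge)
```

With these positions the sketch links m to n and o, and n and o to F, so four edges instead of the sixteen retained merges of the grid-based proposal.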

dabrahams avatar Sep 18 '20 23:09 dabrahams