git-town icon indicating copy to clipboard operation
git-town copied to clipboard

ship: minimize / automatically resolve merge conflicts in stacks after shipping

Open kevgo opened this issue 1 year ago • 11 comments

Squash-merging within stacks can lead to phantom merge conflicts. To prevent them:

  1. before shipping: synchronize all descendents of the branch to ship
  2. after shipping: sync all these branches again

Now the merge conflicts after shipping should be minimal.

kevgo avatar Aug 10 '24 12:08 kevgo

Concern: Step 1 doesn't pull in new commits from the main branch. Hence step 2 might still pull in new changes from main that prevent automatic resolution.

Maybe Git Town can detect whether new commits get pulled in from main by memorizing the SHA of the main branch before the ship and after the sync?

  • No new commits happened --> automatically resolve the merge conflicts
  • New commits happened --> the user must resolve merge conflicts

kevgo avatar Aug 10 '24 12:08 kevgo

As a subset of this general concern, I've been regularly running into merge conflicts on files that were added in a squash-merged parent (Specifically squash-merged via GitHub's UI, not the git town ship command):

Example:

# Branch hierarchy:
# main
#   mball/PE-5944-update-build-action
#     mball/PE-5944-update-build-action2

# Squash-and-merge mball/PE-5944-update-build-action via GitHub UI, and delete the feature branch on GitHub

$ git town --version
Git Town 14.2.2

$ git town sync

# ...I deleted standard output to sync main and delete parent branch mball/PE-5944-update-build-action

[main] git checkout mball/PE-5944-update-build-action2
Switched to branch 'mball/PE-5944-update-build-action2'
Your branch is up to date with 'origin/mball/PE-5944-update-build-action2'.

[mball/PE-5944-update-build-action2] git merge --no-edit --ff origin/mball/PE-5944-update-build-action2
Already up to date.

[mball/PE-5944-update-build-action2] git merge --no-edit --ff main
Auto-merging .github/actions/get-args/action.yml
CONFLICT (add/add): Merge conflict in .github/actions/get-args/action.yml
Automatic merge failed; fix conflicts and then commit the result.

In this example, I added the file .github/actions/get-args/action.yml in the first branch (mball/PE-5944-update-build-action), and modified it in the second branch (mball/PE-5944-update-build-action2). Theoretically, git town should be able to automatically resolve this merge conflict by just using the version of the file in the second branch, since it should know at this point that main is identical to the first feature branch that has been squash merged by GitHub.

If I'm handling this situation manually, then if I knew that the parent feature branch was merged into main at <base-feature-branch-hash-in-main>, then I could run this merge command on the dependent feature branch to automatically handle the merge conflicts in favor of the second feature branch:

git merge <base-feature-branch-hash-in-main> --strategy ours

mball-agathos avatar Aug 13 '24 16:08 mball-agathos

That's indeed a pretty annoying GitHub issue. A few more details are in this thread: https://github.com/git-town/git-town/issues/3236#issuecomment-2021777211. If you (or anybody you know) figures out a better solution to ship branches in a stack, please share!

kevgo avatar Aug 13 '24 17:08 kevgo

I found a tweak to the existing git town flow that would have resulted in no merge conflicts. Basically the fix is to merge the parent feature branch into the child one last time instead of directly merging main into the child. Here is the existing flow (elaborating on the example above, and with most of the noise deleted):

$ git town sync
[mball/PE-5944-update-build-action2] git fetch --prune --tags
[mball/PE-5944-update-build-action2] git checkout main
[main] git rebase origin/main
[main] git checkout mball/PE-5944-update-build-action
[mball/PE-5944-update-build-action] git merge --no-edit --ff main
[mball/PE-5944-update-build-action] git checkout main
[main] git branch -D mball/PE-5944-update-build-action
[main] git checkout mball/PE-5944-update-build-action2
[mball/PE-5944-update-build-action2] git merge --no-edit --ff origin/mball/PE-5944-update-build-action2
[mball/PE-5944-update-build-action2] git merge --no-edit --ff main
Auto-merging .github/actions/get-args/action.yml
CONFLICT (add/add): Merge conflict in .github/actions/get-args/action.yml
Automatic merge failed; fix conflicts and then commit the result.

Here's a proposed tweaked flow that would not have had any merge conflicts:

$ git town sync
[mball/PE-5944-update-build-action2] git fetch --prune --tags
[mball/PE-5944-update-build-action2] git checkout main
[main] git rebase origin/main
[main] git checkout mball/PE-5944-update-build-action
[mball/PE-5944-update-build-action] git merge --no-edit --ff main
[mball/PE-5944-update-build-action] git checkout main
# don't delete parent branch yet
[main] git checkout mball/PE-5944-update-build-action2
[mball/PE-5944-update-build-action2] git merge --no-edit --ff origin/mball/PE-5944-update-build-action2

# new changes -- merge in parent branch first
[mball/PE-5944-update-build-action2] git merge --no-edit --ff mball/PE-5944-update-build-action
[main] git branch -D mball/PE-5944-update-build-action

The critical change in this flow is to hold on to the parent feature branch (i.e., mball/PE-5944-update-build-action in this example), and merge it into the child feature branch (i.e., origin/mball/PE-5944-update-build-action2) and then delete the parent afterward. By merging from the parent branch, there is no longer a merge conflict due to both of them adding and modifying the same file.

mball-agathos avatar Aug 13 '24 19:08 mball-agathos

Git Town 16 provides an elegant solution to this problem: fast-forward merging.

kevgo avatar Aug 31 '24 13:08 kevgo

Good to know! I'll try out Git Town 16's updated git town ship command to see how it handles this use-case! One thing I'll be curious about is whether there's a good way to clean up the local Git state if I (accidentally) press "Squash and Merge" on the GitHub pull-request web page instead of using git town ship on the CLI.

mball-agathos avatar Sep 03 '24 20:09 mball-agathos

@mball-agathos I'm also experiencing the same issue, I'm getting phantom merge conflicts because my colleague squash-merged via GitHub UI. Have you found a solution to reduce the conflicts locally when that happens?

ceilfors avatar Sep 26 '24 11:09 ceilfors

@ceilfors can you post the output of git sync? In particular, I'm curious whether Git Town 16 is still merging main into the parent branch before deleting it. If you can identify the hash of the deleted parent branch and then merge that into the child branch first, that should help resolve your merge conflicts automatically.

(I haven't yet encountered this particular scenario with Git Town 16, so I haven't tried it personally yet, but it shouldn't be too hard to come up with a scenario that simulates the issue using notes from this thread)

mball-agathos avatar Sep 26 '24 15:09 mball-agathos

I'm curious whether Git Town 16 is still merging main into the parent branch before deleting it.

Git Town 15.0 changed the behavior of git ship. It no longer modifies the branch being shipped, as doing so would trigger an additional CI run and require a fresh code review. Instead, git ship now delivers exactly what was already reviewed and tested.

A good way to check what commands Git Town is running are the human-readable end-to-end tests. Here's the one for shipping the oldest branch in a stack using the squash-merge ship strategy: https://github.com/git-town/git-town/blob/main/features/ship/squash_merge/current_branch/stacks/parent_branch.feature

kevgo avatar Sep 29 '24 14:09 kevgo

The issue with automatic resolution of phantom merge conflicts caused by squash-merging is still an ongoing challenge. We've had some great discussions in this thread, and a few promising approaches have been proposed that could bring us closer to a solution.

Approach 1: Detecting Phantom Merge Conflicts

One idea is to detect whether a merge conflict is actually a phantom merge conflict and automatically resolve it if so. Here's a rough outline of how that might work:

  • Git Town compares the conflicting file on the main branch to the same file on the previous parent branch. This check should also include metadata like file attributes and permissions.
  • If the versions are identical, it's a phantom conflict, and we can safely resolve it by favoring the version on the feature branch.
  • If the versions differ, it's a genuine conflict, which means external changes have been introduced, and the user will need to resolve this manually.

For this to work, Git Town would need to know the previous parent branch and its SHA. This information is available when syncing a stack whose oldest branch was just shipped!

Approach 2: Avoiding Phantom Merge Conflicts

Another approach would be to prevent phantom conflicts in the first place. This could be done by merging a just-shipped branch one more time into its descendant branches when detecting that it was shipped remotely.

kevgo avatar Sep 29 '24 18:09 kevgo

Approach 2 would be better but will probably only work for the merge sync strategy, not the rebase and compress strategies. Because in the latter cases, there are different commits with different SHAs, and from Git's perspective they don't have a relationship with each other.

So approach 1 seems to way to go here.

kevgo avatar Oct 04 '24 17:10 kevgo

@kevgo I'm just getting back to this again and ran into a merge conflict with two stacked GitHub PRs after running git town ship on the parent branch (in Git Town 16.4.1). In this particular example, I'm running into issues with "both added" and "both modified" git merge conflicts. Also, I noted from my git log, that the approach I suggested earlier of merging in the parent branch first (the one just shipped) would not alone have fixed my merge conflicts -- it's necessary to also use --strategy ours.

With regard to Approach 1, I agree that this is better than Approach 2, but I'm also afraid that the solution of detecting merge conflicts is getting outside of git town's wheelhouse and could be a can of worms (in particular, trying to handle all the permutations of possible types of merge conflicts). Instead, I'm wondering if there could just be an approach that works from the foundation of identifying git hashes with exactly matching file contents (i.e., git diff returns zero changes between the two hashes), and then using a git merge --strategy ours.

The general approach would be as follows:

  1. Merge the latest version of the parent branch into the child branch. This itself can be a bit tricky and involves a bunch of edge cases. But at the end of this step, the child branch should have an ancestor in its git log whose file contents exactly match that of a commit on the main branch (assuming that we're using GitHub Squash and Merge, or the equivalent on other hosts).
  2. Confirm (possibly via a git diff --nameonly hash1 hash2) that the contents of the (probably now deleted) parent branch matches a commit on the main branch (the "Squash and Merge" commit, also the merge_commit_sha as returned from the GitHub Get a Pull Request API).
  3. Merge this commit from the main branch into the child branch using git merge --strategy ours.
  4. Confirm that this merge was essentially a no-op on the child branch by seeing that git diff HEAD~ returns no results.

With regard to step 1 above, the git town ship command would ideally merge the parent branch into all its children so that subsequent updates can be performed without having to resolve merge conflicts. Since it's common to have GitHub configured to automatically delete the feature branch after merging a pull request, it can be difficult to get this latest feature branch without some kind of workflow that temporarily restores the feature branch so that it can be fetched. But if someone is using git town ship, then it's possible to make sure that the local repository has the latest version of the parent feature branch before it gets remotely deleted.

mball-agathos avatar Oct 24 '24 18:10 mball-agathos

Another approach that I just verified is to merge main into the parent first, and then merge this into the child. This creates an extra merge commit in the child branch's history, but avoids the need to invoke git merge --strategy ours.

*   1eb6d58e9 2024-10-24 [child] Merge commit '96d247d5d' into 'child'
|\  
| *   96d247d5d 2024-10-24 [detached] Merge branch 'main' into (now deleted) 'parent' branch
| |\  
| | * 7a6a81140 2024-10-24 [main] Parent feature as squash-merged into main (exactly matches hash 9bbba2da6)
* | | c17e3816b 2024-10-23 [child] Merge branch 'parent' into 'child'
|\| | 
| * | 9bbba2da6 2024-10-23 [parent] Merge branch 'main' into 'parent'
| |\| 
* | | 1bf462efb 2024-10-23 [child] Child feature (squashed to one commit)
|/ /  
| * bc730d7ff 2024-10-23 [main] Some other commit to main that will need to be merged later
* | b697ff8a2 2024-10-23 [parent] Parent feature (squashed to one commit)
|/  
* 47c33f7ee 2024-10-22 [main] Base commit

In this example above, the parent branch has been squashed merged into main and deleted already. But in the child branch's history, we can identify which commit is associated with the parent branch. By first merging main into this parent branch commit (in a detached head state), it then becomes possible to merge that detached head into the child branch to avoid merge conflicts.

Since this approach creates two merge commits in the child branch, git town could use an approach where it just tries to directly merge main first into the child, and if this results in a merge conflict, then to abort the merge and do the two-step process of merging main into the parent first, and then into the child.

mball-agathos avatar Oct 24 '24 19:10 mball-agathos

My ideal situation would be Git Town rebasing child branches on top of the new main branch. After merging A, I don't want B to have A's commits visible in the PR, so I want to rebase B and C and force push.

Let's say that I have a stack of PRs A -> B -> C. Here's what I do manually that I would like Git Town to do for me.

(I'm using the following form of git rebase: git rebase [<options>] --onto <newbase> <upstream> <branch>)

  1. Merge A using the GitHub UI
  2. Run git fetch
  3. Run git rebase --update-refs --onto origin/main A C
  4. Run git push --force origin B C
  5. Run git branch --delete --force A (git branch -D A)

stephenwade avatar Oct 28 '24 22:10 stephenwade

Hi Stephen,

Your use case is already well-supported via the git town compress command. See https://www.git-town.com/commands/compress

Ideally, you'd first merge main into the child branch (possibly using ideas from this thread), then clean up with the compress command.

On Mon, Oct 28, 2024, 16:28 Stephen Wade @.***> wrote:

My ideal situation would be Git Town rebasing child branches on top of the new main branch. After merging A, I don't want B to have A's commits visible in the PR, so I want to rebase B and C and force push.

Let's say that I have a stack of PRs A -> B -> C. Here's what I do manually that I would like Git Town to do for me.

(I'm using the following form of git rebase: git rebase [] --onto )

  1. Merge A using the GitHub UI
  2. Run git fetch
  3. Run git rebase --update-refs --onto origin/main A C
  4. Run git push --force origin B C

— Reply to this email directly, view it on GitHub https://github.com/git-town/git-town/issues/3841#issuecomment-2442792472, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADA3XFIKGOH2GEVVVLSANTZ523A5AVCNFSM6AAAAABMJZZKQOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINBSG44TENBXGI . You are receiving this because you are subscribed to this thread.Message ID: @.***>

heisencoder avatar Oct 29 '24 00:10 heisencoder

Hi Matt,

I can see how I might accomplish this (maybe I could merge A, then compress A before running git sync -a). My point is that I want to be able to do this:

  1. Merge A on GitHub
  2. Run git sync -a

And the end result should be B and C rebased onto the new main.

stephenwade avatar Oct 29 '24 16:10 stephenwade

Hi Stephen,

The whole issue as described in this particular thread is essentially due to GitHub doing a rebase (i.e. "Squash and Merge") of (in this example) branch A onto main, which then potentially causes merge issues for B and C. Along these lines, I've found it's a little easier to use merge instead of rebase to fix the issue first, and then (if desired) let git town perform a cleanup rebase later using the compress command. Prematurely doing a rebase erases all the merge hints that help git in deciding how to resolve the merge conflicts.

Having git town sync perform a rebase by default is a little more dangerous, and also messes up GitHub Pull Requests that are currently in review (because it removes the commits that have review comments). It's better to let git town sync do merges. Later, (before the Pull Requests are sent out for review) you can run git town compress --stack if you prefer each Pull Request to be in one commit.

mball-agathos avatar Oct 29 '24 16:10 mball-agathos

I understand the discussion in this thread is about "Squash and merge". I use this GitHub setting with the Git Town setting sync-feature-strategy set to "rebase".

I understand that it would be easier to use the merge strategy. However, after A is merged, then all of A's commits appear in B's pull request. compress doesn't get the result I want because I want all of B's individual commits to be reviewed.

My comments are about what Git Town can do when the user tells us they are okay with rebasing.

stephenwade avatar Oct 29 '24 16:10 stephenwade

I added end-to-end tests that reproduce phantom merge conflicts in https://github.com/git-town/git-town/pull/4181. An interesting insight is that phantom merge conflicts don't happen with the rebase sync-feature-strategy - at least I wasn't able to produce them. I can produce them for the merge and compress sync strategies. Maybe that's because my machine has rerere enabled?

kevgo avatar Oct 30 '24 21:10 kevgo

My ideal situation would be Git Town rebasing child branches on top of the new main branch. After merging A, I don't want B to have A's commits visible in the PR, so I want to rebase B and C and force push.

@stephenwade isn't this how the existing rebase strategy works? See the end-to-end test as a working example.

  • before we have this stack:

    main
     \
      alpha  (contains "alpha commit")
        \
         beta  (contains "beta commit")
    

    branch alpha contains alpha commit branch beta contains beta commit

  • branch alpha gets shipped at the remote

  • we run git sync

    • Git Town runs git rebase main --no-update-refs on the beta branch
    • then it runs git push --force-with-lease --force-if-includes on that branch
  • now we have this hierarchy:

    main  (contains "alpha commit")
     \
      beta  (contains "beta commit")
    

    branch main contains alpha commit branch beta contains only beta commit

Am I missing something?

kevgo avatar Oct 30 '24 22:10 kevgo

@mball-agathos thanks for all the super helpful research. I agree; handling the various types of merge conflicts can be complex. That said, I believe that the phantom merge conflicts we care about in this ticket should be relatively straightforward to identify.

  • The error message verbs should match: e.g., CONFLICT (add/add) or CONFLICT (change/change). Any other pairing, like (add/change), indicates a true merge conflict rather than a phantom one.
  • As far as I know, the only types of changes are add, remove, and change.
  • We’re using your approach of diffing the conflicting file on the main branch and the shipped branch to identify conflicts.

I'm okay if Git Town fixes only a subset of phantom merge conflicts for now. My primary concern in this ticket is managing phantom merge conflicts when shipping the oldest branch in a stack.

kevgo avatar Oct 31 '24 00:10 kevgo

Using Git's content-addressable storage system makes it easy to compare file versions on different branches.

Show the status of all conflicting files after encountering a merge conflict: git ls-files --unmerged

Example output:

100755 c887ff2255bb9e9440f9456bcf8d310bc8d718d4 2	file
100755 ece1e56bf2125e5b114644258872f04bc375ba69 3	file
  • The first column is the file permissions. They must match in phantom merge conflicts.
  • The second column is the SHA1 of the file content (blob). This is not the commit SHA, it's the checksum of the file content in a commit.
  • The third column is the stage number.
  • The fourth column is the path of the conflicting file.
  • Stage 1 (not in the example output) is the common ancestor (base version).
  • Stage 2 (c887ff in the example output) is the the local file (on the current branch)
  • Stage 3 (ece1e5) is the version being merged in (in this example from the parent branch)

To see the SHA1 of the content blob of a file on a particular branch:

git ls-tree <branch> <file-path>

For example, to see the content blob SHA1 of file file on branch main:

git ls-tree main file

Example output:

100755 blob ece1e56bf2125e5b114644258872f04bc375ba69	file

The file content is identical if the content checksum (SHA1 of the content blob) is identical. In the example above, the SHA1 of the incoming file version (stage 3, from the parent branch) is identical to the SHA1 of that file on the main branch: ece1e5. This means it's a phantom merge conflict.

kevgo avatar Oct 31 '24 01:10 kevgo

Thanks @kevgo for the insights and information on using both git rerere and ls-tree! I look forward to trying out this fix in the next release!

mball-agathos avatar Oct 31 '24 16:10 mball-agathos

I want to rebase B and C and force push.

isn't this how the existing rebase strategy works?

@kevgo Yes, almost.

The existing method doesn't work if there are commits in alpha that modify the same lines.

When we run git rebase main on the beta branch, it tries to apply all the commits from alpha and only drops them if it finds all the patch contents on main.

~/squash-test beta
❯ git rebase main
dropping ef0b43755d18a1e4b5c2cd42f2f048e79fc81a86 alpha commit 1 -- patch contents already upstream
dropping b409c199c648d316f8d4a965adc093cb6f735c46 alpha commit 2 -- patch contents already upstream
Successfully rebased and updated refs/heads/beta.

If multiple commits modify the same lines, it won't be able to find all the patch contents on main and it will report a merge conflict.

~/squash-conflict beta
❯ git rebase main
Auto-merging favorite-fruit
CONFLICT (add/add): Merge conflict in favorite-fruit
error: could not apply 4f905d5... alpha commit 1
Could not apply 4f905d5... alpha commit 1

My suggestion is to run git rebase alpha --onto main on the beta branch. This only tries to apply the commits that belong to beta (the commits that would be logged by git log alpha..beta), avoiding the need to resolve any phantom conflicts.

~/squash-avoid-conflict beta
❯ git rebase --onto main alpha
Successfully rebased and updated refs/heads/beta.

Here's the full list of commands I wrote to produce each example.

mkdir squash-test; cd squash-test
git init
git commit -m 'initial commit' --allow-empty
git switch -c alpha
touch file1; git add .; git commit -m 'alpha commit 1'
touch file2; git add .; git commit -m 'alpha commit 2'
git switch -c beta
touch file3; git add .; git commit -m 'beta commit 1'
touch file4; git add .; git commit -m 'beta commit 2'
git switch main
git merge --squash alpha  # Squash commit -- not updating HEAD
git commit -m 'Squash alpha branch'
git branch -D alpha
git switch beta
git rebase main
# dropping ef0b43755d18a1e4b5c2cd42f2f048e79fc81a86 alpha commit 1 -- patch contents already upstream
# dropping b409c199c648d316f8d4a965adc093cb6f735c46 alpha commit 2 -- patch contents already upstream
# Successfully rebased and updated refs/heads/beta.
mkdir squash-conflict; cd squash-conflict
git init
git commit -m 'initial commit' --allow-empty
git switch -c alpha
echo 'apple' > favorite-fruit; git add .; git commit -m 'alpha commit 1'
echo 'peach' > favorite-fruit; git add .; git commit -m 'alpha commit 2'
git switch -c beta
echo 'pepperoni' > favorite-pizza; git add .; git commit -m 'beta commit 1'
echo 'pineapple' > favorite-pizza; git add .; git commit -m 'beta commit 2'
git switch main
git merge --squash alpha  # Squash commit -- not updating HEAD
git commit -m 'Squash alpha branch'
git branch -D alpha
git switch beta
git rebase main
# Auto-merging favorite-fruit
# CONFLICT (add/add): Merge conflict in favorite-fruit
# error: could not apply 4f905d5... alpha commit 1
# Could not apply 4f905d5... alpha commit 1
mkdir squash-avoid-conflict; cd squash-avoid-conflict
git init
git commit -m 'initial commit' --allow-empty
git switch -c alpha
echo 'apples' > favorite-fruit; git add .; git commit -m 'alpha commit 1'
echo 'peaches' > favorite-fruit; git add .; git commit -m 'alpha commit 2'
git switch -c beta
echo 'pepperoni' > favorite-pizza; git add .; git commit -m 'beta commit 1'
echo 'pineapple' > favorite-pizza; git add .; git commit -m 'beta commit 2'
git switch main
git merge --squash alpha  # Squash commit -- not updating HEAD
git commit -m 'Squash alpha branch'
git switch beta
git rebase --onto main alpha
git branch -D alpha
# Successfully rebased and updated refs/heads/beta.

stephenwade avatar Oct 31 '24 18:10 stephenwade

Updated criteria for detecting phantom merge conflict for an unmerged file path in a merge conflict:

  • permissions are the same
  • content blob SHA of file in the incoming change == content blob SHA of file on the main branch
  • the parent of the branch experiencing the merge conflict is not the main branch

kevgo avatar Oct 31 '24 19:10 kevgo

Phew, this is implemented now. Made the product quite a bit more complex, but I think that complexity is well spent given that they solve phantom merge conflicts, which are a showstopper problem.

kevgo avatar Nov 01 '24 22:11 kevgo

@stephenwade thanks for the detailed repro for the rebasing issue that requires --onto! I think I get the idea now. Will try to repro in an end-to-end test so that we can track the correct implementation of this. This thread has grown quite a bit, making it tough to parse through all the ideas being discussed. Since the original issue has been addressed, I’ll go ahead and split out the remaining topics into individual tickets so we can keep discussions more focused and actionable. Let’s continue the --onto discussion over in https://github.com/git-town/git-town/issues/4189. Thanks!

kevgo avatar Nov 01 '24 22:11 kevgo