cli icon indicating copy to clipboard operation
cli copied to clipboard

`gh pr checkout` adds commits to current branch

Open driesvints opened this issue 2 years ago • 5 comments

Describe the bug

gh version 2.32.1 (2023-07-24)
https://github.com/cli/cli/releases/tag/v2.32.1

In a scenario where a PR that's being sent in from a remote which has the same branch name as the the branch it's being merged into, the commits that are pulled using gh pr checkout are added to the current active branch instead of a new branch being made.

Steps to reproduce the behavior

  1. Take PR https://github.com/laravel/cashier-paddle/pull/205
  2. Clone the laravel/cashier-paddle repo and checkout the master branch
  3. Run gh pr checkout 205
  4. The commits are added to the current active master branch instead of a new branch being created

Expected vs actual behavior

I'd expect a new branch being created which is isolated and its HEAD targets the remote branch of the contributor's repo. Otherwise you could accidentally push to your repo's branch instead of the PR branch which is quite dangerous if the PR isn't ready yet.

Logs

master $ gh pr checkout 205
From github.com:laravel/cashier-paddle
 * branch            refs/pull/205/head -> FETCH_HEAD
Updating 8958784..62e27e3
Fast-forward
 src/Http/Controllers/WebhookController.php |  6 +++---
 src/Transaction.php                        |  8 ++++++++
 tests/Feature/WebhooksTest.php             | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 3 deletions(-)

driesvints avatar Nov 27 '23 13:11 driesvints

Btw, I realise you can simply do gh pr checkout 205 -b fixes. I just was surprised that the above was the default behaviour.

driesvints avatar Nov 27 '23 13:11 driesvints

@driesvints : thank you for opening up this issue! ❤ I've gotta say that is certainly surprising; I'm starting to gather some initial thoughts for prioritization. 💪

Follow ups

After doing a bit of digging, I think I understand what is going on and why. I'd like to figure out how we can make gh pr checkout behavior clearer for other users. 🙇

Under the covers, gh pr checkout is one of the handful commands that GitHub CLI will manage state of a git repository. In this case, gh pr checkout is saying "put this repository in the same state as this pull request". The command is executing git merge --ff-only under the covers in the situation where you have the branch checked out locally.

This has the benefit of only updating your local repository in a non-conflicting way.

Aside from the -b,--branch flag, there is also --detach flag for checking out the PR commit directly rather than as a branch.

Going down the rabbit hole ...

Looking at https://github.com/laravel/cashier-paddle/pull/205, we have laravel:master being the base for karlomikus:master.

When checking out the code related to a PR, one of the concerns that needs to be addressed is whether the pull request crosses multiple repositories such as fork situations like the above.

https://github.com/cli/cli/blob/63094461621f8e91217d9b89f513f53e21670bd9/pkg/cmd/pr/checkout/checkout.go#L73-L126

One of the things this affects is how gh pr checkout checks out the branch locally. Repositories with an existing remote will use a local branch with the same name (for example master) whereas those without an existing remote will avoid clashing with a local branch by prefixing it with the head repository name (for example karlomikus/master)

https://github.com/cli/cli/blob/63094461621f8e91217d9b89f513f53e21670bd9/pkg/cmd/pr/checkout/checkout.go#L141-L173

https://github.com/cli/cli/blob/63094461621f8e91217d9b89f513f53e21670bd9/pkg/cmd/pr/checkout/checkout.go#L175-L212

In both cases, the resulting local branch has remote changes applied via git merge --ff-only if the local branch exists.

Where do we go from here?

Given all of this, are you still concerned that gh pr checkout has the opportunity to introduce destructive changes to local repository branches?

I'd expect a new branch being created which is isolated and its HEAD targets the remote branch of the contributor's repo. Otherwise you could accidentally push to your repo's branch instead of the PR branch which is quite dangerous if the PR isn't ready yet.

Given this PR has already been merged in, I could use your in recreating this scenario in a repeatable process if you are still concerned there is an edge case.

andyfeller avatar Nov 27 '23 20:11 andyfeller

Thanks a bunch for digging into this @andyfeller!

Given all of this, are you still concerned that gh pr checkout has the opportunity to introduce destructive changes to local repository branches?

Yeah I am. I believe the checked out branch should always be prefixed by the head repository name if it's a fork, that way you almost have no chance of a collision.

I can't seem to reproduce it... I tried creating a new repo: https://github.com/driesvints/github-cli-issue-8383 Then fork it and make a change on same branch: https://github.com/fullstackbelgium/GitHub-cli-issue-8383-fork/tree/main Make a PR: https://github.com/driesvints/github-cli-issue-8383/pull/1 But when I do gh pr checkout 1 it creates a new branch with the head repository name as a prefix:

Screenshot 2023-11-28 at 10 20 04

This should be the exact same scenario. I don't know how to reproduce this sorry.

PS. sidenote: I wasn't able to fork the repo to my own account even when I had a different repo name which is odd... So I forked it to my fullstackbelgium org for now.

Screenshot 2023-11-28 at 10 18 07

driesvints avatar Nov 28 '23 09:11 driesvints

Yeah I am. I believe the checked out branch should always be prefixed by the head repository name if it's a fork, that way you almost have no chance of a collision.

That honestly is my preference in general because of the inconsistent behavior.

I appreciate you trying to help figure out how to reproduce it as I believe there is a problem but want to make sure we can solve it. 🙇

andyfeller avatar Dec 04 '23 21:12 andyfeller

Firstly, huge thanks to @samcoe for helping me figure out how to reproduce this issue and summarizing notes from our session! ✨🤗

Steps to reproduce

  1. Fork a repo
  2. Create a PR from the fork default branch to the origin default branch
  3. Locally clone the origin repo and checkout the default branch
  4. In the local repo add the fork repo as a remote
  5. Run gh pr checkout with the PR created in step 2
  6. Watch as local default branch is updated with the commits from the PR

The crux of the problem revolves around step 4 with how we are trying to detect if the branch comes from a forked repository using available remotes to infer it and whether we have the right safeguards in place:

https://github.com/cli/cli/blob/57f6787c154d56d3c5461afed619e36b8a8394ec/pkg/cmd/pr/checkout/checkout.go#L99-L126

Digging deeper

Going back into repo history, the default branch protection has been around for a while if remote for the PR was missing while not being in place if the remote did exist, see https://github.com/cli/cli/commit/36397dbdc5defd12f3eaeb21307d592ed91d300d for example as a part of https://github.com/cli/cli/pull/302 and https://github.com/cli/cli/issues/233.

The presence of the remote diverges into different logic for managing the local branch with the path missing a remote prefixing the branch with the fork name:

    } else if pr.HeadRefName == defaultBranch {
        // avoid naming the new branch the same as the default branch
        localBranch = fmt.Sprintf("%s/%s", pr.HeadRepositoryOwner.Login, localBranch)
    }

https://github.com/cli/cli/blob/57f6787c154d56d3c5461afed619e36b8a8394ec/pkg/cmd/pr/checkout/checkout.go#L175-L191

https://github.com/cli/cli/blob/57f6787c154d56d3c5461afed619e36b8a8394ec/pkg/cmd/pr/checkout/checkout.go#L141-L155

What's next?

The naive thought would be to bring the same local branch namespace approach for fork repos in both code paths.

I think there is a little more history to glean from https://github.com/cli/cli/issues/233 as this spans multiple PRs and thoughts.

One example being some pseudocode written up in https://github.com/cli/cli/pull/451 that ultimately was closed but not merged:

SCENARIOS TO TEST: A. never checked out PR A1. cloned my fork of parent, never added parent A2. cloned my fork of parent, parent added as upstream A3. cloned my fork of parent, parent not added, but upstream remote present (for some reason) A4. cloned parent as origin, my fork added as fork (pr create setup) A5. cloned parent as origin, my fork added as something else entirely A6. -R specified B. have checked out PR already B1. cloned my fork of parent, never added parent B2. cloned my fork of parent, parent added as upstream B3. cloned my fork of parent, parent not added, but upstream remote present (for some reason) B4. cloned parent as origin, my fork added as fork (pr create setup) B5. cloned parent as origin, my fork added as something else entirely B6. -R specified A1 fails. We're not noticing that origin is a fork of something else and looking for a PR there. A2 succeeds. We set the remote for the branch appropriately and find the PR in upstream. A3 "fails" but, like, in that it tries to use the weird upstream and can't find PR. this is a weird case anyway. A4 works A5 works A6 works, but interestingly fails in a different way than A1. It's the graphql could not find PR instead of 'not found.' clearly, error wrapping needs to be improved.

andyfeller avatar Jan 17 '24 15:01 andyfeller