Streamline support for Fork based workflows
Currently the recommended approach for dealing with fork based workflows is the following.
Say I am working on an open source project where I have a fork of the official repository.
I would have a local clone of my fork with main as the local branch and origin/main as the configured tracking branch. Then I would have a separate remote named upstream that points to the official repository.
My stack of changes would be on main and would effectively be the commits from origin/main to main.
Ideally you would probably want to request review with `gps rr <patch-index>`; however, currently this would create a pull request against your origin/main, which you DO NOT want.
To work around this, at the moment, instead of doing a request review you can do a `gps sync <patch-index>`, which will create the request review branch for you. Then you can go to GitHub, Bitbucket, etc. and manually create a Pull Request from that branch into the fork's upstream.
Then once your PR is accepted and integrated into the fork's upstream main, you would pull the changes down using the following.
```
git fetch --all
git push -f origin upstream/main:main
gps pull
```
Looking at this workflow it seems like there are two areas we could probably improve this pretty easily.
- Add support to the `gps rr` command to specify a different base branch. There is actually already a ticket for this, #19.
- Add support to the `gps pull` command to be able to "match" the remote tracking branch to a specified branch, e.g. `gps pull -m upstream/main` (#133)
I was looking into how this would work, but it looks to be a bit more complicated in terms of what data would be passed to the hook.
If we assume we've got:
- An upstream owner: `UPSTREAM_OWNER`
- An upstream repo: `UPSTREAM_REPO`
- An upstream trunk we want to merge into: `TRUNK`
- A forked owner: `ORIGIN_OWNER`
- A forked repo: `ORIGIN_REPO`
- A forked branch we want to merge: `FORK_BRANCH`
Then what we do for Bitbucket, GitHub, and GitLab is all different:
- For Bitbucket, you:
  - Open a PR against upstream: `POST /2.0/repositories/{UPSTREAM_OWNER}/{UPSTREAM_REPO}/pullrequests`
  - And provide a body like: `{ "destination": { "branch": { "name": "TRUNK" } }, "source": { "branch": { "name": "FORK_BRANCH" }, "repository": { "full_name": "ORIGIN_OWNER/ORIGIN_REPO" } } }`
- For GitHub, you:
  - Open a PR against upstream: `POST /repos/{UPSTREAM_OWNER}/{UPSTREAM_REPO}/pulls`
  - And provide a body like: `{ "base": "TRUNK", "head": "ORIGIN_OWNER:FORK_BRANCH" }`
- For GitLab, you:
  - Open a MR against the fork: `POST /projects/{ORIGIN_OWNER}%2F{ORIGIN_REPO}/merge_requests`
  - And provide a body like: `{ "source_branch": "FORK_BRANCH", "target_branch": "TRUNK", "target_project_id": "UPSTREAM_ID" }`
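To make the GitHub variant above concrete, here is a hedged sketch; the owner, repo, and branch values are all placeholders, and `$GITHUB_TOKEN` is assumed to exist in the environment:

```shell
# Build the request body for the GitHub case; all values are placeholders.
payload=$(printf '{ "base": "%s", "head": "%s:%s" }' \
  "main" "fork-owner" "some-branch-name")
echo "$payload"

# The actual call would then look something like (not executed here):
# curl -X POST \
#   -H "Authorization: Bearer $GITHUB_TOKEN" \
#   -H "Accept: application/vnd.github+json" \
#   -d "$payload" \
#   https://api.github.com/repos/upstream-owner/upstream-repo/pulls
```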
It's totally possible to figure these things out. But it seems like it'd take a bit more work to actually let the hooks be reusable instead of being specific to GitHub. In particular, it seems that GitLab uses numeric ids instead of the owner/repo approach for the upstream. It might actually accept both, but it's kind of unclear. In any case, that's the kind of thing a hook specific to GitLab would have to figure out.
The question I guess is where does the logic live? If the logic needs to live in the hooks, then every hook is going to have to re-figure out how to interpret something like upstream/main and main into the six variables it needs (FORK_BRANCH, TRUNK, etc.).
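For what it's worth, the string handling itself is small. Here is a sketch of how `upstream/main` could be split and an owner/repo pair derived from a remote URL; the URL parsing only covers the common https/ssh forms and is an assumption for illustration, not existing gps behavior:

```shell
# Split a "remote/branch" style ref into its two parts.
ref="upstream/main"
remote="${ref%%/*}"   # "upstream"
branch="${ref#*/}"    # "main"

# Derive OWNER/REPO from a remote URL (a real implementation would get
# this via: git remote get-url "$remote"). Placeholder URL shown here.
url="git@github.com:upstream-owner/upstream-repo.git"
path=$(echo "$url" | sed -E 's#\.git$##; s#^.*[:/]([^/]+/[^/]+)$#\1#')
UPSTREAM_OWNER="${path%%/*}"
UPSTREAM_REPO="${path#*/}"

echo "$remote $branch $UPSTREAM_OWNER $UPSTREAM_REPO"
```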
If the logic lives in gps, then it seems like the hooks would need to be given seven(!) arguments:
- `RE_REQUESTING_REVIEW` (required, this already exists)
- `FORK_BRANCH` (required, this already exists)
- `TRUNK` (required, this already exists)
- `UPSTREAM_OWNER` (optional, this would be new)
- `UPSTREAM_REPO` (optional, this would be new)
- `ORIGIN_OWNER` (optional, this would be new)
- `ORIGIN_REPO` (optional, this would be new)
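A hook could then distinguish the two cases by argument count. A sketch, where the function name and the output format are made up for illustration:

```shell
# Hypothetical hook body: three args means the same-repo flow,
# seven means the fork flow with the extra owner/repo arguments.
request_review_hook() {
  re_requesting_review="$1"
  fork_branch="$2"
  trunk="$3"
  if [ "$#" -ge 7 ]; then
    upstream_owner="$4"; upstream_repo="$5"
    origin_owner="$6"; origin_repo="$7"
    echo "fork flow: $origin_owner/$origin_repo:$fork_branch -> $upstream_owner/$upstream_repo:$trunk"
  else
    echo "same-repo flow: $fork_branch -> $trunk"
  fi
}

request_review_hook false some-branch-name main
request_review_hook false some-branch-name main up-owner up-repo fork-owner fork-repo
```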
Seven arguments is a lot. So, I recognize that this might seem overkill. I'm not sure how else to keep a separation between the methodology and the hook short of introducing some ad-hoc encoding scheme. But, this approach means that the logic only has to be figured out once and the hooks can case on whether they receive three arguments or seven.
I didn't look into the gps pull side, so I dunno if there are similar issues there. But I was considering making a PR for this, and it got a bit complex, so I wanted to see what your thoughts were.
Oh, I guess another thing to mention is that the gh CLI works in a "smart" way in that you decide up front whether you're making PRs against origin or upstream and it does the rest for you. So in the case where someone is using gps on a forked GitHub repo and also using the gh CLI in the hook, the first step is kind of already done if they set up gh properly. On the flip side, if they don't set it up properly, or if they aren't using gh, it's back to square one.
So, while this isn't a 100% ideal solution, gps has repository specific hooks so that things like this can currently be hard coded into the hook.
Maybe it would make sense to provide a shell library that can be used in the hooks. That way people can just call a function and pass hard coded values in for the service specific differences, and pass in the shared variables from gps.
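Something like the following is the shape I'd imagine for such a library function; `open_github_pr` is a hypothetical name, nothing here is an actual gps API, and the sketch just prints the request it would make instead of calling the service:

```shell
# Hypothetical library function: the hook passes in the hard-coded
# service-specific values alongside the shared variables from gps.
open_github_pr() {
  upstream_owner="$1"; upstream_repo="$2"; trunk="$3"
  origin_owner="$4"; fork_branch="$5"
  # A real implementation would curl the service's API here; this
  # sketch only prints the request it would make.
  printf 'POST /repos/%s/%s/pulls {"base":"%s","head":"%s:%s"}\n' \
    "$upstream_owner" "$upstream_repo" "$trunk" "$origin_owner" "$fork_branch"
}

# In a hook, the owner/repo values could be hard coded per repository
# and the branch/trunk values passed through from gps:
open_github_pr upstream-owner upstream-repo main fork-owner some-branch-name
```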
As a general rule of thumb gps shouldn't know anything about GitHub, Bitbucket, GitLab or any other SCM hosting provider. All gps should know about is Git.
At least that is the separation that I am trying to maintain.
Would it be possible to configure the push remote as a workaround for git-ps branches? It would still be something that would have to be configured, but at least it should be possible to do so only once (if your remotes are named consistently across repos).
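For reference, git itself has configuration for a default push remote, which is what this workaround would lean on. This is plain git configuration, nothing gps-specific, shown as a fragment:

```shell
# Make a bare `git push` default to the fork regardless of the
# branch's fetch/tracking remote:
git config remote.pushDefault origin

# Or scope it to a single branch:
git config branch.main.pushRemote origin
```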
Yeah, we could always add a repos specific configuration for this as well that we could use to address gps rr working with the correct remote.
The current recommended workflow is as follows and assumes that you have a fork of a repository which origin maps to.
If we have a patch stack setup from origin/main and main we can develop our patches directly on top of it as usual with gps. Once we have a patch or series of patches ready we need to create a branch and push it up to origin. With gps this is as easy as the following.
```
gps branch -p -n some-branch-name <start-patch-index> [end-patch-index]
```
The above command will create a local branch from the patch(es) and push it up to the remote, in this case origin. Then you simply create a pull request from that branch on your fork to the upstream repository manually using Bitbucket or GitHub.
If your PR is approved and integrated into upstream's main, then you would update your origin/main based on upstream/main. This would look something like the following.
```
git fetch upstream
git push origin upstream/main:main
gps pull
```
This will collapse the integrated patch(es) out of your stack as they are now included in origin/main.
If your PR wasn't approved and they had some comments, you simply modify your patches with gps as you normally would and then use the gps branch command again with the same branch name.
```
gps branch -p -n some-branch-name <start-patch-index> [end-patch-index]
```
This will update the branch with the updated version of the patch(es).
The above is what we should probably look at simplifying in some way to better support this workflow.