capstone icon indicating copy to clipboard operation
capstone copied to clipboard

Clarify Capstone branching strategy

Open tmfink opened this issue 4 years ago • 4 comments

Historically, this project has used the active branches:

  • Past version branches (v2, v3, v4)
  • next: what will be the next release

The master is not really been used used, but is still the default HEAD branch. This confuses contributors.

As of writing:

  • master branch is c72fc8185ed4088c3486f621d150fbcf5f980aa0
  • next branch is f278de39c1e8a9fca977b8dfeed99d6d1f8b82bf

I propose Formalize the branching strategy of the project

  • [ ] Document branching strategy
  • [ ] Update repository HEAD (default) branch (if needed)
  • [ ] Deprecate any branches (if needed)
    • We can do a final commit to the README in that branch that describes the branch as deprecated

tmfink avatar Aug 29 '21 07:08 tmfink

My proposed branching strategy is to "support" branches:

  • next: contains all active development/new features
  • old version "vN" branches (v2, v3, ...)
  • deprecate master branch

Guidelines:

  1. All new features commits go into the next branch
  2. Bugs should be fixed in the "newest" branch branch first
    • As applicable, cherry-pick/merge bug fixes back to "vN" if they are still supported
  3. When we create a new major release of capstone, create the new "vN" from the next branch

tmfink avatar Aug 29 '21 07:08 tmfink

I have to agree that the names are confusing for developers new to Capstone. There have been so many issues opened where the solution was basically "use the next branch".

It makes sense to have an experimental / unstable / next - type branch when there is a high rate of commits. But the rate of commits here seems pretty slow. I think it would be less confusing for new people if next (or whatever it comes to be called) became the primary branch.

keenk avatar Sep 07 '21 03:09 keenk

@aquynh could you change the default branch to next? I see several PRs were mistakenly opened against master.

https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-branches-in-your-repository/changing-the-default-branch

tmfink avatar Nov 11 '21 06:11 tmfink

I will copy next to master soon, yes.

aquynh avatar Nov 11 '21 08:11 aquynh

@aquynh @kabeor since master is deprecated for a long time, and a lot of activity happened in next since, do you think it makes sense to make it default now?

XVilka avatar May 04 '23 09:05 XVilka

i have been thinking about renaming "master" to "master_old", then clone "next" to a new "master".

does this break anything?

aquynh avatar May 04 '23 10:05 aquynh

@aquynh another option would be to let master be, but to make a new branch - main, as the default branch name for Git and GitHub these days, and base it on the latest next. Then, main would represent a "stable" branch, which would allow picking up small patches for making patch releases, while keeping next the default branch. It's also possible to make next default branch on the GitHub in this strategy, but base releases on main, merging next into main when you think it's ready for the next release. How does it sound?

This way it would guarantee not to break anything that relied on master for one reason or another.

XVilka avatar May 04 '23 10:05 XVilka

sounds perfect to me.

aquynh avatar May 04 '23 10:05 aquynh

please see https://github.com/capstone-engine/capstone/issues/2011

we have the main branch as the default branch now.

aquynh avatar May 04 '23 10:05 aquynh

(Response added here instead of #2011 so I can quote @XVilka comment)

@aquynh another option would be to let master be, but to make a new branch - main, as the default branch name for Git and GitHub these days, and base it on the latest next. Then, main would represent a "stable" branch, which would allow picking up small patches for making patch releases, while keeping next the default branch. It's also possible to make next default branch on the GitHub in this strategy, but base releases on main, merging next into main when you think it's ready for the next release. How does it sound?

This way it would guarantee not to break anything that relied on master for one reason or another.

I don't think we want to do this since there would be no good way to update the main (latest stable) branch from next. The next branch is supposed to be the "next" version, so we would want the next branch history to just become the next main branch.

Our options would be:

  1. Do hard reset of stable to next when we pull a create a new major version.
    • This would regularly break everyone's Git checkouts and make it much more difficult to work with the project.
  2. Merge next into stable
    • This would cause crazy conflicts and we would just want to take everything from the next branch anyway.

Also, we want next to be the default branch because that's where "new development" should go. Only bug fixes should go into any "stable" branch. Also, bugs should be fixed in the forward development branch (AKA next) first. If necessary, bug fixes can be cherry-picked to an older branch.

Proposal

  • Stable versions have a branch name that indicates the version (like we have been doing)
  • Deprecate master and main
    • Delete main since it's so new and almost nobody will have checked it out
    • Rename master to master_old. Github automatically redirects URLs when we rename the branch. Causing breakage for this case would not be bad since this branch is long since deprecated.
  • Make next the default branch (repo HEAD)

Example

If the latest "stable" version is "6.X.X", then we would have branches:

  • next
    • Default HEAD branch; commits go here by default
    • New features for future major version "7.X.X" will go here
  • v6, v5, ...
    • Bug fixes for "stable" releases go here

When we release a new major version, such as "7.X.X", we would only need pull a new branch v7 from next.

To fix a bug in any version "train", we simply need to commit to the branch for the corresponding version.

I would be happy to document this process in a wiki page.

tmfink avatar May 04 '23 17:05 tmfink

the main branch is only updated when we release the new versions, and at that time we merge next to main, like we always did with the master branch before.

the reason I made master deprecated is because there were some mistakes so it is impossible to merge next to it without a lot of nasty fixes.

sorry that i don't get what you mean: what is wrong with the above plan?

aquynh avatar May 04 '23 17:05 aquynh

Sorry for not being clear. :smile:

the main branch is only updated when we release the new versions, and at that time we merge next to main, like we always did with the master branch before.

Would you only update main when releasing an new major version, or also for minor/patch versions? For example, would you update main when incrementing major version going from version "5.99.99" to version "6.0.0"? Would you update main when incrementing a patch version from version "6.0.0" to version "6.0.1"?

If you only update main for major versions, then main would always be very out of date compared to the version branch (like v6).

If you update main for every new release (including minor/patch versions), then when you jump major versions, the merge won't be clean. You will get a lot of conflicts because the code for "5.99.99" will be much different than the code for next that will correspond to "6.0.0". A git merge is meant to merge separate threads of development, but main would not be a thread of development--it would effectively be a pointer that gets updated.

Git branches exist so you can commit into them. It sounds like you will never commit directly into main but only do a fast-forward merge OR a git reset --hard.

It sounds like you want a "read only branch". Instead of using a branch, you could just create a "latest-stable" tag that you update on every release. For example, the neovim repo has a stable tag that gets updated for every release.

If you want folks to be able to easily determine the latest release but avoid committing to a main branch, then I recommend adding a stable tag. I think "stable" or "latest-stable" is a more descriptive name than "main" for the tag.

tmfink avatar May 05 '23 02:05 tmfink

@tmfink you can avoid conflicts during the merge using:

# when we are on the "next" branch
git checkout -b main-merge
git merge main -s ours
git checkout main
git merge main-merge --no-ff -m "Merge 'next' branch into 'main'"

XVilka avatar May 05 '23 07:05 XVilka

@tmfink you can avoid conflicts during the merge using:

@XVilka Yes, we could use the "ours" merge strategy to effectively ignore all changes that went into the latest stable release branch (e.g. v6) after it was pulled from the next branch. But, I don't see the benefit and think it would lead to contributor confusion.

Why main branch would confuse first-time contributors

The main branch (AKA "latest stable") would not be an active "thread of development". However, the most frequently followed convention is to have branches with names such as main be the "forward development branch".

Even if we lock the main branch using GitHub settings to disallow pull requests, many contributors will end up pulling a branch from main to do work. Then, they'll try to make a pull request from their branch to main but be prevented (by GitHub). They will then be confused why it does not work and need to learn how Capstone's branching strategy deviates from the norm.

In this case, the contributor will need to rebase their changes to the next branch. The rebase might have conflicts and thus the contributor may have wasted effort, but the biggest issue is that rebasing is a somewhat advanced topic and not a simple task for many users. This will sometimes require back and forth in discussions in the PR thread explaining how to git rebase (or equivalent) and make a new PR.

Even if the contributor wants to commit a bug fix to both next and the latest stable branch, bug fixes should go into next branch first ("always fix forward first").

How "ours" merge strategy makes history more confusing

Looking at git log main or the output of commands like git branch --contains 12345 could lead to the false conclusion that the main branch has a commit that it does not actually have since commits that went into "stable" branches (v5, v6, ...) would exist in the history but would have been discarded once the latest next branch is merged in.

Summary

In summary, I think we want to avoid violating the principle of least astonishment to save the time of both first-time contributors (doing wasted work) and long-time contributors (having to explain "please make a new PR").

We should delete/deprecate/lock the master/main branches and only use the next and "stable version branches" (v1, v2, v3, ...).

tmfink avatar May 07 '23 05:05 tmfink

@XVilka I think both of those strategies would be more complicated and require more work for all contributors than what I have described above.

@aquynh another option would be to let master be, but to make a new branch - main, as the default branch name for Git and GitHub these days, and base it on the latest next. Then, main would represent a "stable" branch, which would allow picking up small patches for making patch releases, while keeping next the default branch.

We need to have separate branches for each "release train" ("2.X", "3.X", ...). In fact, we already have branches v2, v3, ... for this exact purpose. It would be confusing and error prone to "juggle" maintaining the latest release branch (e.g. v4) and a new branch main. Both branches would need to have identical state--it makes to keep a main branch`.

It's also possible to make next default branch on the GitHub in this strategy, but base releases on main, merging next into main when you think it's ready for the next release. How does it sound?

Per my previous point, there's no benefit to having a main branch when we have a next branch and v* branches already.

Basically, my recommendation (since we have been doing it this way for years) is to have our next branch be our forward development branch (most projects use a main or master branch for this purpose). The only other branches we need are for stable v* version branches (v2, v3, v4, ...) that get pulled from next when we are getting ready for that release train.

Adding an additional main branch has no benefit. If we want users to be able to see what the latest stable release is, we can just have a Git tag called latest-stable that we update on every release like the Neovim project does as I described in the above comment.

tmfink avatar May 07 '23 06:05 tmfink

@tmfink for me it sounds like a viable strategy too, let's see what @aquynh and @kabeor think.

XVilka avatar May 07 '23 06:05 XVilka

We can rename "main" to "v5", does that work for you?

And we can make "next" the default branch.

aquynh avatar May 07 '23 11:05 aquynh

Perfect!

On Sun, May 7, 2023, 04:59 Nguyen Anh Quynh @.***> wrote:

We can rename "main" to "v5", does that work for you?

And we can make "next" the default branch.

— Reply to this email directly, view it on GitHub https://github.com/capstone-engine/capstone/issues/1781#issuecomment-1537422561, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABIFROHSQ7574WGDLIIE2A3XE6FCXANCNFSM5C76XM3A . You are receiving this because you were mentioned.Message ID: @.***>

tmfink avatar May 07 '23 15:05 tmfink

I also agree that would work for us with the auto-sync effort perfectly. 👍

XVilka avatar May 08 '23 04:05 XVilka

ok i made the changes:

  • v5 is for stable release of version 5.
  • next becomes the default branch
  • main is removed

aquynh avatar May 08 '23 04:05 aquynh

Then it makes sense to close this one maybe?

XVilka avatar May 08 '23 04:05 XVilka

I like all of the changes.

I would say the only thing left is to:

  • lock the master branch (I can't do that)
  • add a wiki page documenting the branching (I can handle that)

tmfink avatar May 08 '23 06:05 tmfink

I created a wiki page describing the branching model.

tmfink avatar May 08 '23 06:05 tmfink