Document the process for release rotation

Open terrytangyuan opened this issue 1 year ago • 20 comments

Summary

Purpose: get more people involved in releases and improve the overall process.


Message from the maintainers:

Love this enhancement proposal? Give it a 👍. We prioritize the proposals with the most 👍.

terrytangyuan avatar Jan 30 '24 21:01 terrytangyuan

Adding teammates of mine for visibility @tico24 @Joibel @isubasinghe

We're happy to support on the release effort. 👍

caelan-io avatar Jan 31 '24 19:01 caelan-io

Since we have two additional approvers, I would suggest one of you (@agilgur5 @isubasinghe) try following the instructions in https://github.com/argoproj/argo-workflows/blob/main/docs/releasing.md and see what is missing so we can improve the docs. I don't think we need any separate documentation for this. WDYT? Any volunteers?

terrytangyuan avatar Feb 14 '24 01:02 terrytangyuan

For others without write access yet, you can still send PRs to the release branch to help resolve any conflicts, and then other approvers can review them.

terrytangyuan avatar Feb 14 '24 01:02 terrytangyuan

Since we have two additional approvers, I would suggest one of you (@agilgur5 @isubasinghe) try following the instructions in https://github.com/argoproj/argo-workflows/blob/main/docs/releasing.md and see what is missing so we can improve the docs. I don't think we need any separate documentation for this. WDYT? Any volunteers?

Sure, I can get to this on Friday unless @agilgur5 beats me to it.

isubasinghe avatar Feb 14 '24 08:02 isubasinghe

Hmm probably should note in the docs that there was an implicit alias in the previous releases.

It would be nice if the "true" / "false" options for the script itself were documented; I had to look into the script itself to figure out what it was doing.

If I am correct, the new commits for v3.3.5 should be based upon the HEAD of 3.3 instead of 3.3.4? I find that a bit confusing.

This release process seems like quite a bit of work, wonder if we can automate some of this effort.

isubasinghe avatar Feb 16 '24 01:02 isubasinghe

If I am correct, the new commits for v3.3.5 should be based upon the HEAD of 3.3 instead of 3.3.4?

Yes, see the top of the document: "Please make sure that all patch releases (e.g. v3.3.5) should be released from their associated minor release branches (e.g. release-3.3) to work well with our versioned website."

It would be nice if the "true" / "false" options for the script itself were documented; I had to look into the script itself to figure out what it was doing.

Well, the document already mentions "get a list of commits you may want to cherry-pick" and "to automatically cherry-pick" with two separate code blocks, but maybe explicitly calling out the flag would be helpful.

This release process seems like quite a bit of work, wonder if we can automate some of this effort.

Feel free to propose any improvements.

terrytangyuan avatar Feb 16 '24 01:02 terrytangyuan

Yes, see the top of the document: "Please make sure that all patch releases (e.g. v3.3.5) should be released from their associated minor release branches (e.g. release-3.3) to work well with our versioned website."

Yeah I saw this, but what I was trying to say is that I would like it to be even more explicit, just to remove any confusion. I guess I want to understand "to work well with our versioned website" in more depth as well.

but maybe explicitly calling out the flag would be helpful.

Yeah I think that might be nicer.

Feel free to propose any improvements.

I am having a look into it now. There seems to be some tooling around this issue; I will report back after I find out more.

isubasinghe avatar Feb 16 '24 01:02 isubasinghe

I guess I want to understand "to work well with our versioned website"

The release-3.4 and release-3.5 branches are visible on the versioned docs site. So if you update the branch, you'll update the docs site as well. Then you can just tag off the branch.
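The tag-to-branch rule quoted from the docs (e.g. v3.3.5 is released from release-3.3) can be sketched as a small helper. This is only an illustration: the function name `release_branch_for` is hypothetical and is not part of the release scripts.

```python
import re


def release_branch_for(tag: str) -> str:
    """Map a patch tag such as 'v3.3.5' to its minor release branch, 'release-3.3'."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, _patch = match.groups()
    # The patch number is dropped: every patch of a minor lives on the same branch.
    return f"release-{major}.{minor}"
```

So new commits for a patch release go onto the tip of the minor branch, and the tag is then created from that branch rather than from the previous patch tag.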

agilgur5 avatar Feb 16 '24 04:02 agilgur5

This release process seems like quite a bit of work, wonder if we can automate some of this effort.

Feel free to propose any improvements.

If there are merge conflicts, then ostensibly no, those can't be automated. The main part I had proposed back in the Slack thread was to have something similar to CD's cherry-pick bot so that we can cherry-pick things as they come in instead of in batches. That way the context of the PR remains when cherry-picked and merge conflicts can be fixed quicker and potentially with the author even.

Ideally the bot (or other automation) would try a clean cherry-pick, and if it works, cool, done. If not, it could either open a PR with the conflict or write a message on the original PR saying that there was a conflict and so manual resolution is needed.
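A minimal sketch of that "try clean, otherwise fall back to a human" flow, assuming it runs inside a clone of the repository. The names `try_backport` and `Runner` are hypothetical, and what happens after a failed pick (opening a PR or commenting on the original one) is deliberately left to the caller.

```python
import subprocess
from typing import Callable, Sequence

# A runner executes a git subcommand and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def git(args: Sequence[str]) -> int:
    """Default runner: execute git in the current working directory."""
    return subprocess.run(["git", *args]).returncode


def try_backport(sha: str, branch: str, run: Runner = git) -> str:
    """Attempt a clean cherry-pick of `sha` onto `branch`.

    Returns "picked" on success, or "needs-manual-resolution" after
    aborting a cherry-pick that did not apply cleanly.
    """
    if run(["checkout", branch]) != 0:
        raise RuntimeError(f"could not check out {branch}")
    # -x records the original commit hash in the new commit message.
    if run(["cherry-pick", "-x", sha]) == 0:
        return "picked"
    # Leave the branch untouched so a human can resolve the conflict later.
    run(["cherry-pick", "--abort"])
    return "needs-manual-resolution"
```

Passing the runner in as a parameter keeps the decision logic testable without touching a real repository.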

agilgur5 avatar Feb 16 '24 04:02 agilgur5

The main part I had proposed back in the Slack thread was to have something similar to CD's cherry-pick bot so that we can cherry-pick things as they come in instead of in batches.

Yeah this is exactly part of what I was thinking of as well.

isubasinghe avatar Feb 16 '24 04:02 isubasinghe

Here's the PR in CD that added the cherry-pick-bot: https://github.com/argoproj/argo-cd/pull/12591

Unfortunately that one creates a PR for every cherry-pick, so it creates a lot of duplicate PR noise. I would prefer to avoid that, especially as it makes the repo history much harder to search through with all the dupes.

agilgur5 avatar Feb 16 '24 04:02 agilgur5

We talked about this briefly in last week's Contributor Meeting, where I mentioned a replacement for the bot with a GH Action, e.g. https://github.com/vendoo/gha-cherry-pick. That will suffice for most of our needs, but the one problem with it is that it won't trigger CI after a cherry-pick since GH intentionally prevents actions from triggering each other to avoid infinite loops. We might be able to work around that by manually dispatching GHA Workflows after the cherry-pick, but then we'd have to manually list every GHA Workflow that needs to be run (since you can't just run all of them, as far as I know).

Thinking about it a bit more though, we probably aren't running CI/tests on each cherry-pick when doing it manually / locally anyway. Similarly, as I learned in that contributor meeting, CI cancels itself on a commit when another commit is made on the branch before it's done (i.e. it only runs one CI job at a time on a branch, on the latest commit). So this is perhaps already a better option than manual / local without cons (unlike the bot). Perhaps we just want to make sure we run CI and that it passes on the release branch before making the release / tagging off the branch?
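For illustration, the manual-dispatch workaround mentioned above could be sketched as a helper that builds one `gh workflow run` command per workflow. The workflow file names are hypothetical placeholders, and this assumes each listed workflow declares a `workflow_dispatch` trigger, without which it cannot be dispatched this way.

```python
from typing import List, Sequence

# Hypothetical file names; the real list would have to be maintained by hand.
WORKFLOWS = ("ci-build.yaml", "docs.yaml")


def dispatch_commands(branch: str, workflows: Sequence[str] = WORKFLOWS) -> List[List[str]]:
    """Build one `gh workflow run` invocation per workflow, targeting `branch`."""
    return [["gh", "workflow", "run", wf, "--ref", branch] for wf in workflows]
```

The hand-maintained list is exactly the drawback described above: a workflow that is added later but not listed here would silently not run.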

agilgur5 avatar Feb 25 '24 16:02 agilgur5

+1 on the use of automation to cherry-pick commits, instead of doing them all at the time of cutting a release.

csantanapr avatar Mar 04 '24 21:03 csantanapr

That will suffice for most of our needs, but the one problem with it is that it won't trigger CI after a cherry-pick since GH intentionally prevents actions from triggering each other to avoid infinite loops.

An interim workaround would just be for Approvers to manually cherry-pick fixes into the ongoing release branch (i.e. release-3.5, release-3.4) in their local clone and manually push them. This would solve the time delay, though it is fairly manual and not explicit.

agilgur5 avatar Mar 06 '24 03:03 agilgur5

For others without write access yet, you can still send PRs to the release branch to help resolve any conflicts, and then other approvers can review them.

Regarding those without write access, we did discuss this in the previous Contributor Meeting and I had previously proposed on Slack giving temporary write permissions to Member+ on release rotation. That proposal was rejected, and merging PRs for an entire release (i.e. with multiple commits) is unfortunately not necessarily possible due to automated DCO issues (cf. https://github.com/argoproj/argo-workflows/pull/12462#issuecomment-1877572919 and more recently #12711).

What we could do though is a "manual merge" -- a contributor writes up a PR to a release branch that fixes all conflicts, then, once approved, an Approver pulls those locally and pushes them to the release branch. That process for those without write access is actually fairly neat, all things considered. Before GH supported rebase merges and squash merges, I and others used to do this in other repos and would leave a comment on close as to how things were merged (random examples: https://github.com/agilgur5/react-signature-canvas/pull/3#issuecomment-303884454, https://github.com/django/django/pull/7762#issuecomment-269807584). This would effectively be a rebase merge as well.
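A rough sketch of the git commands such a "manual merge" might involve, written as a helper that only builds the command list. The helper name, the `pr-<number>` local branch name, and the `origin` remote are assumptions. It relies on GitHub exposing each PR at `pull/<number>/head` and uses fast-forward-only merges so the contributor's commits land unchanged.

```python
from typing import List


def manual_merge_commands(pr_number: int, release_branch: str, remote: str = "origin") -> List[List[str]]:
    """List the git commands an Approver could run to land a PR by hand."""
    local = f"pr-{pr_number}"  # hypothetical local branch name
    return [
        # Fetch the PR's commits into a local branch.
        ["git", "fetch", remote, f"pull/{pr_number}/head:{local}"],
        # Make sure the local release branch is up to date.
        ["git", "checkout", release_branch],
        ["git", "pull", "--ff-only", remote, release_branch],
        # Fast-forward only: fails if the PR is not based on the branch tip.
        ["git", "merge", "--ff-only", local],
        ["git", "push", remote, release_branch],
    ]
```

If the fast-forward fails, the PR branch would need to be rebased onto the release branch first, which matches the "effectively a rebase merge" description.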

agilgur5 avatar Mar 06 '24 03:03 agilgur5

@terrytangyuan I'm not sure this has been completed? We're most certainly still iterating on it. Which comes before even documenting it.

We discussed it in the April 2nd Contributor Meeting as well, where @caelan-io said Pipekit would work on some improvements.


As an update from my end, I have been trying to follow the interim workaround I mentioned above for release-3.5. Despite that, I am still behind; I have all the CVE/deps patches, but am about 20 fixes behind (although we did have a lot recently). And my current efforts suggest that a /cherry-pick action may not be that helpful, as I would say roughly half, if not more, of commits have merge conflicts when backported.

For deps, a chunk of that is due to selective backports, since one dep change can affect a few transitive deps and so they tend to be intertwined. I've tried going through the history and cherry-picking more deps to mitigate that with some success, but those aren't always CVE fixes. That has decreased since #12487 though; many of the conflicts were because Dependabot added an extraneous non-security update (that wasn't backported earlier due to its extraneous non-security nature).

My current thinking is that if backporting will remain substantially manual for the foreseeable future, we may want to have "release managers" for a given branch, or at least for a set of patches. I.e. they are on duty to watch for and actively cherry-pick things, and can decide on the patch release schedule as they choose as well -- one person who is primarily responsible. That is a common practice in other projects. Any automation would be tooling to help the release manager, but it cannot fully automate merge conflict resolution.

agilgur5 avatar Apr 18 '24 16:04 agilgur5

Sorry I meant to close another issue. Thanks for catching it.

terrytangyuan avatar Apr 18 '24 16:04 terrytangyuan

and can decide on the patch release schedule as they choose as well

Alternatively, we just cut a patch release on a rolling schedule, e.g. every 2 weeks. Anything that's already backported is released, anything else has to wait till the next at least. That could be automated. And that's potentially easier to follow / more straightforward / less confusing for users and release managers. Users could contribute backports if they want a specific one out earlier.
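As a small illustration of the rolling schedule, the release dates follow mechanically from a start date and an interval. The function name and the start date in the usage note are hypothetical; the 14-day default is the "every 2 weeks" example from the comment above.

```python
from datetime import date, timedelta
from typing import List


def rolling_release_dates(first: date, count: int, interval_days: int = 14) -> List[date]:
    """Return `count` release dates, starting at `first`, spaced `interval_days` apart."""
    return [first + timedelta(days=interval_days * i) for i in range(count)]
```

For example, `rolling_release_dates(date(2024, 5, 1), 3)` gives 2024-05-01, 2024-05-15 and 2024-05-29; whatever has been backported by each date ships, and everything else waits for the next one.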

agilgur5 avatar Apr 18 '24 16:04 agilgur5

Agreed, let's keep open. We want to figure out automation for this and have a few ideas.

Monthly release cadence is about the max we have capacity to manage right now given how manual the cherry picking and merge conflict resolution is (based on my understanding). Just a heads up on that front for expectations.

caelan-io avatar Apr 18 '24 22:04 caelan-io

Monthly release cadence is about the max

"Every 2 weeks" was for the case where we go with the "release manager" approach I mentioned above, i.e. someone who is just on backporting duty for a set period of time (on a rotating basis). The frequency of releases is then independent, as backports do not happen at a specific "release date" but continuously while main is being developed. Releasing itself is mostly automated; backporting is not.

agilgur5 avatar Apr 19 '24 02:04 agilgur5