
Document the new release process for Rustup

Open jdno opened this issue 1 year ago • 6 comments

The Rustup release process has historically been manual: it involved copying files from S3 to a local machine and back to S3, which introduced a high risk of human error. When modifications to the existing release script became necessary, the decision was made to automate the release process (see #3819 for details).

The documentation in the dev-guide has been updated to cover the new release process, which is fully automated to produce beta releases using GitHub Actions and the promote-release tooling.

jdno avatar May 28 '24 11:05 jdno

Here's a rendered version of the file: https://github.com/jdno/rust-rustup/blob/new-release-process/doc/dev-guide/src/release-process.md

jdno avatar May 28 '24 11:05 jdno

Thanks for working on this!

djc avatar Jun 11 '24 09:06 djc

@jdno what's your plan for driving the new release process to completion? We're starting to think about how/when we want to publish our next release so it would be good to understand how much of a dependency we have on you and what that means for the schedule.

djc avatar Jun 21 '24 12:06 djc

I've started working on it, but I can't give a good estimate on how long it'll take. My hope is to get most of the implementation work done in the next two weeks.

I don't want to be a blocker, though, so my work is strictly additive right now. The current process will continue to work until we have a full replacement. So regarding dependencies or the schedule, there shouldn't be any.

jdno avatar Jun 21 '24 13:06 jdno

I'd strongly recommend adding a rollback process for when a deployment causes failures like what happened in 1.28.0.

I'd like to echo and amplify the comments in https://github.com/rust-lang/rustup/issues/4211 about rolling back, to make sure that there's more than one voice calling for this practice. The 1.28.0 release highlighted that Rustup is a load-bearing piece of software. A failed deployment that causes this many downstream failures should have resulted in a rollback within about an hour, not a roll-forward a day later.

I base this opinion on having participated (both as an owner and as someone affected) in many post-mortems at a large internet company. The consensus engineering best practice is to roll back first, and think about how to solve the greater problem second. It's rare that this advice should be ignored, and no good rationale has been expressed here for why it should have been ignored in this case. The users that migrated to using the new rustup changes would generally have been a very small number compared to those who were affected by this, and those users were actively aware of the changes. The impacted users were generally not made aware until this caused failures in their systems.

I want to explicitly state that this comment is not intended to throw shade on anyone involved in 1.28 at all. It is only meant to constructively improve the situation going forward.

joshka avatar Mar 06 '25 21:03 joshka

I'd strongly recommend adding a rollback process for when a deployment causes failures like what happened in 1.28.0.

@joshka I second this. In fact, I have already mentioned it on Zulip:

by the way, it’s not urgent, but can we have /archive/dist on our release server redirect to /archive or something? Anyway the goal is to trick rustup into thinking the archive is an actual release server, to provide a way for arbitrary rustup downgrade/upgrade/pinning without having to adapt the code on our side

I added that rustup could be modified to make use of the /archive path; however, this modification itself would also need a new release to work.
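To make the redirect idea concrete, here is a minimal sketch of the path rewrite it implies. It is an illustration only, not anything that exists on the release server: the function name `rewrite_archive_path`, the assumed layouts (`/archive/<version>/dist/<rest>` as requested, `/archive/<version>/<rest>` as stored), and the version number in the example are all hypothetical and would need to be checked against the real bucket layout.

```rust
/// Hypothetical helper: rewrite a request path so that an archive directory
/// can be addressed as if it were a release server root.
///
/// Assumed layouts (assumptions, not confirmed in this thread):
///   requested: /archive/<version>/dist/<rest>
///   stored:    /archive/<version>/<rest>
fn rewrite_archive_path(path: &str) -> Option<String> {
    // Only paths under /archive/ are candidates for a redirect.
    let rest = path.strip_prefix("/archive/")?;
    // Split off the version directory from whatever follows it.
    let (version, tail) = rest.split_once('/')?;
    // Redirect only if the next segment is the `dist` directory.
    let tail = tail.strip_prefix("dist/")?;
    if version.is_empty() || tail.is_empty() {
        return None;
    }
    Some(format!("/archive/{version}/{tail}"))
}

fn main() {
    // Illustrative version and target triple only.
    let requested = "/archive/1.27.1/dist/x86_64-unknown-linux-gnu/rustup-init";
    match rewrite_archive_path(requested) {
        Some(target) => println!("redirect {requested} -> {target}"),
        None => println!("no redirect for {requested}"),
    }
}
```

With a rule like this applied server-side, an existing rustup could in principle be pointed at an archived version without any client-side change, which is the property the Zulip message was after.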

Let's leave this thread for the release-related work: I've made https://github.com/rust-lang/rustup/issues/4240 for the rustup-related part of this problem.

rami3l avatar Mar 07 '25 09:03 rami3l