Weekly release schedule

Open mrocklin opened this issue 4 years ago • 55 comments

Currently we release dask/dask and dask/distributed every two weeks on Friday by loose convention (we're happy skipping or adding a release based on need and availability).

I'd like for us to consider increasing this frequency to weekly, still on Fridays. I've chatted with @jrbourbeau (who seems to be doing most of the releasing these days) and he seems game.

I thought I'd list some concerns that might arise that we should be aware of:

  1. This is an increased burden on maintainers
  2. Changes will have less time to marinate in master in order to find issues with them

I'm personally ok with these. I thought I'd bring it up in case other folks had thoughts or concerns.

cc @TomAugspurger @jakirkham @quasiben

mrocklin avatar Aug 07 '20 16:08 mrocklin

Whilst there is certainly some cost to having to explicitly decide to release, it also acts as a safeguard in times when we know there is instability.

Can you enumerate the advantages of a strict weekly cadence?

martindurant avatar Aug 07 '20 16:08 martindurant

I'm actually pretty happy with the 2 week cadence. Moving to 1 week makes it harder to do more complex changes.

jakirkham avatar Aug 07 '20 16:08 jakirkham

Whilst there is certainly some cost to having to explicitly decide to release, it also acts as a safeguard in times when we know there is instability.

Can you enumerate the advantages of a strict weekly cadence?

Ah, sorry, I didn't mean to imply strict at all. My intention was to suggest that we change the convention from "let's release roughly every couple of weeks on Friday" to "let's release roughly every week on Friday".

mrocklin avatar Aug 07 '20 17:08 mrocklin

I'm actually pretty happy with the 2 week cadence. Moving to 1 week makes it harder to do more complex changes.

I'm not sure I follow. Presumably complex changes would live in a PR until they're ready to merge, and then they would be released the following Friday. I do agree that we lose out on the extra marinade time, which is valuable for complex changes, but in those cases (which I think show up every few months) we can skip a week as needed.

mrocklin avatar Aug 07 '20 17:08 mrocklin

That can complicate testing and development when multiple people are involved and we need feedback from users. In particular I'm thinking about scheduler improvements, which will involve changes to Dask and Distributed.

jakirkham avatar Aug 07 '20 17:08 jakirkham

That can complicate testing and development when multiple people are involved and we need feedback from users. In particular I'm thinking about scheduler improvements, which will involve changes to Dask and Distributed.

I'm still not sure I understand. Are there scheduler improvements recently where this has been an issue?

And to be clear, I'm not at all proposing anything strict. Certainly if there was inconsistency then we wouldn't release. We're still able to exercise judgment when deciding whether or not to release.

mrocklin avatar Aug 07 '20 17:08 mrocklin

Can you enumerate the advantages of a strict weekly cadence?

To be clear, I'm in no way advocating for a strict cadence of any sort, but to provide more transparency I've written up my current situation in this issue:

https://github.com/dask/community/issues/85

As I work on Coiled, a managed Dask service, I'm trying to put as much into Dask as possible. Longer release cadences make co-development more awkward.

However, I also think that Dask has historically benefitted from having a shorter release cycle. People see their results in a released version sooner, and things feel generally more responsive. We used to release weekly for a while and it worked out pretty well. I think that we shifted to bi-weekly around the time that I stopped managing releases myself.

mrocklin avatar Aug 07 '20 17:08 mrocklin

I think there has been a lot of value gained from having a predictable release schedule from Dask (particularly as the project and community have matured). This has certainly made things easier engaging with stakeholders in and around RAPIDS. Would encourage that we keep some kind of schedule going forward (as opposed to going more ad-hoc or requesting last minute releases, which puts a lot of pressure on folks).

From the development side, I would hope that things like nightlies ( https://github.com/dask/community/issues/76 ) fill the gap between the regular release cycle and those needing the latest changes.

Maybe it would be beneficial to set aside some time next week to get a better handle on what the use case is here and how we can better fill it?

jakirkham avatar Aug 07 '20 18:08 jakirkham

I think there has been a lot of value gained from having a predictable release schedule from Dask (particularly as the project and community have matured).

Agreed. I'm suggesting that we do what we're doing now, but with the convention moving from two weeks to one week. I think that we're in agreement here.

Would encourage that we keep some kind of schedule going forward (as opposed to going more ad-hoc or requesting last minute releases, which puts a lot of pressure on folks).

Yup. Same.

Maybe it would be beneficial to set aside some time next week to get a better handle on what the use case is here and how we can better fill it?

I'm happy to chat about this, but I think that getting these thoughts down on GitHub is also useful. If you want to chat now for a few minutes I'm around. I'd be happy to meet up in whereby.com/dask-dev

mrocklin avatar Aug 07 '20 18:08 mrocklin

Some general thoughts.

This is an increased burden on maintainers

This process can be automated a little further, as it is in many other projects. Currently releases involve pushing to PyPI manually from a clean git repo. We could add Travis or GHA config to do this automatically when tags are pushed.

I've also had a pleasant experience with release drafter in other projects. It maintains a draft release for you with the tag automatically populated based on some rule (in Dask's case it would just be a minor bump most of the time). It also populates rich release notes from the PRs that have been merged since the last tag with links, author names and categories if PRs are tagged.

Changes will have less time to marinate in master in order to find issues with them

I'm always curious about how many folks are using master and updating regularly. We assume that if we merge a bad PR someone will raise a bug to alert us. But how confident are we about that?


People see their results in a released version sooner, and things feel generally more responsive.

I agree with this from a contributor/maintainer perspective. However in the past as a user in a restricted environment this has been very frustrating.

Dask releases frequently and only maintains a master branch. One of the results of this is that bug fix releases are rare because we can be confident that they will be released in the next couple of weeks anyway along with everything else. We only do a bug fix release if something really broken gets tagged.

In my experience of production workflows, dependencies are pinned to minor versions and updated infrequently. This is to help guarantee the stability and reproducibility of those workflows. An example might be a production system using a curated conda environment which gets major and minor updates every 6 months and only bug fixes in between.

In those environments release notes may also be checked by a human, and frequent releases can result in patching fatigue (I have been that human).

The result of this is that a bug can be introduced one week and fixed the next, but folks with less flexible environments may be stuck with that bug for a long time.
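To make the pinning scenario concrete, here is a minimal sketch of a "bug fixes only" minor-version pin. The helper names and the version numbers in the usage note are hypothetical and purely illustrative; a real environment would rely on its package manager's own pin syntax rather than a hand-written check like this.

```python
def parse_version(version: str) -> tuple:
    """Parse a plain 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def allowed_by_minor_pin(pinned: str, candidate: str) -> bool:
    """Return True if `candidate` keeps the pinned major.minor and is not older.

    This models an environment that accepts bug fix releases only.
    """
    pin, cand = parse_version(pinned), parse_version(candidate)
    return cand[:2] == pin[:2] and cand >= pin
```

With a pin at a hypothetical `2.22.0`, a `2.22.1` bug fix release would be picked up, but a fix that only ships in the next minor release (`2.23.0`) would not, which is how such an environment can stay stuck with a bug that was fixed upstream a week later.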

From past survey results we've seen that folks are pretty happy with Dask's stability, so I don't think this happens often or affects a large number of people. I think the stability is down to the CI and testing that we have which catches many problems before they are merged. But I think the only way we could improve that metric would be to be more thoughtful with our adherence to SemVer, backward compatibility and bug fix releases.

The counterpoint to that would be to move to CalVer. Moving to a weekly(ish) release cycle further reduces the usefulness of SemVer. Arguably it has little value today on our fortnightly cycle compared to projects with longer cycles and interim bugfix releases. However that switch would abandon the minority user group who care about pinning and bug fix releases.

jacobtomlinson avatar Aug 10 '20 14:08 jacobtomlinson

This is an increased burden on maintainers

I agree that this is true in general, but in this case the people who are doing the releases (James and myself) are asking for it, so I don't think that this is an issue in the case of Dask.

Regarding automation

We've talked about automation in the past. Releasing Dask isn't really that onerous, and I don't think that automated tools are likely to handle it well. When I release, about 80% of the time is spent writing up the changelog. But we're not really clean enough about commit messages / PR titles that this can be automated yet. Most of the time here is spent in applying human judgement. It's also only about a 40-minute job, and one that I find an educational way of keeping up with the project.

One of the results of this is that bug fix releases are rare because we can be confident that they will be released in the next couple of weeks anyway along with everything else. We only do a bug fix release if something really broken gets tagged.

Issuing releases is cheap. If there is a significant issue we can always release. I think that we see this rarely because significant issues are themselves relatively rare, at least in my experience with the project.

In my experience of production workflows, dependencies are pinned to minor versions and updated infrequently. This is to help guarantee the stability and reproducibility of those workflows. An example might be a production system using a curated conda environment which gets major and minor updates every 6 months and only bug fixes in between.

This is my experience as well, but typically I find that people don't just pick a release, they make sure that things work for them, and then cement that in. In that context it seems to me that frequent releases are advantageous.

In those environments release notes may also be checked by a human, and frequent releases can result in patching fatigue (I have been that human).

Totally understood. I want to make it completely clear here that I'm not asking any of you to do any work. I'm asking you if you object to me doing more work.

mrocklin avatar Aug 11 '20 19:08 mrocklin

My general sense of this conversation is that folks are engaging in a discussion about releasing procedures generally while I'm trying to talk about Dask in particular. In general I agree with all of your points about the potential costs and challenges of releasing. However, pragmatically, I don't think that those concerns apply in this specific case. I think that this is especially true given ...

  1. I or James tend to do the work here
  2. We haven't proposed any strict policies, or any strict requirements
  3. This is the way that we've done things for a long while without any issue arising

So if possible I'd love to move the conversation about releasing generally elsewhere, and ask a more pragmatic question: "Does anyone have specific qualms about performing more frequent releases when everything seems like it's fine to release, assuming that they are not expected to do any work?"

mrocklin avatar Aug 11 '20 19:08 mrocklin

You could have phrased this as "anyone objecting to a release this week?" a couple of times, followed by "hey, we've released weekly for the past three weeks, has this had any negative effect?" On that basis, I would say go ahead. If we run out of willing releasers, we can always reduce the frequency, since this isn't a public commitment.

martindurant avatar Aug 11 '20 19:08 martindurant

You could have phrased this as "anyone objecting to a release this week?" a couple of times, followed by "hey, we've released weekly for the past three weeks, has this had any negative effect?"

Good thought. I tried that a bit here but didn't get any response: https://github.com/dask/community/issues/85 . I probably should have assumed lazy consensus and gone ahead.

On that basis, I would say go ahead. If we run out of willing releasers, we can always reduce the frequency, since this isn't a public commitment.

That's good to hear. Thank you for the comment @martindurant

mrocklin avatar Aug 11 '20 19:08 mrocklin

I don't see any clear benefits to moving to weekly, and this would make things more difficult for RAPIDS in particular.

Also, given our cadence of testing, some bugs take 4-5 business days to surface, as we have a few large-scale Dask tests that require 100+ GPUs. Releasing weekly gives us less time to catch these errors.

datametrician avatar Aug 12 '20 03:08 datametrician

Thanks for chiming in Josh. Can I ask you to expand on why this would make things harder for RAPIDS?

mrocklin avatar Aug 12 '20 03:08 mrocklin

I think what Josh tried to point out is that it's not uncommon for us to find bugs from doing scale testing / benchmarking that get introduced in the middle of a release cycle. I think the pushback from our side is that the ~2 week cycle has generally been a sweet spot for us to find bugs introduced on master and either fix them or sound the alarm before a release goes out. With a 1 week release cycle, we're not confident we could react in time to push a fix or sound the alarm. If a release goes out and RAPIDS is in a broken state with regards to Dask, we'd need to patch our Dask dependency pinning to not allow that released version so developers / users aren't left in a broken state. This is a somewhat non-trivial amount of maintenance burden for us to guarantee that the experience for RAPIDS + Dask is as smooth as possible.
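As a rough illustration of what "patching the pinning to not allow that released version" amounts to, the sketch below mimics a requirement of the form `dask>=MINIMUM,!=BROKEN` for plain `x.y.z` versions. The function names and the version numbers in the usage note are hypothetical; this is not how RAPIDS actually expresses its pins, only a model of the exclusion logic.

```python
def parse_version(version: str) -> tuple:
    """Parse a plain 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(candidate: str, minimum: str, excluded: frozenset = frozenset()) -> bool:
    """Return True if `candidate` is at least `minimum` and not explicitly excluded.

    Models a requirement like `pkg>=minimum,!=bad_version` for x.y.z versions.
    """
    if candidate in excluded:
        return False
    return parse_version(candidate) >= parse_version(minimum)
```

For example, `satisfies("2.23.0", "2.22.0", frozenset({"2.23.0"}))` is `False`, while the same call for a later `2.24.0` is `True`. Every broken release that goes out means another downstream change of this kind, which is the maintenance burden described above.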

kkraus14 avatar Aug 12 '20 04:08 kkraus14

Do you often do this today? For example if we merged something in today and then released on Friday (which is our current convention) then you also won't have much time to respond. This lack of response time has been common historically (we don't have code freeze or anything like that).

mrocklin avatar Aug 12 '20 04:08 mrocklin

We've talked about automation in the past.

I agree things are not onerous. My only point was that I personally find the twine step a little scary as your local checkout has to be clean. In my experience this step can be error prone. I always prefer to have CI perform this step once the tag has been created as you can be confident that checkout will be clean.
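A minimal sketch of the kind of guard that could run before the upload step, whether locally or in CI. It relies only on standard `git` commands (`git status --porcelain` and `git describe --exact-match --tags`); the function names, and the idea of wiring this into Dask's release process, are assumptions for illustration and not the project's actual tooling.

```python
import subprocess


def is_clean_porcelain(porcelain_output: str) -> bool:
    """Return True if `git status --porcelain` output lists no changes.

    Any non-empty line means a modified, staged, or untracked file.
    """
    return not any(line.strip() for line in porcelain_output.splitlines())


def checkout_is_releasable(repo_dir: str = ".") -> bool:
    """Hypothetical pre-upload check: clean working tree and HEAD exactly on a tag."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    if not is_clean_porcelain(status.stdout):
        return False
    # `git describe --exact-match --tags` exits non-zero when HEAD is not tagged.
    tagged = subprocess.run(
        ["git", "describe", "--exact-match", "--tags"],
        cwd=repo_dir, capture_output=True, text=True,
    )
    return tagged.returncode == 0
```

A release script could refuse to build and upload unless `checkout_is_releasable()` returns `True`. A CI job triggered by a tag push gets the same guarantee for free, since it starts from a fresh clone.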

Issuing releases is cheap.

As maintainers yes. But for some organisations changing your pinned version is not cheap.

I want to make it completely clear here that I'm not asking any of you to do any work. I'm asking you if you object to me doing more work.

Don't worry that isn't how I took things. My point was that if you take on more work it will increase work for a portion of our users. Their increase will likely be much larger than yours.

As @datametrician and @kkraus14 mentioned this will also likely increase work for groups like RAPIDS. And again that increase will likely be larger than yours.


My comments above only apply to a small portion of our users. Folks who say in our surveys that they struggle to update to new Python versions, Dask versions or do not find things stable today. The point I was trying to make is that increasing frequency will likely make things worse for that minority group. I also empathise a lot with that group because I was once in it.

However, I am also aware that considering minorities of users can increase the maintenance burden significantly. Browser support is a good example of this. It is impractical to support all browsers; there are a lot of them. In past projects I've worked on we've made the decision to support all browsers which have a greater than 5% market share. The result of this is that we close issues raised by Opera, Brave and Vivaldi users with an apology. I think making a decision to not support a minority user group is fine, provided it is discussed and documented.


Given that some folks will likely see an increase in work as a result of increasing the frequency it may be good to try and quantify the folks who will benefit from this change in order to make a decision on this.

From my reading of this thread that would be the following.

  • Contributors. When folks raise a PR, see it merged quickly and then released quickly they feel good. When contributors feel good they contribute more which benefits the project.
  • Downstream projects. Projects who build on Dask and want to iterate quickly (like Coiled) will benefit from making upstream changes quickly and getting access to them.

Are there more?

jacobtomlinson avatar Aug 12 '20 09:08 jacobtomlinson

What I see being experienced by organizations who are critically dependent on Dask is that there is some tension around what to do when there is a strong desire to release a new version. This happened recently at NVIDIA, and as @jakirkham mentioned, he opened #76 and we think that might be a good way forward. However, the solution to NVIDIA's particular problem that week could have been an unexpected release.

Rather than consider a loose weekly release schedule, what if we lower the barrier to asking for an unexpected release? We could still keep the two-week release schedule, but if an org tied to a maintainer asks for a release, the assumption is yes for now.

quasiben avatar Aug 12 '20 11:08 quasiben

if an org tied to a maintainer asks for release, the assumption is yes for now

I think we should have a bigger pool of active releasers for this to be practical

martindurant avatar Aug 12 '20 12:08 martindurant

Folks who say in our surveys that they struggle to update to new Python versions, Dask versions or do not find things stable today. The point I was trying to make is that increasing frequency will likely make things worse for that minority group. I also empathise a lot with that group because I was once in it.

Could we get survey results on stability and upgrade-ability so we can use more people's feedback?

datametrician avatar Aug 12 '20 13:08 datametrician

Do you often do this today? For example if we merged something in today and then released on Friday (which is our current convention) then you also won't have much time to respond. This lack of response time has been common historically (we don't have code freeze or anything like that).

Great so we change the probability of not catching a merge issue from 30% to 60% (assuming 3 days before a release is too soon to catch something).
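The comment does not say where these figures come from. One reading that reproduces them, and it is only an assumption, is that merges are spread evenly over the business days of a release cycle and that anything merged in the last 3 business days before a release is too recent to be caught. The function name below is hypothetical.

```python
def fraction_merged_too_late(cycle_business_days: int, detection_days: int) -> float:
    """Share of merges landing within `detection_days` of the release.

    Assumes merges are spread evenly over the business days of one cycle.
    """
    return min(detection_days, cycle_business_days) / cycle_business_days


print(fraction_merged_too_late(10, 3))  # two-week cycle (10 business days): 0.3
print(fraction_merged_too_late(5, 3))   # one-week cycle (5 business days): 0.6
```

Under that model the share of merges that reach a release without enough soak time doubles from 30% to 60%. A different merge distribution or detection window would give different numbers.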

I personally feel a code freeze for 3 days would be amazing before releasing a new version. Even pushing the release out a week further, to every 3 weeks, would help. Both of those help production systems stay up to date and catch issues. Weekly releases work against that...

Like Ben, if a maintainer org needs a release, I'm ok with a short release. RAPIDS does "hotfixes" today (rarely, but on occasion). Numba, XGBoost, and CuPy also all do RCs and code freezes. I feel like instead of going opposite the community we could harmonize a bit more.

datametrician avatar Aug 12 '20 13:08 datametrician

a code freeze for 3 days

You would, of course, need a group of testers that are running the RC/freeze, otherwise this doesn't help. (pointing out the obvious, sorry)

martindurant avatar Aug 12 '20 13:08 martindurant

a code freeze for 3 days

You would, of course, need a group of testers that are running the RC/freeze, otherwise this doesn't help. (pointing out the obvious, sorry)

RAPIDS is willing and ready for the parts of Dask we use.

datametrician avatar Aug 12 '20 13:08 datametrician

Could we get survey results on stability and upgrade-ability so we can use more people's feedback?

The survey is still open but closing soon. I think @TomAugspurger offered to compile the results and write a blog post.

Preliminary results for your questions, @datametrician:

[two images of survey results, not preserved]

One is easy, four is hard.

jacobtomlinson avatar Aug 12 '20 13:08 jacobtomlinson

I agree things are not onerous. My only point was that I personally find the twine step a little scary as your local checkout has to be clean.

I agree with you in general, but in practice has this ever occurred with Dask? I think that our practices are pretty clean.

In my experience this step can be error prone. I always prefer to have CI perform this step once the tag has been created as you can be confident that checkout will be clean.

Same. Agree in general. I don't think that there is any evidence of this in Dask.

Rather than consider a loose weekly release schedule, what if we lower the barrier to asking for an unexpected release?

This seems sensible to me. Although to be clear it's also our current policy. Releasing ad-hoc if things are fine has been the standard for a long while. I was mostly raising these issues as a courtesy, which is partially why I'm surprised by all of the pushback.

Great so we change the probability of not catching a merge issue from 30% to 60% (assuming 3 days before a release is too soon to catch something).

Where are these numbers coming from?

I personally feel a code freeze for 3 days would be amazing before releasing a new version.

After seeing the amount of stress that goes into the RAPIDS release process I'm currently -1 on a formal code freeze.

mrocklin avatar Aug 12 '20 13:08 mrocklin

In general I agree with everything that everyone is saying. However in practice I don't think we ever see these issues with Dask releases. We're pretty good about exercising human judgement, only releasing when things are clean, holding off on merging disruptive PRs just before a release, and so on.

Can I ask folks here to bring up a few concrete cases where a frequent release cycle hurt Dask users?

mrocklin avatar Aug 12 '20 13:08 mrocklin

Given that ~20% of users do not think Dask is stable enough and ~10% of users struggle to upgrade their package versions I suggest that these are significant enough groups that we should try and improve stability.

I think a move to weekly releases is a step away from stability. Adding a short code freeze and extending to a release every three weeks would move us towards stability.

I also agree with @quasiben that core maintainers should be able to request interim releases for their own benefit.

jacobtomlinson avatar Aug 12 '20 13:08 jacobtomlinson

Given that ~20% of users do not think Dask is stable enough and ~10% of users struggle to upgrade their package versions I suggest that these are significant enough groups that we should try and improve things here.

My guess is that Dask not being stable enough has a lot more to do with bugs, and less to do with frequent releases. My guess is also that users struggling to upgrade their package versions has more to do with "we're still on a pandas version from 2016" than with "please don't release frequently".

mrocklin avatar Aug 12 '20 13:08 mrocklin