Revisiting criteria and process for Alpha/Beta/GA
As part of the ongoing effort to improve consistency, quality and reliability, we have been adding new gates along the feature development path, such as writing a KEP, API review, production readiness review, conformance test requirements, etc. Because of this, we need to revisit our guidance for graduation criteria and perhaps the feature release process. The KEP template has a section for graduation criteria, and we have this set of guidance, but that's embedded in the doc about changing APIs and needs a broader focus.
As for the feature release process, we have the KEP process. A quick check, though, shows our current process is not working as well as I think we'd like. Either that, or there's a lot less going on than we think. Of about 133 KEPs, only 13 show as actually "implemented":
```
jbelamaric@jbelamaric:~/proj/gh/kubernetes/enhancements/keps$ grep -r status: sig-*/[0-9]*.md | wc -l
133
jbelamaric@jbelamaric:~/proj/gh/kubernetes/enhancements/keps$ grep -r status: sig-*/[0-9]*.md | cut -d: -f 3 | tr -d ' ' | sort | uniq -c
      4
      2 "False"
     79 implementable
     13 implemented
      1 implemented)
      1 proposal).
     32 provisional
      1 "True"
jbelamaric@jbelamaric:~/proj/gh/kubernetes/enhancements/keps$
```
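A few of those values carry trailing punctuation ("implemented)", "proposal).") and two are quoted booleans, so even these counts are noisy. A normalized rerun of the same audit could strip that before counting; the cleanup step below is an illustrative sketch, not part of the original query:

```bash
# Same audit, but normalize the status values first: drop quotes and
# trailing ")"/"." artifacts before counting (cleanup is hypothetical).
grep -rh '^status:' sig-*/[0-9]*.md \
  | cut -d: -f2 \
  | tr -d ' "' \
  | sed 's/[).]*$//' \
  | sort | uniq -c | sort -rn
```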
When KEPs are left in the "provisional" or "implementable" state even though at least some of them have certainly shipped in alpha or beta, it makes me wonder whether we are properly reviewing the KEP graduation criteria as we merge and promote things. Maybe we are, but it's not visible, and I worry that we're not being very effective at reviewing features and making sure that all the "i"s are dotted and "t"s are crossed.
So, the discussion I am trying to start is this:
- Do we need a clearer, more formal feature review process (presumably with the KEP as the primary artifact)?
- If so, who runs that? SIG-release? SIG-arch? Is it done by each SIG independently?
- Do we need clearer guidance for alpha/beta/GA that can then be used during that process to approve feature promotions?
- Who approves those?
As a starting point, some of the criteria for graduation at different levels could include:
- KEP review
- API review
- Completeness
- Production readiness review (and the criteria in there such as having playbooks, documented failure modes, etc.)
- Feature gating
- Level of unit and e2e test coverage
- Inclusion of conformance tests
- Inclusion of scalability tests
- Completeness and quality of documentation
- Upgrade / downgrade tests
- Real world usage statistics
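To make criteria like these enforceable rather than aspirational, they could be encoded as a lint that runs against a KEP when a promotion PR opens. A minimal sketch, assuming single-file markdown KEPs; the heading names and per-stage rules are illustrative, not an agreed standard:

```bash
#!/usr/bin/env bash
# Hypothetical KEP lint: fail if a KEP lacks the sections its target
# stage requires. Heading names and per-stage rules are illustrative.
kep="$1"     # path to the KEP markdown file
stage="$2"   # target stage: alpha | beta | stable
fail=0

require() {
  grep -qiE "^#+ *$1" "$kep" || { echo "MISSING section: $1"; fail=1; }
}

require "Graduation Criteria"
require "Test Plan"
case "$stage" in
  beta|stable) require "Production Readiness Review" ;;
esac
[ "$stage" = stable ] && require "Conformance Tests"

exit "$fail"
```

Usage would be something like `./kep-lint.sh sig-node/1234.md beta` (path hypothetical); CI could run it with the stage taken from the promotion PR.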
/sig architecture pm release
/priority important-soon
cc: @kubernetes/sig-pm @kubernetes/sig-release @kubernetes/release-team
/cc
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
/lifecycle frozen
New policy for avoiding perma-betas has merged: https://github.com/kubernetes/enhancements/pull/1266
Along these lines we need:
- something similar for alpha APIs
- something similar for feature gates, not just API versions, for both alpha and beta
- possibly some tools / specific criteria around what "sufficient feedback" is
We need to make sure that if we raise the bar for beta, we don't push things to be left in alpha, which is worse: the PRR survey found that a shocking 57% of orgs with 101-1000 nodes under management had enabled an alpha feature in production.
/remove-sig pm
/area enhancements
Adding discussion notes raised over various Release Team (1.19) and Enhancements subproject meetings since June 2020
Process improvements and clarification needs
General
- [ ] AI: Need a clear workflow so that we can build tooling around it
- Our guidance (heuristics, goals) for graduation criteria; clarity would help reinforce neutrality when people/companies say, “this must go out now”
- [ ] Q: KEP template has a section about it — revise?
- [ ] Q: How does the Enhancements team enforce the criteria? Doesn't have much leverage in preventing perma-betas, rush jobs
- [ ] Q: Are statuses official? Are the definitions binding?
- [ ] AI: Settle questions about KEP ownership
- [ ] Q: Who's the assignee of the KEP? Often unclear to the enhancements team at large (especially new folx) if the person responding has SIG authority to speak about KEP status/release goals
- The feature release process itself
- Creates an onboarding challenge for the Release Team: what they should track, for example
Alpha-related
- Especially freeform compared to Beta, GA. Easy to get something in, if code works
- SIGs often get the code merged and the rest falls by the wayside
- SIGs might not vet whether they will have bandwidth to review the PRs
- [ ] Q: Is unit testing alone sufficient for Alpha, or is e2e testing also required?
- [ ] Q: Companies sometimes want to push to Beta to increase engagement. How to manage this?
Beta-related
- The Release Team loses leverage once the code is merged; regaining that leverage to prevent perma-betas and merged-but-never-updated enhancements is going to be important for the maturity of the project (a rough detection sketch follows this list)
- Use of the keps-beta milestone
- [ ] Q: Is information on this clear, documented, and consistent?
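As a concrete shape for the detection sketch mentioned in the list above: flag KEPs still marked beta whose last milestone predates a cutoff. This assumes the newer kep.yaml metadata with `stage` and `latest-milestone` fields; the script and the cutoff are hypothetical, not existing tooling:

```bash
#!/usr/bin/env bash
# Hypothetical perma-beta detector: list KEPs whose stage is still beta
# but whose latest-milestone predates a cutoff release.
cutoff="${1:-v1.19}"

for f in keps/sig-*/*/kep.yaml; do
  stage=$(awk '/^stage:/ {print $2}' "$f")
  ms=$(awk '/^latest-milestone:/ {gsub(/"/, ""); print $2}' "$f")
  # sort -V picks the older of the two versions; flag if ms < cutoff
  older=$(printf '%s\n%s\n' "$ms" "$cutoff" | sort -V | head -1)
  if [ "$stage" = beta ] && [ -n "$ms" ] && [ "$older" = "$ms" ] && [ "$ms" != "$cutoff" ]; then
    echo "possible perma-beta: $f (last touched in $ms)"
  fi
done
```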
Hey, this issue is marked as Frozen, but has had no activity since 2020.
I'm going to close it if I don't hear from anyone in a week.