Revisiting criteria and process for Alpha/Beta/GA
As part of the ongoing effort to improve consistency, quality and reliability, we have been adding new gates along the feature development path, such as writing a KEP, API review, production readiness review, conformance test requirements, etc. Because of this, we need to revisit our guidance for graduation criteria and perhaps the feature release process. The KEP template has a section for graduation criteria, and we have this set of guidance, but that's embedded in the doc about changing APIs and needs a broader focus.
As for the feature release process, we have the KEP process. A quick check, though, shows our current process is not working as well as I think we'd like. Either that, or there's a lot less going on than we think. Of about 133 KEPs, only 13 show as actually "implemented":
```
jbelamaric@jbelamaric:~/proj/gh/kubernetes/enhancements/keps$ grep -r status: sig-*/[0-9]*.md | wc -l
133
jbelamaric@jbelamaric:~/proj/gh/kubernetes/enhancements/keps$ grep -r status: sig-*/[0-9]*.md | cut -d: -f 3 | tr -d ' ' | sort | uniq -c
      4
      2 "False"
     79 implementable
     13 implemented
      1 implemented)
      1 proposal).
     32 provisional
      1 "True"
jbelamaric@jbelamaric:~/proj/gh/kubernetes/enhancements/keps$
```
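A few of those values carry trailing punctuation ("implemented)", "proposal).") and two are quoted booleans, so even these counts are noisy. A normalized rerun of the same audit could strip that before counting; the cleanup step below is an illustrative sketch, not part of the original query:

```bash
# Same audit, but normalize the status values first: drop quotes and
# trailing ")"/"." artifacts before counting (cleanup is hypothetical).
grep -rh '^status:' sig-*/[0-9]*.md \
  | cut -d: -f2 \
  | tr -d ' "' \
  | sed 's/[).]*$//' \
  | sort | uniq -c | sort -rn
```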
When KEPs are left in the "provisional" or "implementable" state even though at least some of them have certainly shipped in alpha or beta, it makes me wonder whether we are properly reviewing the KEP graduation criteria as we merge and promote things. Maybe we are, but it's not visible, and I worry that we're not being very effective at reviewing features and making sure that all the "i"s are dotted and "t"s are crossed.
So, the discussion I am trying to start is this:
- Do we need a clearer, more formal feature review process (presumably with the KEP as the primary artifact)?
- If so, who runs that? SIG-release? SIG-arch? Is it done by each SIG independently?
- Do we need clearer guidance for alpha/beta/GA that can then be used during that process to approve feature promotions?
- Who approves those?
As a starting point, some of the criteria for graduation at different levels could include:
- KEP review
- API review
- Completeness
- Production readiness review (and the criteria in there such as having playbooks, documented failure modes, etc.)
- Feature gating
- Level of unit and e2e test coverage
- Inclusion of conformance tests
- Inclusion of scalability tests
- Completeness and quality of documentation
- Upgrade / downgrade tests
- Real world usage statistics
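To make criteria like these enforceable rather than aspirational, they could be encoded as a lint that runs against a KEP when a promotion PR opens. A minimal sketch, assuming single-file markdown KEPs; the heading names and per-stage rules are illustrative, not an agreed standard:

```bash
#!/usr/bin/env bash
# Hypothetical KEP lint: fail if a KEP lacks the sections its target
# stage requires. Heading names and per-stage rules are illustrative.
kep="$1"     # path to the KEP markdown file
stage="$2"   # target stage: alpha | beta | stable
fail=0

require() {
  grep -qiE "^#+ *$1" "$kep" || { echo "MISSING section: $1"; fail=1; }
}

require "Graduation Criteria"
require "Test Plan"
case "$stage" in
  beta|stable) require "Production Readiness Review" ;;
esac
[ "$stage" = stable ] && require "Conformance Tests"

exit "$fail"
```

Usage would be something like `./kep-lint.sh sig-node/1234.md beta` (path hypothetical); CI could run it with the stage taken from the promotion PR.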
/sig architecture pm release
/priority important-soon
cc: @kubernetes/sig-pm @kubernetes/sig-release @kubernetes/release-team
/cc
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
/lifecycle frozen
New policy for avoiding perma-betas has merged: https://github.com/kubernetes/enhancements/pull/1266
Along these lines we need:
- something similar for alpha APIs
- something similar for feature gates, not just API versions, for both alpha and beta
- possibly some tools / specific criteria around what "sufficient feedback" is
We need to make sure that if we raise the bar for beta, we don't push things to be left in alpha, which is worse: the PRR survey found that a shocking 57% of orgs with 101-1000 nodes under management had enabled an alpha feature in production.
/remove-sig pm
/area enhancements
Adding discussion notes raised over various Release Team (1.19) and Enhancements subproject meetings since June 2020
Process improvements and clarification needs
General
- [ ] AI: Need a clear workflow so that we can build tooling around it
- Our guidance (heuristics, goals) for graduation criteria; clarity would help reinforce neutrality when people/companies say, “this must go out now”
- [ ] Q: KEP template has a section about it — revise?
- [ ] Q: How does the Enhancements team enforce the criteria? Doesn't have much leverage in preventing perma-betas, rush jobs
- [ ] Q: Are statuses official? Are the definitions binding?
- [ ] AI: Settle questions about KEP ownership
- [ ] Q: Who's the assignee of the KEP? Often unclear to the enhancements team at large (especially new folx) if the person responding has SIG authority to speak about KEP status/release goals
- The feature release process itself
- Creates an onboarding challenge for the Release Team: what they should track, for example
Alpha-related
- Especially freeform compared to Beta, GA. Easy to get something in, if code works
- SIGs often get the code merged and the rest falls by the wayside
- SIGs might not vet whether they will have bandwidth to review the PRs
- [ ] Q: Is unit testing alone sufficient for Alpha, or is e2e testing also required?
- [ ] Q: Companies sometimes want to push to Beta to increase engagement. How to manage this?
Beta-related
- The Release Team loses leverage once the code is merged; regaining that leverage to prevent perma-betas and merged-but-never-updated enhancements is going to be important for the maturity of the project (a rough detection sketch follows this list)
- Use of the keps-beta milestone
- [ ] Q: Is information on this clear, documented, and consistent?
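As a concrete shape for the detection sketch mentioned in the list above: flag KEPs still marked beta whose last milestone predates a cutoff. This assumes the newer kep.yaml metadata with `stage` and `latest-milestone` fields; the script and the cutoff are hypothetical, not existing tooling:

```bash
#!/usr/bin/env bash
# Hypothetical perma-beta detector: list KEPs whose stage is still beta
# but whose latest-milestone predates a cutoff release.
cutoff="${1:-v1.19}"

for f in keps/sig-*/*/kep.yaml; do
  stage=$(awk '/^stage:/ {print $2}' "$f")
  ms=$(awk '/^latest-milestone:/ {gsub(/"/, ""); print $2}' "$f")
  # sort -V picks the older of the two versions; flag if ms < cutoff
  older=$(printf '%s\n%s\n' "$ms" "$cutoff" | sort -V | head -1)
  if [ "$stage" = beta ] && [ -n "$ms" ] && [ "$older" = "$ms" ] && [ "$ms" != "$cutoff" ]; then
    echo "possible perma-beta: $f (last touched in $ms)"
  fi
done
```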
Hey, this issue is marked as Frozen, but has had no activity since 2020.
I'm going to close it if I don't hear from anyone in a week.