cluster-api-provider-azure
                        Graduate AzureManagedCluster out of experimental
/kind feature
Describe the solution you'd like:
AzureManagedCluster (AKS) was introduced in CAPZ as an experiment in https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/482. Since then, a lot of progress has been made in adding features and making the controllers and tests more stable. Part of the reason for building this feature as experimental in the first place was that it was a proof of concept: we wanted to see whether there would be any interest at all in managing AKS clusters through Cluster API. It's now abundantly clear that users want this. Through conversations during office hours, pull requests, and issues, we've learned that an increasing number of folks intend to use CAPZ to manage the lifecycle of their AKS clusters. Since we have no intention of deprecating the feature at this point, let's make that message clear to our users: move the AzureManagedCluster CRD to stable, remove the feature flag, and enable it by default in the Azure provider.
There are still a number of open issues with regard to managed clusters; they can be found here. Some of these may need to be completed before we can call AKS stable; however, it should not be a goal to complete every single issue before graduating AzureManagedCluster. The main goal of graduation is to remove the experimental policy so folks have confidence using it in production, without risk of the feature being deprecated without notice.
Anything else you would like to add:
Environment:
- cluster-api-provider-azure version:
- Kubernetes version: (use `kubectl version`)
- OS (e.g. from /etc/os-release):
/area managedclusters
Through conversations during office hours, pull requests, and issues, we've learned that an increasing number of folks intend on using CAPZ to manage the lifecycle of their AKS clusters.
Definitely agree; this matches my recent experiences. I've also been asked if AKS would support the AzureManagedCluster feature, and while of course the CAPZ project cannot make that determination, marking the feature as stable would definitely be a step in the direction those customers want.
I think https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/1503 / https://github.com/kubernetes-sigs/cluster-api/issues/4526 are key issues to fix to go stable.
The issue is that the code here and here doesn't consider the full ARM ID for the VMSS instance. This can lead to CAPI confusing two providerIDs as being the same, resulting in non-deterministic behavior when matching nodes to providerIDs.
I think CAPI's "Equals" should just use the full provider ID, not parse tokens. But we should confirm with CAPA as well, which I think had some region strings that could differ internally. However, if we match on cloud provider, we should always do the "right" thing.
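The failure mode above can be sketched as follows. This is an illustrative Go snippet, not the actual CAPI code: the hypothetical `lastToken` helper stands in for any token-parsing comparison, and the ARM IDs are made-up examples. Two VMSS instances in different scale sets can share the same trailing instance number, so comparing only parsed tokens conflates them, while comparing the full provider ID string does not.

```go
package main

import (
	"fmt"
	"strings"
)

// lastToken is a hypothetical stand-in for token-based providerID
// parsing: it returns only the final path segment of the ID.
func lastToken(providerID string) string {
	parts := strings.Split(providerID, "/")
	return parts[len(parts)-1]
}

func main() {
	// Two distinct VMSS instances (pool0 vs pool1), both instance "0".
	a := "azure:///subscriptions/123/resourceGroups/rg/providers/Microsoft.Compute/virtualMachineScaleSets/pool0/virtualMachines/0"
	b := "azure:///subscriptions/123/resourceGroups/rg/providers/Microsoft.Compute/virtualMachineScaleSets/pool1/virtualMachines/0"

	// Token-based comparison: both IDs end in "0", so two different
	// instances wrongly compare as equal.
	fmt.Println(lastToken(a) == lastToken(b)) // true (incorrect match)

	// Full-ID comparison distinguishes the two scale set instances.
	fmt.Println(a == b) // false (correct)
}
```

This is why comparing the full provider ID is the safer default, modulo the provider-specific normalization (e.g. region strings) mentioned above.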
+1 to moving it out of experimental, but since CAPZ managed clusters cannot run without MachinePool, should we wait for MachinePool to be moved out of experimental first?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/assign
@zmalik @luthermonson @karthikbalasub @michalno1 @NovemberZulu @dkoshkin Are folks interested in forming a working group to define graduation criteria, and to contribute towards the various workstreams that will come out of that?
I'd love to accelerate our work on this, as there are already lots of production AKS scenarios leveraging CAPZ AzureManagedCluster, so to a certain extent we want to catch up to the real world!
cc @CecileRobertMichon
I am interested! I don't have much experience with this kind of work, but I am willing to help as much as I can.
Curious what folks in here think of https://docs.google.com/document/d/1dMN4-KppBkA51sxXPSQhYpqETp2AG_kHzByXTmznxFA/edit
Technically, CAPZ doesn't need to make the same changes as CAPA does, since the current AzureManagedCluster solution should in theory work with ClusterClass. However, if we want to be consistent with other managed cluster providers in CAPI in the long term, it might be a good idea to apply the same refactor anyway. It would be a breaking change, so if we do want to do it, it would make sense to do it before moving the feature out of experimental. Thoughts?
https://github.com/kubernetes-sigs/cluster-api/pull/6988 is close to merging FYI! 🎉
A brief status update:
https://github.com/kubernetes-sigs/cluster-api/pull/6988 has merged
A proposal that outlines graduation out of experimental is here:
- https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2602