cluster-api-provider-azure
                        Graduate AzureManagedCluster out of experimental
/kind feature
Describe the solution you'd like:
AzureManagedCluster (AKS) was introduced in CAPZ as an experiment in https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/482. Since then, a lot of progress has been made in adding features and making the controllers and tests more stable. Part of the reason for building this feature as experimental in the first place was that it was a proof of concept: we wanted to see whether there would be any interest at all in managing AKS clusters through Cluster API. It's now abundantly clear that users want this. Through conversations during office hours, pull requests, and issues, we've learned that an increasing number of folks intend to use CAPZ to manage the lifecycle of their AKS clusters. Since we have no intention of deprecating the feature at this point, let's make that message clear to our users: move the AzureManagedCluster CRD to stable, remove the feature flag, and enable it by default in the Azure provider.
There are still a number of open issues with regard to managed clusters; they can be found here. Some of these may need to be completed before we can call AKS stable; however, it should not be a goal to complete every single issue before graduating AzureManagedCluster. The main goal of graduation is to remove the experimental policy so folks have confidence using it in production, without risk of the feature being deprecated without notice.
Anything else you would like to add:
Environment:
- cluster-api-provider-azure version:
- Kubernetes version: (use `kubectl version`)
- OS (e.g. from /etc/os-release):
/area managedclusters
Through conversations during office hours, pull requests, and issues, we've learned that an increasing number of folks intend on using CAPZ to manage the lifecycle of their AKS clusters.
Definitely agree; this matches my recent experiences. I've also been asked if AKS would support the AzureManagedCluster feature, and while of course the CAPZ project cannot make that determination, marking the feature as stable would definitely be a step in the direction those customers want.
I think https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/1503 / https://github.com/kubernetes-sigs/cluster-api/issues/4526 are key issues to fix to go stable.
The issue is that the code here and here doesn't consider the full ARM ID for the VMSS instance. This can lead to CAPI confusing two providerIDs as being the same, resulting in non-deterministic behavior when matching nodes to providerIDs.
I think CAPI's "Equals" should just use the full provider ID, not parse tokens. But we should confirm with CAPA as well, which I think had some region strings that could differ internally. However, if we match on cloud provider, we should always do the "right" thing.
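The failure mode above can be sketched as follows. This is an illustrative Go snippet, not the actual CAPI code: the hypothetical `lastToken` helper stands in for any token-parsing comparison, and the ARM IDs are made-up examples. Two VMSS instances in different scale sets can share the same trailing instance number, so comparing only parsed tokens conflates them, while comparing the full provider ID string does not.

```go
package main

import (
	"fmt"
	"strings"
)

// lastToken is a hypothetical stand-in for token-based providerID
// parsing: it returns only the final path segment of the ID.
func lastToken(providerID string) string {
	parts := strings.Split(providerID, "/")
	return parts[len(parts)-1]
}

func main() {
	// Two distinct VMSS instances (pool0 vs pool1), both instance "0".
	a := "azure:///subscriptions/123/resourceGroups/rg/providers/Microsoft.Compute/virtualMachineScaleSets/pool0/virtualMachines/0"
	b := "azure:///subscriptions/123/resourceGroups/rg/providers/Microsoft.Compute/virtualMachineScaleSets/pool1/virtualMachines/0"

	// Token-based comparison: both IDs end in "0", so two different
	// instances wrongly compare as equal.
	fmt.Println(lastToken(a) == lastToken(b)) // true (incorrect match)

	// Full-ID comparison distinguishes the two scale set instances.
	fmt.Println(a == b) // false (correct)
}
```

This is why comparing the full provider ID is the safer default, modulo the provider-specific normalization (e.g. region strings) mentioned above.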
+1 to moving it out of experimental, but since CAPZ managed clusters cannot run without MachinePool, should we wait for MachinePool to be moved out of experimental first?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/assign
@zmalik @luthermonson @karthikbalasub @michalno1 @NovemberZulu @dkoshkin Are folks interested in forming a working group to define graduation criteria, and to contribute towards the various workstreams that will come out of that?
I'd love to accelerate our work on this, as there are already lots of production AKS scenarios leveraging CAPZ AzureManagedCluster, so to a certain extent we want to catch up to the real world!
cc @CecileRobertMichon
I am interested! I don't have much experience with this kind of work, but I am willing to help as much as I can.
Curious what folks in here think of https://docs.google.com/document/d/1dMN4-KppBkA51sxXPSQhYpqETp2AG_kHzByXTmznxFA/edit
Technically, CAPZ doesn't need to make the same changes as CAPA does, since the current AzureManagedCluster solution should in theory work with ClusterClass. However, if we want to be consistent with other managed cluster providers in CAPI in the long term, it might be a good idea to apply the same refactor anyway. It would be a breaking change, so if we do want to do it, it would make sense to do it before moving the feature out of experimental. Thoughts?
https://github.com/kubernetes-sigs/cluster-api/pull/6988 is close to merging FYI! 🎉
A brief status update:
https://github.com/kubernetes-sigs/cluster-api/pull/6988 has merged
A proposal that outlines graduation out of experimental is here:
- https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2602