Unable to enable Azure CAPIProvider
What steps did you take and what happened?
The Azure CAPIProvider fails to enable correctly from the 0.4.0 CAPI UI. The status changes from Provisioning to Ready but eventually becomes Unavailable. The capz-controller-manager pod is in CrashLoopBackOff with the error:
"failed to get informer from cache" err="failed to get API group resources: unable to retrieve the complete list of server APIs: bootstrap.cluster.x-k8s.io/v1beta1: the server could not find the requested resource" logger="controller-runtime.source.EventHandler"
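A quick way to confirm what the controller is failing on is to check whether the API server actually serves the group version the informer is requesting (a hedged sketch; `capz-system` is the default CAPZ namespace and may differ depending on how the provider was enabled):

```sh
# Does the API server advertise the group/version the informer asks for?
kubectl api-versions | grep bootstrap.cluster.x-k8s.io

# Pull the full error context from the crashing controller pod
kubectl -n capz-system logs deploy/capz-controller-manager --previous
```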
What did you expect to happen?
I would expect the capz-controller-manager pod to be running and the Azure CAPIProvider resource to be in a Ready state.
How to reproduce it?
No response
Rancher Turtles version
No response
Anything else you would like to add?
No response
Label(s) to be applied
/kind bug
Hi @mantis-toboggan-md, thanks for reporting this. I was able to reproduce the issue with the following configuration:
- Rancher v2.8.2
- Rancher Turtles v0.7.0
- Rancher Turtles UI v0.4.0
Looks like this may be related to the missing bootstrap.cluster.x-k8s.io resources. This custom resource group is generally provided by Kubeadm but, since Turtles uses RKE2 for bootstrap and control plane provisioning, CAPRKE2 provides these resources instead.
For some reason CAPZ is not detecting the API resource via RKE2 but, after installing Kubeadm and retrying the CAPZ installation, it applies the changes successfully.
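To see which provider is actually backing the group on an affected cluster, you can list the CRDs registered under it (a quick check; the exact CRD names depend on which providers are installed, and on a Turtles cluster these should be the RKE2Config ones rather than KubeadmConfig):

```sh
# List every CRD in the bootstrap.cluster.x-k8s.io group and the versions it serves
kubectl get crds -o custom-columns='NAME:.metadata.name,VERSIONS:.spec.versions[*].name' \
  | grep bootstrap.cluster.x-k8s.io
```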
I applied this YAML file before installing CAPZ:
---
apiVersion: v1
kind: Namespace
metadata:
  name: capi-kubeadm-bootstrap-system
---
apiVersion: turtles-capi.cattle.io/v1alpha1
kind: CAPIProvider
metadata:
  name: kubeadm-bootstrap
  namespace: capi-kubeadm-bootstrap-system
spec:
  name: kubeadm
  type: bootstrap
  version: v1.4.6
  configSecret:
    name: variables
---
apiVersion: v1
kind: Namespace
metadata:
  name: capi-kubeadm-control-plane-system
---
apiVersion: turtles-capi.cattle.io/v1alpha1
kind: CAPIProvider
metadata:
  name: kubeadm-control-plane
  namespace: capi-kubeadm-control-plane-system
spec:
  name: kubeadm
  type: controlPlane
  version: v1.4.6
  configSecret:
    name: variables
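To reproduce the workaround, it should be enough to save the manifest above and apply it before re-enabling CAPZ (the filename below is arbitrary, and `capiproviders` assumes the plural resource name from the Turtles CAPIProvider CRD):

```sh
# Install the Kubeadm bootstrap and control-plane providers
kubectl apply -f kubeadm-providers.yaml

# Wait for both providers to report Ready before re-enabling CAPZ
kubectl -n capi-kubeadm-bootstrap-system get capiproviders
kubectl -n capi-kubeadm-control-plane-system get capiproviders
```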
After that, the Azure provider installed successfully via the Rancher UI and the controller did not report any errors.
The custom resource that the logs report as missing should be available via the RKE2 provider, so we need to investigate this a bit further to propose a solution.
Could it be because there is no bootstrap.cluster.x-k8s.io/v1beta1 available in CAPRKE2, but only v1alpha1?
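That theory is easy to verify against the live API server: if CAPRKE2 only registers v1alpha1, the discovery document for the group will not list v1beta1 (a minimal check, piped through Python only for pretty-printing):

```sh
# Show the versions the API server advertises for bootstrap.cluster.x-k8s.io;
# if only v1alpha1 appears, the v1beta1 informer in CAPZ fails as in the logs above
kubectl get --raw /apis/bootstrap.cluster.x-k8s.io | python3 -m json.tool
```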
Opened a new upstream issue https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/4854 to track the fix on CAPZ. Once the community accepts this proposal, we'll submit a PR that removes the dependency on Kubeadm when enabling MachinePools.
Upstream PR: https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/4868
done