cluster-api-provider-aws
Missing instructions to install aws-cloud-controller-manager
/kind bug
What steps did you take and what happened: There are no instructions in the documentation https://cluster-api-aws.sigs.k8s.io/getting-started#install-a-cloud-provider for installing the AWS Cloud Provider; only Azure is covered.
What did you expect to happen: I was helping a customer follow the instructions from the blog post:
https://aws.amazon.com/blogs/containers/multi-cluster-management-for-kubernetes-with-cluster-api-and-argo-cd/
The blog post provides instructions to generate an EC2 cluster template, for example:
clusterctl generate cluster capi-ec2 --kubernetes-version v1.28.0 --control-plane-machine-count=3 --worker-machine-count=3 > ./capi-cluster/aws-ec2/aws-ec2.yaml
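For context, the template substitutes environment variables for the region, SSH key, and instance types; a minimal sketch of what gets exported before running the command above (the values here are illustrative, not the customer's actual settings):
$ export AWS_REGION=us-east-1                    # example region
$ export AWS_SSH_KEY_NAME=default                # example EC2 key pair name
$ export AWS_CONTROL_PLANE_MACHINE_TYPE=t3.large # example instance type
$ export AWS_NODE_MACHINE_TYPE=t3.large          # example instance type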
When deploying the cluster, I noticed the worker machines were provisioned, but the cluster is stuck with "WaitingForAvailableMachines":
clusterctl describe cluster capi-ec2
NAME READY SEVERITY REASON SINCE MESSAGE
Cluster/capi-ec2 False Warning ScalingUp 26m Scaling up control plane to 3 replicas (actual 1)
├─ClusterInfrastructure - AWSCluster/capi-ec2 True 26m
├─ControlPlane - KubeadmControlPlane/capi-ec2-control-plane False Warning ScalingUp 26m Scaling up control plane to 3 replicas (actual 1)
│ └─Machine/capi-ec2-control-plane-8s68s True 26m
└─Workers
└─MachineDeployment/capi-ec2-md-0 False Warning WaitingForAvailableMachines 30m Minimum availability requires 3 replicas, current 0 available
└─3 Machines... False Info WaitingForBootstrapData 26m See capi-ec2-md-0-tsl6z-5xctf, capi-ec2-md-0-tsl6z-8rxfj, ...
Investigating the issue further, I found that the nodes have untolerated taints:
Warning FailedScheduling 115s (x62 over 5h7m) default-scheduler 0/4 nodes are available: 1 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}, 3 node(s) had untolerated taint {node.cluster.x-k8s.io/uninitialized: }. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling..
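For anyone hitting the same symptom, a rough sketch of how to inspect the taints directly on the workload cluster (assuming the cluster name capi-ec2 from above):
$ clusterctl get kubeconfig capi-ec2 > capi-ec2.kubeconfig
$ kubectl --kubeconfig capi-ec2.kubeconfig get nodes \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'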
The same issue has been reported here: https://github.com/kubernetes-sigs/cluster-api/issues/9151
The recommendation in this comment was to install aws-cloud-controller-manager: https://github.com/kubernetes-sigs/cluster-api/issues/9151#issuecomment-1671878672
However, there are no instructions in the documentation https://cluster-api-aws.sigs.k8s.io/getting-started#install-a-cloud-provider for installing the AWS Cloud Provider; only Azure is covered.
/triage accepted
/priority important-soon
@maiconrocha looks like the aws-cloud-controller-manager is only published as a Helm chart release asset on GitHub: https://github.com/kubernetes/cloud-provider-aws/releases/tag/helm-chart-aws-cloud-controller-manager-0.0.8
As of now, since the chart isn't hosted in a Helm repository, installing it is a two-step process:
$ curl -LO https://github.com/kubernetes/cloud-provider-aws/releases/download/helm-chart-aws-cloud-controller-manager-0.0.8/aws-cloud-controller-manager-0.0.8.tgz
$ helm template aws-cloud-controller-manager-0.0.8.tgz | kubectl apply -f -
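Note that the rendered manifests have to be applied to the workload cluster, not the management cluster; once the controller runs there, it initializes the nodes and the node.cloudprovider.kubernetes.io/uninitialized taint is removed. A rough sketch of doing and verifying that, assuming the cluster name capi-ec2 from the report above (the DaemonSet name is assumed from the chart defaults):
$ clusterctl get kubeconfig capi-ec2 > capi-ec2.kubeconfig
$ helm template aws-cloud-controller-manager-0.0.8.tgz \
    | kubectl --kubeconfig capi-ec2.kubeconfig apply -f -
$ kubectl --kubeconfig capi-ec2.kubeconfig -n kube-system get daemonset   # expect aws-cloud-controller-manager
$ kubectl --kubeconfig capi-ec2.kubeconfig get nodes \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'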
Let me know if there's an alternative to handle this.
/assign
This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged.
Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.
You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Deprioritize it with /priority important-longterm or /priority backlog
- Close this issue with /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten