[Exoscale] Unable to resize node pools
Which component are you using?:
cluster-autoscaler
What version of the component are you using?:
Cluster Autoscaler v1.23.0, Kubernetes 1.22.8
What k8s version are you using (kubectl version)?:
kubectl version Output
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"clean", BuildDate:"2022-03-16T15:51:05Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.8", GitCommit:"7061dbbf75f9f82e8ab21f9be7e8ffcaae8e0d44", GitTreeState:"clean", BuildDate:"2022-03-16T14:04:34Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
What did you expect to happen?:
At the beginning of this week CA worked as expected and was able to resize Exoscale SKS managed instance pools, but it has since stopped working. It was already noticeably slower than the Cluster Autoscaler running on EKS (CA on Exoscale takes ~5-15 min to scale up a new node, whereas it takes less than 1 min on AWS), but that is a separate topic.
Now CA on Exoscale is unable to resize the instance pools at all; the Exoscale API rejects the requests with error 403 and exception code 9999. We are receiving the following error, which suggests a problem with, or a change in, the Exoscale API:
Scale-up failed for group 18e037a4-1e08-4f44-ac72-1285f4cf973d: API error ErrorCode(403) 403 (ServerAPIException 9999): Operation scaleInstancePool on resource 18e037a4-1e08-4f44-ac72-1285f4cf973d is forbidden - reason: Locked by nodepool 00edf882-3199-459b-8d55-ea330776803e on cluster 9745a733-6d0e-4846-9522-9621caf49b65
Has the Exoscale API changed, and does the Cluster Autoscaler's Exoscale implementation need a fix from the Exoscale code owners to get this working again?
I'm happy to provide more details on request; at the moment I'm not sure what information would be most useful for debugging this issue.
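For what it's worth, the error seems to say that an SKS-managed Instance Pool is locked by its owning Nodepool, so scaling the Instance Pool directly is forbidden and the scale-up would have to go through the SKS Nodepool instead. Below is a minimal, hypothetical Go sketch of that routing; the type and method names (scaler, nodeGroup, increaseSize, ScaleInstancePool, ScaleSKSNodepool) are assumptions for illustration only, not the actual Exoscale provider or egoscale client API:

```go
package main

import (
	"context"
	"fmt"
)

// scaler abstracts the two scaling paths the error message hints at.
// These method names are hypothetical, not the real egoscale API.
type scaler interface {
	// ScaleInstancePool resizes a standalone Instance Pool directly.
	ScaleInstancePool(ctx context.Context, poolID string, size int64) error
	// ScaleSKSNodepool resizes an SKS Nodepool, which then resizes the
	// Instance Pool it owns and has locked.
	ScaleSKSNodepool(ctx context.Context, clusterID, nodepoolID string, size int64) error
}

// nodeGroup is a stand-in for whatever the autoscaler tracks per group.
type nodeGroup struct {
	instancePoolID string
	// Set only when the Instance Pool is managed by an SKS Nodepool.
	sksClusterID  string
	sksNodepoolID string
}

// increaseSize routes the scale-up through the SKS Nodepool when the
// Instance Pool is SKS-managed; calling scaleInstancePool directly on such
// a pool is what produces the "Locked by nodepool" 403 above.
func increaseSize(ctx context.Context, api scaler, ng nodeGroup, target int64) error {
	if ng.sksNodepoolID != "" {
		return api.ScaleSKSNodepool(ctx, ng.sksClusterID, ng.sksNodepoolID, target)
	}
	return api.ScaleInstancePool(ctx, ng.instancePoolID, target)
}

func main() {
	fmt.Println("see increaseSize for the scale-up routing logic")
}
```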
@pierre-emmanuelJ @7fELF @PhilippeChepy
It looks like there is a forked project that fixes this issue, and a PR has been submitted:
https://github.com/exoscale/autoscaler-1/tree/sks/cluster-autoscaler/cloudprovider/exoscale
https://github.com/kubernetes/autoscaler/pull/4247
The PR regarding SKS Nodepools was merged: https://github.com/kubernetes/autoscaler/pull/4247, so this can be closed.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
/close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.