Version Skew Policy
During an upgrade of the Kubernetes API server and/or the cluster autoscaler, it is inevitable that the autoscaler and the other Kubernetes components will differ in version for at least some period of time. Looking at the various relevant docs, the cluster autoscaler is not covered by the overall Kubernetes Version Skew Policy here: https://kubernetes.io/releases/version-skew-policy/
Looking at the README in this repo, I see:
Some user reports indicate successful use of a newer version of Cluster Autoscaler with older clusters, however, there is always a chance that it won't work as expected.
Is there a supported process for upgrading both the cluster and the autoscaler such that it is expected to keep working throughout? It seems like the CA needs an official version skew policy supporting at least one minor version of skew in one direction from the API server.
Hi, AFAIK it's strongly recommended to keep the minor version of the Cluster Autoscaler matching the minor version of the k8s cluster it's deployed to, given the strong coupling of the logic between the two due to the vendoring of the scheduler into the CA. There is no 1:1 matching of CA patch releases to k8s patch versions.
/cc @gjtempleton
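To make that recommendation concrete, here is a minimal sketch for checking the current skew between the control plane and the deployed autoscaler. The Deployment name and namespace (cluster-autoscaler in kube-system) are assumptions about a typical install, so adjust them to match your environment:

```bash
# Print client and server versions; the "Server Version" line is the
# control-plane (API server) version the CA minor version should match.
kubectl version

# Print the image (and therefore the tag) of the running Cluster Autoscaler.
# The Deployment name and namespace below are assumptions; adjust as needed.
kubectl -n kube-system get deployment cluster-autoscaler \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
```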
On a similar topic, do we need to ensure the cluster-autoscaler image tag value is set correctly when installing cluster-autoscaler via the helm chart in EKS clusters? The cluster-autoscaler helm chart v9.19.2 default value for "image.tag" is "v1.23.0". Is that chart version compatible with an EKS v1.22 cluster, and are there any instructions?
@Shubham82 I understand that recommendation, but if, for example, I am going to upgrade my cluster from 1.20 to 1.21, I need to do one of two things:
- I can upgrade my cluster (API server, kube-controller-manager, etc.) to 1.21 first and have a window during which I have a 1.21 cluster and a 1.20 autoscaler
- I can upgrade my autoscaler first and have a window during which I have a 1.21 autoscaler and a 1.20 cluster
In either case, as far as I can tell, the autoscaler project is saying it doesn't "support" this setup, so how can I ever upgrade from 1.20 to 1.21 if it's impossible to upgrade the two things at exactly the same time?
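One common way to keep that skew window short is to upgrade the control plane first and then bump the CA image immediately afterwards. A rough sketch for the 1.20 to 1.21 example, assuming the CA runs as a Deployment named cluster-autoscaler in kube-system with a container of the same name and uses the upstream image (these are assumptions, not an officially documented procedure):

```bash
# 1. Upgrade the control plane (API server, controller manager, etc.) to 1.21
#    using your provider's or distribution's normal upgrade process.

# 2. Immediately afterwards, bump the Cluster Autoscaler image to a matching
#    1.21 release. Deployment/container names and the registry path are
#    assumptions; adjust them to your install.
kubectl -n kube-system set image deployment/cluster-autoscaler \
  cluster-autoscaler=registry.k8s.io/autoscaling/cluster-autoscaler:v1.21.1
```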
The cluster autoscaler chart for EKS 1.22 is covered in issue #4850; I will repost in that issue. Thanks.
Hi @riconnon, in the FAQ I found this: How can I update CA dependencies (particularly k8s.io/kubernetes)? See if it answers your question. There is no upgrade guide mentioned for Cluster Autoscaler.
Hi @gjtempleton, could you please take a look?
I'm also trying to reconcile this with the charts. It's not clear to me which approach is best to ensure compatibility. As best I can tell, there are two potential approaches:
- Use the latest chart version to ensure we get any compatibility or security fixes, and pin the specific autoscaler image version to match the cluster, as recommended (sketched below).
- Find the last chart version that matches the needed autoscaler version and use that. This could result in fairly old chart versions being installed, though, which could in turn have issues of their own.
It would be great to have official guidance on this.
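For illustration, the first approach amounts to pinning image.tag independently of the chart version. A minimal sketch, assuming the chart is installed from the kubernetes/autoscaler charts repository; the release name, namespace, image tag, and values file below are placeholders:

```bash
# Track the latest chart version while pinning the CA image to the cluster's
# minor version (example: a 1.22 cluster). All values shown are placeholders.
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set image.tag=v1.22.2 \
  -f my-values.yaml   # provider/auto-discovery settings assumed to live here
```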
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.