Cluster autoscaler deleting nodes containing pods with `safe-to-evict: false` annotation
Which component are you using?: Cluster autoscaler
What version of the component are you using?: v1.27.1
Component version:
v1.27.1
What k8s version are you using (kubectl version)?:
kubectl version Output
```sh
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.13", GitCommit:"96b450c75ae3c48037f651b4777646dcca855ed0", GitTreeState:"clean", BuildDate:"2024-04-16T15:03:38Z", GoVersion:"go1.21.9", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27+", GitVersion:"v1.27.16-eks-2f46c53", GitCommit:"c1665482a8b066c35d81db51f8d8cc92aa598040", GitTreeState:"clean", BuildDate:"2024-07-25T04:23:25Z", GoVersion:"go1.22.5", Compiler:"gc", Platform:"linux/amd64"}
```
What environment is this in?: EKS - AWS
What did you expect to happen?:
The Cluster Autoscaler sees the pod annotation `cluster-autoscaler.kubernetes.io/safe-to-evict: "false"` and respects it, waiting for the pod to complete before removing the node it is running on.
What happened instead?:
Scale-down did NOT respect the `cluster-autoscaler.kubernetes.io/safe-to-evict: "false"` annotation and deleted the node, killing my very important running pod.
How to reproduce it (as minimally and precisely as possible):
Run the cluster-autoscaler v1.27 with these flags:
```sh
./cluster-autoscaler \
  --cloud-provider=aws \
  --namespace=kube-system \
  --node-group-auto-discovery=tagstagstags \
  --logtostderr=true \
  --stderrthreshold=info \
  --v=4

# ASG configuration:
#   Desired: 2
#   Minimum: 1
```
Spawn your very important pod that shouldn't be killed:
```yaml
apiVersion: v1
kind: Pod
metadata:
  namespace: awx
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
    - image: 'quay.io/ansible/awx-ee:23.1.0'
      name: worker
      args:
        - ansible-runner
        - worker
        - '--private-data-dir=/runner'
      resources:
        limits:
          memory: 2Gi
          cpu: 2
        requests:
          memory: 500Mi
          cpu: 500m
  tolerations:
    - key: nodegroup-type
      operator: "Equal"
      value: on-demand
  nodeSelector:
    eks.amazonaws.com/capacityType: ON_DEMAND
```
Afterwards, add a resource-exhausting Deployment like this:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: exhaust-resources
  namespace: awx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: exhaust-resources
  template:
    metadata:
      labels:
        app: exhaust-resources
    spec:
      tolerations:
        - key: nodegroup-type
          operator: "Equal"
          value: on-demand
      nodeSelector:
        eks.amazonaws.com/capacityType: ON_DEMAND
      containers:
        - name: exhaust-resources
          image: busybox
          resources:
            requests:
              memory: "2Gi"
              cpu: "1000m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          command: ["sh", "-c", "while true; do echo 'Running...'; sleep 30; done;"]
```
This will trigger a scale-up. When the scale-down happens, cross your fingers that the initial pod is not killed along the way; its annotation is not respected at all.
Anything else we need to know?:
I have a couple of hypotheses. One is that instance scale-in protection is disabled by default on the ASG, and the ASG's own scale-in may take precedence over whatever the autoscaler decides.
Another is that the annotation might need to be set at the Deployment level, because my very important workload runs as a bare Pod (no ReplicaSet/Deployment on top of it).
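For what it's worth, the autoscaler reads the annotation from the Pods themselves, so for a Deployment-managed workload it would normally go on the pod template rather than only on the Deployment's own metadata. A minimal sketch, with illustrative names:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: very-important-workload   # illustrative name, not from the original report
  namespace: awx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: very-important-workload
  template:
    metadata:
      labels:
        app: very-important-workload
      annotations:
        # Placed on the pod template so it ends up on the Pods, which is where
        # the cluster-autoscaler evaluates it during scale-down.
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      containers:
        - name: worker
          image: quay.io/ansible/awx-ee:23.1.0
```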
/area cluster-autoscaler
Does anyone know the latest version of cluster-autoscaler that doesn't have this bug?
I don't know, but I managed to work around it by setting a very restrictive PodDisruptionBudget with `minAvailable`.
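In case it helps, a minimal sketch of that kind of PodDisruptionBudget, assuming the pod is given a label such as `app: awx-worker` (the name and label below are illustrative, not from the original report):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: awx-worker-pdb   # illustrative name
  namespace: awx
spec:
  # With a single matching pod, minAvailable: 1 blocks any voluntary eviction,
  # including the drain the cluster-autoscaler performs before removing a node.
  minAvailable: 1
  selector:
    matchLabels:
      app: awx-worker    # illustrative label; the bare Pod above would need it added
```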
@blueprismo We experienced a similar issue and suspended the AZRebalance process (under ASG -> Advanced Configuration) on the ASG itself. We suspected this process was terminating nodes to rebalance across availability zones (outside the autoscaler's control), causing behavior that looked as though the autoscaler wasn't respecting the safe-to-evict annotation.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.