Add a "Do Not Deschedule" label
**Is your feature request related to a problem? Please describe.**
I have a k8s cluster of small nodes and I'm trying to deploy Sonatype Nexus (OSS version) to it. My own pods are quite small -- all using <500Mi/250m memory and CPU limits. Nexus however is huge by comparison with a 4Gi memory limit and wanting 500m CPU. When I schedule nexus, descheduler notices the memory pressure on its node, notes that other nodes have lots of free memory, and deschedules it. As soon as it is running on this other node for a couple of minutes, descheduler does it again. And again.
The result is that I can't keep nexus alive for more than a couple of minutes.
**Describe the solution you'd like**
The easiest thing for me would be to put a 'do not deschedule' label on the nexus pod, to instruct descheduler to leave this one alone.
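For illustration, the requested feature might look something like this on the pod — note that no such label exists in descheduler 0.23.1; the key name here is made up purely to show the shape of the request:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nexus
  labels:
    # Hypothetical label; this is the feature being requested,
    # not an existing descheduler API.
    descheduler.example.io/do-not-deschedule: "true"
```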
**Describe alternatives you've considered**
I know that I can set a priority class to achieve the same thing, but that would be to misuse the priorities available to me (node- or cluster-critical).

**What version of descheduler are you using?**
descheduler version: Image label: 0.23.1
**Additional context**
I'm using the default yaml:
```yaml
strategies:
  LowNodeUtilization:
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        targetThresholds:
          cpu: 50
          memory: 50
          pods: 50
        thresholds:
          cpu: 20
          memory: 20
          pods: 20
  RemoveDuplicates:
    enabled: true
  RemovePodsViolatingInterPodAntiAffinity:
    enabled: true
  RemovePodsViolatingNodeAffinity:
    enabled: true
    params:
      nodeAffinityType:
        - requiredDuringSchedulingIgnoredDuringExecution
  RemovePodsViolatingNodeTaints:
    enabled: true
```
Hi @BryanDollery, I understand not wanting to overload your existing priority classes when it's not appropriate. Do you have the ability to create new (possibly non-preempting) priority classes? In the past when this has come up, we've chosen to recommend that because it offers a more well-defined eviction hierarchy that aligns with how the scheduler works. But if you aren't able to do that, it could show a use case for implementing this.
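The recommended workaround could be sketched as a dedicated, non-preempting PriorityClass — the name and value below are assumptions chosen for illustration, not anything defined by descheduler:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: keep-in-place            # illustrative name
value: 1000000                   # above workload defaults, well below the system-* classes
preemptionPolicy: Never          # pods with this class never preempt others
globalDefault: false
description: "Workloads that descheduler should be reluctant to evict."
```

A pod opts in by setting `priorityClassName: keep-in-place` in its spec. Because `preemptionPolicy: Never` is set, the higher priority affects eviction ordering without letting these pods push others out of the scheduler queue.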
(linking the prior issues just so they're tied together https://github.com/kubernetes-sigs/descheduler/issues/422 https://github.com/kubernetes-sigs/descheduler/issues/329)
Any update on this?
"Don't deschedule if already running" is an orthogonal thing from priority class, I think. We have some deployments which have a defined maintenance window and their pods should only be descheduled during that window. New pods from these deployments should not fill the scheduler queue ahead of system critical pods, yet those system critical pods can be descheduled any time since they are HA and have PDBs protecting their minimum availability.
An annotation to prevent descheduling would neatly solve this, and would also solve things like #423 (Re that issue: I personally think adding time-of-day stuff to descheduler is unnecessary complexity, since it's very simple to make a CronJob that removes a not-evictable annotation at the start of a maintenance window and adds it back at the end. We already do this with cluster-autoscaler's safe-to-evict annotation.)
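The CronJob approach described above might look roughly like this — the selector, schedule, image, and service account are all assumptions, and a mirror-image CronJob (not shown) would re-add the annotation when the window closes:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: open-maintenance-window
spec:
  schedule: "0 2 * * 6"          # e.g. Saturdays at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: annotation-toggler   # needs RBAC to patch pods
          restartPolicy: Never
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command:
                - kubectl
                - annotate
                - pods
                - --selector=app=nexus
                - --overwrite
                # marks the pods evictable for the duration of the window
                - cluster-autoscaler.kubernetes.io/safe-to-evict=true
```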
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:

> /close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.