descheduler
Make room for daemon sets (#303 again)
Is your feature request related to a problem? Please describe.
As @runningman84 pointed out:
Sometimes daemon sets don't fit onto existing nodes. In that case the descheduler should kill a pod which uses the least resources in order to allow the daemon set to run.
This exact use-case is what made me interested in descheduler in the first place.
As @ingvagabund pointed out, preemption and priorities would take care of this use case. If that's the case, then wouldn't podAffinities and podAntiAffinities take care of other use cases that descheduler addresses?
Describe the solution you'd like
As an SRE, I want to ensure that DaemonSet workloads, such as security-related agents, are scheduled. IMHO, if a pod on a node is preventing a DaemonSet pod from being scheduled and that pod is part of a ReplicaSet, it should be nuked.
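To make the ask concrete, here is a purely hypothetical policy sketch. The "MakeRoomForDaemonSets" strategy and its params do not exist in descheduler; only the DeschedulerPolicy wrapper follows the real v1alpha1 format.

```yaml
# Hypothetical sketch only: "MakeRoomForDaemonSets" is NOT an existing
# descheduler strategy; it just illustrates the requested behavior.
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "MakeRoomForDaemonSets":
    enabled: true
    params:
      # Illustrative knobs: only consider ReplicaSet-owned pods as victims,
      # and evict the one with the lowest resource requests on the node
      # where a DaemonSet pod is stuck Pending.
      evictOnlyReplicaSetPods: true
      victimSelection: lowestResourceRequests
```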
Describe alternatives you've considered
I considered writing my own descheduler... nothing fancy, just a shell script running as a Kubernetes CronJob.
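A minimal sketch of that alternative, assuming a kubectl image and a script shipped in a ConfigMap (the names, image tag, schedule, and service account below are placeholders, not something I actually run):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: make-room-for-daemonsets
spec:
  schedule: "*/10 * * * *"                     # run the check every 10 minutes
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: pod-evictor      # needs RBAC to list and evict pods
          restartPolicy: OnFailure
          containers:
          - name: evictor
            image: bitnami/kubectl:1.23        # any image that ships kubectl works
            command: ["/bin/sh", "/scripts/evict-blocking-pod.sh"]  # hypothetical script
            volumeMounts:
            - name: scripts
              mountPath: /scripts
          volumes:
          - name: scripts
            configMap:
              name: evictor-scripts            # ConfigMap holding the shell script
```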
What version of descheduler are you using?
descheduler version: 0.23.1
Additional context
none
Hi @unacceptable, just curious: have you tried solving this with preemption and found that it wasn't working for you? I ask because this still sounds like something that preemption in the scheduler should solve, so any more details you can share about the use case could help us address this feature gap.
@damemi Sorry, I addressed that in the initial comment but misspelled it.
A lot of Helm charts that have a DaemonSet that I deem critical don't include the priorityClassName attribute. Granted, I could go put in PRs for everything that I deem business-critical (e.g. log collection utilities, security scanning tools, identity & access management delegation workloads, etc.), but there's no guarantee that the maintainers of those third-party projects will agree with and approve my PRs. In that case, I could fork the chart, but then that's another thing for me to maintain and patch.
What you're saying is very valid, and if these were all internal projects that my team/org maintained, I would just use preemption.
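For reference, the preemption route amounts to adding something like the following to each chart; the class name, priority value, and DaemonSet here are illustrative, not taken from any particular chart:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-daemonsets                    # illustrative name
value: 1000000                                 # well above the default pod priority of 0
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: "Lets node-level agents preempt ordinary ReplicaSet pods."
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: security-agent                         # placeholder workload
spec:
  selector:
    matchLabels:
      app: security-agent
  template:
    metadata:
      labels:
        app: security-agent
    spec:
      priorityClassName: critical-daemonsets   # the attribute many charts omit
      containers:
      - name: agent
        image: example.com/security-agent:1.0  # placeholder image
```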
It seems like there has to be a better way to manage this problem. I was hopeful when I first came across descheduler that this might be a project to help alleviate this pain point as well as a few others that it already solves.
In my opinion, life is organized chaos. As Engineers, we only have so much time that we can dedicate to certain problems, and the risk of missing a configuration setting is very real (as long as primates are involved).
It would be nice to have a fire-and-forget approach to solving this problem.
As an SRE, I want to ensure that DaemonSet workloads, such as security-related agents, are scheduled. IMHO, if a pod on a node is preventing a DaemonSet pod from being scheduled and that pod is part of a ReplicaSet, it should be nuked.
This requires a relation between the pod and the DS pod saying "this pod has higher priority than that pod", which goes back to using preemption and priorities. In the case of descheduler, what would be the mechanism for expressing such a priority? Even if the descheduler eventually evicts the lower-priority pod to make space for the DS pod, the kube-scheduler can decide to schedule another non-DS pod there instead. Have you considered using a mutating admission webhook to set the priorityClassName on all the DSs?
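A rough sketch of that webhook route, assuming an in-cluster backend service (the service name, namespace, path, and CA bundle are placeholders, and the mutating backend itself still has to be written):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: daemonset-priority-injector
webhooks:
- name: priority.daemonsets.example.com       # must be a fully qualified name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore                       # don't block DS creation if the webhook is down
  clientConfig:
    service:
      name: priority-injector                 # hypothetical in-cluster webhook service
      namespace: kube-system
      path: /mutate
    caBundle: "<base64-encoded-CA-bundle>"    # placeholder
  rules:
  - apiGroups: ["apps"]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["daemonsets"]
```

The backend would then respond with a JSONPatch such as `[{"op": "add", "path": "/spec/template/spec/priorityClassName", "value": "critical-daemonsets"}]` for any DaemonSet that doesn't already set the field.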
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.