volcano icon indicating copy to clipboard operation
volcano copied to clipboard

supoort preemptionPolicy

Open hwdef opened this issue 1 year ago • 22 comments

What would you like to be added:

priorityclass has a field named preemptionPolicy. https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#non-preempting-priority-class

volcano didn't implement its logic.

solution

In volcano's cache, priorityclas is just a number, so we need to modify the priorityclass in the cache and then implement preemptionpolicy-related logic.

https://github.com/volcano-sh/volcano/blob/c91e42e07b55fbc0686095aedb0f9436b1991163/pkg/scheduler/api/job_info.go#L121

  • [ ] Modify priorityclass in volcano cache, add more information, such as preemptionPolicy
  • [ ] add logic for preemptionpolicy

hwdef avatar Jul 30 '24 09:07 hwdef

/cc

googs1025 avatar Jul 30 '24 09:07 googs1025

/good-first-issue

Monokaix avatar Aug 03 '24 03:08 Monokaix

@Monokaix: This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

In response to this:

/good-first-issue

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

volcano-sh-bot avatar Aug 03 '24 03:08 volcano-sh-bot

/assign

googs1025 avatar Aug 03 '24 03:08 googs1025

I will find time to research and try to implement

googs1025 avatar Aug 03 '24 03:08 googs1025

/assign

Sianao avatar Aug 19 '24 09:08 Sianao

@Sianao Hello, I am currently working on this issue. Can you find other issues? Or do you have any special requirements for this issue?

googs1025 avatar Aug 19 '24 12:08 googs1025

@Sianao Hello, I am currently working on this issue. Can you find other issues? Or do you have any special requirements for this issue?

I unassign it already

Sianao avatar Aug 20 '24 02:08 Sianao

@Sianao Do you want to try this issue? I'm currently busy with other projects and don't have time to do this. :) If you want, feel free to take it. I will review and help in the PR /unassign

googs1025 avatar Sep 10 '24 03:09 googs1025

/assign @Sianao

Monokaix avatar Sep 11 '24 08:09 Monokaix

/unassign

Sianao avatar Sep 11 '24 10:09 Sianao

Sir , I m busy , too .

Sianao avatar Sep 11 '24 10:09 Sianao

I'm glad to work on this.

JesseStutler avatar Sep 18 '24 04:09 JesseStutler

/assign

JesseStutler avatar Sep 18 '24 04:09 JesseStutler

@hwdef Hi, Do we only need to support preemptionPolicy at the pod level? Not to add more fields in job_info. So that we only need to add some logic in preempting, like this: image If the preemptionPolicy is 'Never', not allowing the pod to preempt others, directly skip this pod.

JesseStutler avatar Sep 19 '24 02:09 JesseStutler

We should also consider this case: https://github.com/volcano-sh/volcano/issues/3512 @hwdef @JesseStutler

Monokaix avatar Sep 19 '24 04:09 Monokaix

I think we can keep the property of priority as a number, and add extra info about preemptionPolicy in taskInfo or JobInfor when parse it.

lowang-bh avatar Sep 19 '24 05:09 lowang-bh

@lowang-bh I think get the preeemptionPolicy from pod spec in taskInfo is enough? K8s default scheduler only use the preemptionPolicy https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go#L240-L242 here to judge whether the preemptionPolicy is Never. So there is no need to add extra field I think if we only consider implementing preemptionPolicy at the pod level.

There is the scenario we need to make sure that is whether we need to support preemptionPolicy at the vcjob level(Put field into JobInfo), because vcjob also has priorityClass field. But when vc-controller creates the pod, the PriorityClass in the pod spec was not set to the same value as vcjob, just use the priorityClass specified in the template, we need to discuss how to implement this, or we don't need to implement preemptionPolicy at the vcjob level. @hwdef

JesseStutler avatar Sep 19 '24 06:09 JesseStutler

You are right. If you only implement preemptionpolicy for pods, you do not need to add other fields. But I think it's better implement preemptionPolicy for vcjob.

hwdef avatar Sep 19 '24 11:09 hwdef

You are right. If you only implement preemptionpolicy for pods, you do not need to add other fields. But I think it's better implement preemptionPolicy for vcjob.

What I'm concern is that do we really have a use case for supporting preemptionPolicy for vcjob, becuase the field is just derived from kube-scheduler and is just for pod before: )

Monokaix avatar Sep 20 '24 01:09 Monokaix

If we support preemptionPolicy for vcjob, becasuse preemptionPolicy field comes from priorityclass, I think all pods in vcjob should set the same PriorityClassName with vcjob. In kube-apiserver, there is a plugin called Priority will set the pod's preemptionPolicy field: https://github.com/kubernetes/kubernetes/blob/master/plugin/pkg/admission/priority/admission.go#L193-L200 So we only need to modify vc-controller to set the pod template's PriorityClassName when creating pod, scheduler only need to add logic in preempt and reclaim to skip if the PreemtionPolicy is Never, like this:

if task.Pod.Spec.PreemptionPolicy != nil && *task.Pod.Spec.PreemptionPolicy == v1.PreemptNever {
			klog.V(3).Infof("Task %s/%s is not eligible to preempt other tasks due to preemptionPolicy is Never", task.Namespace, task.Name)
			...
		}

JesseStutler avatar Sep 20 '24 01:09 JesseStutler

You are right. If you only implement preemptionpolicy for pods, you do not need to add other fields. But I think it's better implement preemptionPolicy for vcjob.

What I'm concern is that do we really have a use case for supporting preemptionPolicy for vcjob, becuase the field is just derived from kube-scheduler and is just for pod before: )

I think the answer is yes, we currently have such a scenario. I implemented it through annotation.

hwdef avatar Sep 20 '24 06:09 hwdef

Hello 👋 Looks like there was no activity on this issue for last 90 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

stale[bot] avatar Feb 01 '25 01:02 stale[bot]

still need

hwdef avatar Feb 06 '25 03:02 hwdef

This has been done by https://github.com/volcano-sh/volcano/pull/3739, we can close it now.

Monokaix avatar Feb 06 '25 03:02 Monokaix

/close

Monokaix avatar Feb 06 '25 03:02 Monokaix

@Monokaix: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

volcano-sh-bot avatar Feb 06 '25 03:02 volcano-sh-bot