support min-max elastic quota scheduling

Open lowang-bh opened this issue 1 year ago • 6 comments

Imagine this scenario: a queue has a guaranteed quota called Min. Pods can be scheduled when the queue's used quota + requests <= Min. There is also a limit quota called Max, which is the upper bound on the quota a queue's tasks can use. The quota between Min and Max can only be used by preemptable tasks in the queue, because that quota is borrowed from other queues' Min and should be returned when needed. So preemptable pods can be scheduled when the queue's used quota + requests <= Max.
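The rule above can be sketched as follows. This is only an illustration with a single resource dimension; the `Queue` type and `CanSchedule` method are hypothetical names for this example and are not taken from the Volcano code base.

```go
package main

import "fmt"

// Queue is a hypothetical, simplified model of a queue's quota.
// All values share one unit (for example CPU millicores).
type Queue struct {
	Min  int64 // guaranteed quota
	Max  int64 // upper limit, including quota borrowed from other queues
	Used int64 // quota currently used by the queue
}

// CanSchedule reports whether a pod with the given request may be scheduled.
// Non-preemptable pods are bounded by Min, preemptable pods by Max.
func (q Queue) CanSchedule(request int64, preemptable bool) bool {
	limit := q.Min
	if preemptable {
		limit = q.Max
	}
	return q.Used+request <= limit
}

func main() {
	q := Queue{Min: 4000, Max: 8000, Used: 3000}

	fmt.Println(q.CanSchedule(1000, false)) // 3000+1000 <= Min
	fmt.Println(q.CanSchedule(2000, false)) // 3000+2000 >  Min
	fmt.Println(q.CanSchedule(2000, true))  // 3000+2000 <= Max
	fmt.Println(q.CanSchedule(6000, true))  // 3000+6000 >  Max
}
```

With these numbers, a non-preemptable pod requesting 2000 is rejected, while a preemptable pod with the same request is admitted because it may use the borrowed range between Min and Max.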

This feature is also called Elastic Quota or Capacity Scheduling. Reference: capacity-scheduling, Elastic Quota Management

This PR does the following, based on the capacity plugin, where Min equals deserved and Max equals capability:

  1. Add a feature switch to enable or disable this feature, so that the original behavior is not affected.
  2. Add an overused function to the capacity plugin. When a job is scheduled, this function makes sure the queue's used quota stays under Min if the job's tasks are not preemptable, or under Max if the job's tasks are preemptable.
  3. Change the Preemptive function in the capacity plugin to check whether the queue's future used quota (the job's request + the queue's allocated) is under Min. Only a job in a queue whose future used quota will not exceed its Min can preempt other victims.
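A rough sketch of the two checks in items 2 and 3, with Min mapped to deserved and Max mapped to capability. All names here (`Resource`, `QueueAttr`, `overused`, `canPreempt`) are hypothetical and simplified for illustration; they are not the plugin's actual types or signatures. As a further simplification, a dimension missing from a bound is treated as zero, which may differ from how the real plugin handles an unset capability.

```go
package main

import "fmt"

// Resource is a hypothetical stand-in for a multi-dimensional resource vector.
type Resource map[string]int64

// add returns the per-dimension sum of a and b.
func add(a, b Resource) Resource {
	sum := Resource{}
	for name, v := range a {
		sum[name] += v
	}
	for name, v := range b {
		sum[name] += v
	}
	return sum
}

// lessEqual reports whether a <= b in every dimension of a.
// Assumption: a dimension missing from b counts as zero.
func lessEqual(a, b Resource) bool {
	for name, v := range a {
		if v > b[name] {
			return false
		}
	}
	return true
}

// QueueAttr is a simplified view of a queue's quota state.
type QueueAttr struct {
	Deserved   Resource // Min: guaranteed quota
	Capability Resource // Max: upper limit
	Allocated  Resource // quota already allocated to the queue
}

// overused reports whether scheduling the request would push the queue past
// Min (non-preemptable tasks) or past Max (preemptable tasks).
func overused(q QueueAttr, request Resource, preemptable bool) bool {
	bound := q.Deserved
	if preemptable {
		bound = q.Capability
	}
	return !lessEqual(add(q.Allocated, request), bound)
}

// canPreempt reports whether a job may preempt victims: its queue's future
// used quota (allocated + request) must not exceed Min.
func canPreempt(q QueueAttr, request Resource) bool {
	return lessEqual(add(q.Allocated, request), q.Deserved)
}

func main() {
	q := QueueAttr{
		Deserved:   Resource{"cpu": 4000, "memory": 8},
		Capability: Resource{"cpu": 8000, "memory": 16},
		Allocated:  Resource{"cpu": 3000, "memory": 6},
	}
	big := Resource{"cpu": 2000, "memory": 2}
	small := Resource{"cpu": 1000, "memory": 2}

	fmt.Println(overused(q, big, false)) // cpu 5000 exceeds Min
	fmt.Println(overused(q, big, true))  // still within Max
	fmt.Println(canPreempt(q, small))    // future used equals Min
	fmt.Println(canPreempt(q, big))      // future used exceeds Min
}
```

The feature switch from item 1 is left out of the sketch; in practice both checks would only apply when the switch is enabled.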

Related issues: #3537. Fixes #3703.

The 1st commit is based on https://github.com/volcano-sh/volcano/pull/3649, so please merge that PR first.

lowang-bh avatar Sep 01 '24 02:09 lowang-bh

/assign @william-wang @wangyang0616 @Monokaix @hwdef

lowang-bh avatar Sep 01 '24 03:09 lowang-bh

It seems a little complex for users to use the capacity plugin or queue capability. And is the problem in https://github.com/volcano-sh/volcano/issues/3703 really a common case?

Monokaix avatar Sep 02 '24 03:09 Monokaix

It seems a little complex for users to use the capacity plugin or queue capability. And is the problem in #3703 really a common case?

Another solution is to add a min-max plugin. But that would also need to modify some code in the main actions.

lowang-bh avatar Sep 06 '24 05:09 lowang-bh

or the queue's used quota will be under Max if the job's tasks are preemptable when scheduling a job.

ssn.Allocatable holds the capability check logic now.

Monokaix avatar Sep 06 '24 06:09 Monokaix

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign hwdef.
You can assign the PR to them by writing /assign @hwdef in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

volcano-sh-bot avatar Sep 07 '24 02:09 volcano-sh-bot

Please rebase onto the latest master.

hwdef avatar Nov 18 '24 03:11 hwdef