[Proposal] Add queue-level-scheduling-policy
What type of PR is this?
/kind documentation /area scheduling /area controllers
What this PR does / why we need it:
This PR adds the queue-level-scheduling-policy proposal from an LFX '25 issue (https://github.com/volcano-sh/volcano/issues/3992), which asks Volcano to support setting and using different scheduling policies at the queue level instead of a single, globally unified scheduling policy.
Which issue(s) this PR fixes:
Fixes https://github.com/volcano-sh/volcano/issues/3992
Special notes for your reviewer:
@Monokaix @JesseStutler
Does this PR introduce a user-facing change?
I’ve updated the proposal, and I think it should support both queue-level and job-level scheduling. cc @Monokaix @JesseStutler
@hwdef @kingeasternsun Job-level scheduling is now supported as well, please take a look.
Overall it's OK, but I prefer a CR to a ConfigMap.
I also think we should fix a somewhat annoying legacy problem: if parsing the configuration fails, the default configuration is silently used. This behavior is implicit. I would rather have a configuration read failure panic directly.
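The implicit-fallback vs. fail-fast contrast could look like this minimal Go sketch (the function names and the toy parser are hypothetical stand-ins, not Volcano's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// defaultConf stands in for Volcano's built-in default scheduler configuration.
const defaultConf = "actions: enqueue, allocate, backfill"

// parseConf is a stand-in for the real configuration parser; here it only
// checks that an "actions:" line is present.
func parseConf(text string) (string, error) {
	if !strings.Contains(text, "actions:") {
		return "", fmt.Errorf("invalid scheduler configuration: missing actions")
	}
	return text, nil
}

// loadConfLegacy shows the current implicit behavior: a parse failure
// silently falls back to the default configuration.
func loadConfLegacy(text string) string {
	conf, err := parseConf(text)
	if err != nil {
		return defaultConf // the user's broken config is silently ignored
	}
	return conf
}

// loadConfStrict shows the proposed fail-fast behavior: a parse failure
// surfaces as an error so the scheduler can refuse to start.
func loadConfStrict(text string) (string, error) {
	return parseConf(text)
}

func main() {
	broken := "atcions: typo"
	fmt.Println(loadConfLegacy(broken)) // silently runs with the default

	if _, err := loadConfStrict(broken); err != nil {
		fmt.Println("strict mode:", err) // would panic/exit in the scheduler
	}
}
```

With fail-fast loading, a typo in a queue's policy is caught at startup instead of silently degrading to the default policy.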
That's a good point.
The problem with a CR is that it's not consistent with the current global config, which lives in a ConfigMap.
Overall, we have three approaches to mount scheduling policies:
- Using a single ConfigMap — This is similar to the implementation in the current proposal. The advantage of this method is its simplicity and consistency with the default scheduler configuration file. However, it poses a risk where users may unintentionally (or intentionally) modify scheduling policies that do not belong to them.
- Using one ConfigMap per scheduling policy — This approach prevents users from modifying other users’ scheduling policies, offering better isolation. However, it could lead to a large number of ConfigMaps, making file mounting and scheduler access more complex.
- Using a Custom Resource Definition (CRD) — This method provides better structure and flexibility, but its usage differs from the default scheduler configuration file, potentially increasing the learning curve or setup complexity.
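To make the second option concrete, here is a minimal Go sketch of how a queue could reference a per-policy ConfigMap and fall back to the global configuration when none is set. All names here (the `SchedulingPolicy` field, the policy ConfigMap names) are hypothetical illustrations, not the proposal's final API:

```go
package main

import "fmt"

// Queue is a minimal stand-in for Volcano's Queue object; the
// SchedulingPolicy field name is hypothetical, not the final API.
type Queue struct {
	Name             string
	SchedulingPolicy string // name of a per-queue policy ConfigMap; "" means use the global policy
}

// policyStore stands in for option 2's per-policy ConfigMaps: one entry
// per scheduling policy, so users cannot touch other users' policies.
var policyStore = map[string]string{
	"binpack-policy": "tiers:\n- plugins:\n  - name: binpack",
	"spread-policy":  "tiers:\n- plugins:\n  - name: nodeorder",
}

// globalPolicy stands in for the default scheduler ConfigMap.
const globalPolicy = "tiers:\n- plugins:\n  - name: gang\n  - name: priority"

// resolvePolicy returns the scheduling policy text for a queue. An unset
// reference falls back to the global policy; a dangling reference is an
// explicit error rather than a silent fallback.
func resolvePolicy(q Queue) (string, error) {
	if q.SchedulingPolicy == "" {
		return globalPolicy, nil
	}
	p, ok := policyStore[q.SchedulingPolicy]
	if !ok {
		return "", fmt.Errorf("queue %s references unknown policy %q", q.Name, q.SchedulingPolicy)
	}
	return p, nil
}

func main() {
	p, _ := resolvePolicy(Queue{Name: "a", SchedulingPolicy: "binpack-policy"})
	fmt.Println(p)
	g, _ := resolvePolicy(Queue{Name: "b"})
	fmt.Println(g)
}
```

Note the dangling-reference case errors out explicitly, in line with the fail-fast preference discussed above.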
If the CRD is not accepted, the second option is also good.
I have updated the design document and user examples. PTAL @Monokaix @JesseStutler
I suggest that we add some explanation of the queue-level schedule policy. For example, in the previous cluster-level mode (the old single schedule policy), configuring binpack means tasks across the cluster preferentially fill nodes that already have tasks, which reduces resource fragmentation. In queue-level mode, however, if queue A is configured with binpack and queue B with spread, then from the cluster's perspective resource fragmentation is not actually reduced. From the queue's perspective, tasks in queue A are preferentially placed on nodes that already have tasks and may be packed more compactly, while tasks in queue B are spread evenly across the cluster according to node conditions.
We'd better give users a correct expectation of the impact of queue configuration (in queue-level mode), help them configure the queue-level schedule policy more reasonably, and improve their experience of using Volcano.
In addition, in queue-level mode, if queue A uses proportion but queue B uses capacity, will there be a conflict in resource management?
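The binpack-vs-spread contrast can be sketched with two toy node-scoring functions (illustrative only; the real plugins score many more dimensions than utilization):

```go
package main

import "fmt"

// Node utilization is a value in [0,1]. Binpack prefers fuller nodes,
// spread prefers emptier ones, so the same cluster state yields
// opposite placement decisions per queue.
func binpackScore(util float64) float64 { return util * 100 }
func spreadScore(util float64) float64  { return (1 - util) * 100 }

// bestNode returns the index of the highest-scoring node for a given
// scoring function.
func bestNode(utils []float64, score func(float64) float64) int {
	best, bestScore := -1, -1.0
	for i, u := range utils {
		if s := score(u); s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

func main() {
	utils := []float64{0.8, 0.1, 0.5} // per-node utilization
	fmt.Println("queue A (binpack) picks node", bestNode(utils, binpackScore)) // node 0
	fmt.Println("queue B (spread) picks node", bestNode(utils, spreadScore))   // node 1
}
```

Queue A keeps filling the busiest node while queue B fills the emptiest one, so cluster-wide fragmentation depends on the mix of queue policies rather than on any single policy.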
I think queue capacity management plugins such as proportion and capacity should be configured at the global level?
I think this is a solution. We may need to simply classify the existing plugins into those that can only be applied globally and those that can be configured at the queue level. To stay compatible with configurations from previous versions, a queue-level schedule policy should fall back to the default schedule policy when it encounters a global-only plugin.
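As a rough sketch of that compatibility rule (the classification and function names are hypothetical, not the proposal's final design):

```go
package main

import "fmt"

// globalOnly lists plugins that manage cluster-wide queue capacity and
// therefore must come from the default (global) policy; this
// classification is illustrative, not a final list.
var globalOnly = map[string]bool{
	"proportion": true,
	"capacity":   true,
}

// effectivePlugins builds a queue's effective plugin list: global-only
// plugins are always taken from the default policy, and queue-level
// requests for them are ignored for backward compatibility.
func effectivePlugins(defaultPlugins, queuePlugins []string) []string {
	var out []string
	for _, p := range defaultPlugins {
		if globalOnly[p] {
			out = append(out, p)
		}
	}
	for _, p := range queuePlugins {
		if !globalOnly[p] {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	def := []string{"proportion", "gang", "priority"}
	queue := []string{"binpack", "capacity"} // "capacity" is dropped: global-only
	fmt.Println(effectivePlugins(def, queue))
}
```

This way queue A and queue B cannot disagree on capacity management: proportion/capacity always come from the global policy, while per-queue policies only swap node-ordering behavior such as binpack.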