Pod Scheduling Readiness

Open ykcai-daniel opened this issue 1 year ago • 6 comments

Updated version of PR #3612: a gated task no longer blocks the job, and no new state is used.

ykcai-daniel avatar Aug 07 '24 02:08 ykcai-daniel

Does this change have the same purpose as this PR https://github.com/volcano-sh/volcano/pull/3612 ?

googs1025 avatar Aug 07 '24 02:08 googs1025

Hi, I don't think it is necessary to submit two PRs; just update the previous PR. Please also update the design doc.

Monokaix avatar Aug 08 '24 03:08 Monokaix

Please refer to kube-scheduler to add a featuregate to enable/disable this feature.

Monokaix avatar Aug 13 '24 03:08 Monokaix

@Monokaix PR updated. I will also update the design proposal #3581 to document the changes.

ykcai-daniel avatar Aug 15 '24 02:08 ykcai-daniel

PR summary: Pod scheduling gates are a new feature in K8S that uses the .spec.schedulingGates field of a Pod to signal that the Pod is not ready for scheduling (the Pod stays in the Pending state while gated). This feature is often used to implement customized resource managers, and we want to support it for compatibility. The main changes are:

  1. Scheduling Actions: A job with scheduling-gated tasks mainly stays in the Inqueue state. In the allocate action, a task whose pod is scheduling gated is skipped for allocation and is not bound to a node. Consequently, if too many tasks are scheduling gated, the job becomes gang-unschedulable.
  2. Plugins: Since scheduling-gated pods are not ready to be allocated, scheduling-gated tasks should not consume inqueue resources and thereby keep other jobs from being enqueued. Therefore, the proportion, capacity and overcommit plugins are changed to deduct the resources of scheduling-gated tasks.
  3. Pod Conditions: When each session closes, the condition of scheduling-gated pods is not updated, so they keep the PodScheduled=False condition with reason SchedulingGated, which is set by the api-server.
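
A minimal Go sketch of the first two changes, for illustration only. The `Task` and `Job` types and their methods are hypothetical simplifications, not Volcano's actual API, and the gang check here only counts non-gated tasks (the real check also depends on available resources).

```go
package main

import "fmt"

// Task is a hypothetical, simplified stand-in for a scheduler task.
type Task struct {
	Name            string
	SchedulingGates []string // mirrors the gate names in Pod .spec.schedulingGates
	MilliCPU        int64    // requested CPU in millicores
}

// Gated reports whether the task's pod still carries any scheduling gate.
func (t Task) Gated() bool { return len(t.SchedulingGates) > 0 }

// Job is a hypothetical, simplified stand-in for a gang-scheduled job.
type Job struct {
	Name         string
	MinAvailable int
	Tasks        []Task
}

// AllocatableTasks returns the tasks an allocate-style pass would consider.
// Gated tasks are skipped, so they are never bound to a node.
func (j Job) AllocatableTasks() []Task {
	var out []Task
	for _, t := range j.Tasks {
		if t.Gated() {
			continue
		}
		out = append(out, t)
	}
	return out
}

// GangSchedulable is an assumption-level check: the job can only satisfy its
// gang constraint if enough non-gated tasks exist.
func (j Job) GangSchedulable() bool {
	return len(j.AllocatableTasks()) >= j.MinAvailable
}

// InqueueMilliCPU returns the CPU counted against inqueue resources,
// deducting the requests of gated tasks.
func (j Job) InqueueMilliCPU() int64 {
	var total int64
	for _, t := range j.AllocatableTasks() {
		total += t.MilliCPU
	}
	return total
}

func main() {
	job := Job{
		Name:         "demo",
		MinAvailable: 2,
		Tasks: []Task{
			{Name: "worker-0", MilliCPU: 500},
			{Name: "worker-1", SchedulingGates: []string{"example.com/hold"}, MilliCPU: 1000},
		},
	}
	fmt.Println("allocatable tasks:", len(job.AllocatableTasks()))
	fmt.Println("gang schedulable:", job.GangSchedulable())
	fmt.Println("inqueue milliCPU:", job.InqueueMilliCPU())
}
```

With one of two tasks gated and `MinAvailable` set to 2, the sketch treats the job as gang-unschedulable and counts only the non-gated task's request.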

Future work not covered in this PR:

  1. Removing scheduling gates by modifying the task template of a vcjob is not supported for now, because it would involve major changes to the controller, and Pod-level gate removal is sufficient for most use cases. If there is a need for task template support in the future, we can add it.
  2. Events and Correct Message: Currently, if a Job is gang-unschedulable due to scheduling gates, the reason in the condition is "NoEnoughResources". We might need a better reason and message for scheduling gates.
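
One possible shape for the second item, sketched in Go under stated assumptions: the function `UnschedulableReason` and the job-level reason string `SchedulingGated` are hypothetical and not part of this PR, and `NoEnoughResources` is copied from the text above.

```go
package main

import "fmt"

const (
	// reasonNoEnoughResources is the reason string quoted in the discussion above.
	reasonNoEnoughResources = "NoEnoughResources"
	// reasonSchedulingGated is a hypothetical job-level reason, used only for illustration.
	reasonSchedulingGated = "SchedulingGated"
)

// UnschedulableReason picks a reason and message for a gang-unschedulable job.
// If gated tasks alone keep the job below minAvailable, it reports the gates;
// otherwise it falls back to the resource-related reason.
func UnschedulableReason(totalTasks, gatedTasks, minAvailable int) (string, string) {
	ready := totalTasks - gatedTasks
	if ready < minAvailable {
		msg := fmt.Sprintf("%d of %d tasks are scheduling gated; %d ungated, %d required",
			gatedTasks, totalTasks, ready, minAvailable)
		return reasonSchedulingGated, msg
	}
	return reasonNoEnoughResources, "not enough resources to satisfy the gang constraint"
}

func main() {
	reason, msg := UnschedulableReason(4, 3, 2)
	fmt.Println(reason, "-", msg)
}
```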

ykcai-daniel avatar Aug 15 '24 02:08 ykcai-daniel

Commit Author: /assign @ykcai-daniel

ykcai-daniel avatar Aug 30 '24 01:08 ykcai-daniel

/assign @Monokaix

ykcai-daniel avatar Aug 30 '24 01:08 ykcai-daniel

/assign @googs1025

ykcai-daniel avatar Aug 30 '24 01:08 ykcai-daniel

/assign @lowang-bh @hwdef

Monokaix avatar Aug 30 '24 03:08 Monokaix

@lowang-bh cc @Monokaix Pull request updated. Please review the changes. Thank you!

ykcai-daniel avatar Sep 02 '24 02:09 ykcai-daniel

For more details about the design of this PR, see #3581. Please also merge the design doc.

ykcai-daniel avatar Sep 03 '24 03:09 ykcai-daniel

/lgtm /approve

Monokaix avatar Sep 11 '24 01:09 Monokaix

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Monokaix

volcano-sh-bot avatar Sep 11 '24 01:09 volcano-sh-bot