
Make sure the volcano scheduler cache is synced before the first scheduling cycle by waiting for handlers to sync.

Open JamesBrianD opened this issue 2 years ago • 11 comments

Ⅰ. Describe what this PR does

The issue https://github.com/kubernetes/kubernetes/issues/116717 describes a bug in which event handlers may not have handled all events by the time the informer cache reports that it has synced. As a result, the scheduler can start scheduling from an incomplete, and therefore wrong, state. The K8s community has already fixed this in https://github.com/kubernetes/kubernetes/pull/116729.

The PR makes sure handlers have finished syncing before the scheduling cycles start, just like the default scheduler does.

JamesBrianD avatar Nov 05 '23 09:11 JamesBrianD

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

To complete the pull request process, please assign thor-wl. You can assign the PR to them by writing /assign @thor-wl in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

volcano-sh-bot avatar Nov 05 '23 09:11 volcano-sh-bot

same

It looks like the two are similar. If the other one is merged, I will close this PR.

JamesBrianD avatar Nov 06 '23 03:11 JamesBrianD

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] avatar Mar 17 '24 09:03 stale[bot]

We'd better include this feature in the new release. @Monokaix @william-wang

lowang-bh avatar May 14 '24 06:05 lowang-bh

Please rebase your PR :)

Monokaix avatar May 14 '24 06:05 Monokaix

I will rebase the code as soon as possible

JamesBrianD avatar May 16 '24 09:05 JamesBrianD

@lowang-bh The rebase is done. Could you review the code, please?

JamesBrianD avatar May 17 '24 05:05 JamesBrianD

Does the volcano controller also need to catch this?

Monokaix avatar Jun 14 '24 08:06 Monokaix

Does the volcano controller also need to catch this?

I don't think so. The controller only needs eventual consistency, so waiting for the cache to sync is enough.

JamesBrianD avatar Jun 15 '24 03:06 JamesBrianD

@RamezesDong: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

volcano-sh-bot avatar Jul 28 '24 06:07 volcano-sh-bot