volcano icon indicating copy to clipboard operation
volcano copied to clipboard

Fix panic in proportion plugin

Open svileex opened this issue 2 years ago • 11 comments

We must subtract guaranteed resources from the remaining resources before starting the division. If we don't do this, it's possible to allocate more resources than we have available in the cluster. For instance, in this scenario:

Suppose we have three queues - one with a significant guarantee of resources and two other queues. All three queues have jobs, so they will be handled in 'queueOpts.' If we start dividing resources from queues with no guarantee resources, then we give 'remaining/3' resources to each of them. However, the queue with guaranteed resources may take more than 'remaining/3' resources (it takes the maximum of 'attr.guarantee' and 'deserved'), which can result in a panic in the 'remaining.Sub' function.

This behavior is undefined because the order of elements in the map is random, but I have observed and reproduced this behavior many times. So, I have added a test to capture this behavior, specifically testing the panic without the fix. I have also added the fix.

Also, I'm going to create an issue with stack trace

svileex avatar Oct 12 '23 19:10 svileex

Welcome @svileex!

It looks like this is your first PR to volcano-sh/volcano 馃帀.

Thank you, and welcome to Volcano. :smiley:

volcano-sh-bot avatar Oct 12 '23 19:10 volcano-sh-bot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: To complete the pull request process, please assign thor-wl You can assign the PR to them by writing /assign @thor-wl in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

volcano-sh-bot avatar Oct 12 '23 19:10 volcano-sh-bot

Issue: https://github.com/volcano-sh/volcano/issues/3155

svileex avatar Oct 12 '23 19:10 svileex

/assign @thor-wl

svileex avatar Oct 23 '23 14:10 svileex

Hey, there hasn't been a review since October. Did I forget to do something?

svileex avatar Nov 09 '23 11:11 svileex

Hey @Thor-wl @lowang-bh see this mr please

Cdayz avatar Dec 19 '23 09:12 Cdayz

/assign @william-wang

Cdayz avatar Jan 16 '24 15:01 Cdayz

@william-wang @Thor-wl check this mr please

Cdayz avatar Jan 29 '24 11:01 Cdayz

/ok-to-test

Monokaix avatar Apr 07 '24 12:04 Monokaix

/retest

Monokaix avatar Apr 07 '24 12:04 Monokaix

@svileex: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

volcano-sh-bot avatar Apr 08 '24 02:04 volcano-sh-bot