scheduler-plugins

feat: add kep md

Open LY-today opened this issue 11 months ago • 14 comments

What would you like to be added?

What is your proposal: The NodeResourcesFit plugin in native Kubernetes can apply only one scoring strategy to all resources, such as MostRequestedPriority or LeastRequestedPriority. In production, this design does not fit some scenarios. For example, in AI clusters, workloads that request GPUs should fill up whole GPU machines first to prevent GPU fragmentation, while workloads that request only CPU and memory should be spread across non-GPU machines first. Otherwise CPU and memory on GPU machines are consumed excessively, and tasks that actually request GPUs end up Pending because the non-GPU resources on those machines are insufficient. We therefore hope that both strategies can be extended to address this business need.

Why is this needed: See the description above.

Is there a suggested solution, if so, please add it:

plugin-one

config:

resources: 
  nvidia.com/gpu:
    type: MostAllocated
    weight: 2
  cpu:
    type: LeastAllocated
    weight: 1
  memory:
    type: LeastAllocated
    weight: 1

config description: image

node score:

finalScoreNode = [(weight1 * resource1) + (weight2 * resource2) + … + (weightN * resourceN)] / (weight1 + weight2 + … + weightN)
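
The weighted average above can be sketched in Go as follows. This is only an illustration, not the code from the PR. It assumes that `resourceN` in the formula is a per-resource score in the range 0 to 100, computed the way the upstream MostAllocated and LeastAllocated strategies compute it, and that resources a node does not expose are left out of both the sum and the weights. The names `resourceSpec`, `usage`, `resourceScore`, and `nodeScore` are hypothetical, and `maxNodeScore` stands in for `framework.MaxNodeScore` so the sketch has no Kubernetes dependency.

```go
package main

import "fmt"

// maxNodeScore mirrors the value of framework.MaxNodeScore (100) so that
// this sketch compiles without importing the Kubernetes scheduler framework.
const maxNodeScore int64 = 100

// resourceSpec is a hypothetical in-memory form of one entry under `resources:`.
type resourceSpec struct {
	Type   string // "MostAllocated" or "LeastAllocated"
	Weight int64
}

// usage holds the amount requested on a node (assumed to include the
// incoming pod) and the node's allocatable amount for one resource.
type usage struct {
	Requested   int64
	Allocatable int64
}

// resourceScore scores a single resource in the range [0, maxNodeScore].
// Both formulas are an assumption modelled on the upstream strategies.
func resourceScore(strategy string, u usage) int64 {
	if u.Allocatable <= 0 || u.Requested > u.Allocatable {
		return 0
	}
	switch strategy {
	case "MostAllocated":
		return u.Requested * maxNodeScore / u.Allocatable
	case "LeastAllocated":
		return (u.Allocatable - u.Requested) * maxNodeScore / u.Allocatable
	}
	return 0
}

// nodeScore computes the weighted average of the per-resource scores.
// Resources the node does not expose are skipped (an assumption of this sketch).
func nodeScore(specs map[string]resourceSpec, node map[string]usage) int64 {
	var weighted, weightSum int64
	for name, spec := range specs {
		u, ok := node[name]
		if !ok || u.Allocatable == 0 {
			continue
		}
		weighted += spec.Weight * resourceScore(spec.Type, u)
		weightSum += spec.Weight
	}
	if weightSum == 0 {
		return 0
	}
	return weighted / weightSum
}

func main() {
	// Same strategies and weights as the plugin-one config above.
	specs := map[string]resourceSpec{
		"nvidia.com/gpu": {Type: "MostAllocated", Weight: 2},
		"cpu":            {Type: "LeastAllocated", Weight: 1},
		"memory":         {Type: "LeastAllocated", Weight: 1},
	}
	// Made-up node states, for illustration only.
	gpuNode := map[string]usage{
		"nvidia.com/gpu": {Requested: 4, Allocatable: 8},
		"cpu":            {Requested: 16, Allocatable: 64},
		"memory":         {Requested: 128, Allocatable: 256},
	}
	cpuNode := map[string]usage{
		"cpu":    {Requested: 16, Allocatable: 64},
		"memory": {Requested: 64, Allocatable: 256},
	}
	fmt.Println("GPU node score:", nodeScore(specs, gpuNode))
	fmt.Println("CPU-only node score:", nodeScore(specs, cpuNode))
}
```

Because the GPU entry uses MostAllocated with the largest weight, a GPU node scores higher as its GPUs fill up, while the LeastAllocated entries for CPU and memory favor emptier nodes.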

plugin-two

config:

resources: 
- nvidia.com/gpu 

config description: image

node score:

finalScoreNode = (allocatablesResourcesNum - requestsResourcesNum) * framework.MaxNodeScore / allocatablesResourcesNum
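
A literal evaluation of this formula might look like the Go sketch below. The thread does not define what `allocatablesResourcesNum` and `requestsResourcesNum` count (the config description image is missing), so the sketch treats them as opaque integers and makes no claim about how the plugin derives them. The function name `pluginTwoScore`, the zero-denominator guard, and the clamping of negative differences are assumptions; `maxNodeScore` stands in for `framework.MaxNodeScore`.

```go
package main

import "fmt"

// maxNodeScore mirrors the value of framework.MaxNodeScore (100).
const maxNodeScore int64 = 100

// pluginTwoScore evaluates
//   (allocatablesResourcesNum - requestsResourcesNum) * MaxNodeScore / allocatablesResourcesNum
// using integer arithmetic. How the two counts are obtained is not specified
// in the thread, so they are plain parameters here.
func pluginTwoScore(allocatablesResourcesNum, requestsResourcesNum int64) int64 {
	// Guard against division by zero (an assumption of this sketch).
	if allocatablesResourcesNum <= 0 {
		return 0
	}
	diff := allocatablesResourcesNum - requestsResourcesNum
	// Clamp so the score never goes negative (also an assumption).
	if diff < 0 {
		diff = 0
	}
	return diff * maxNodeScore / allocatablesResourcesNum
}

func main() {
	// Made-up inputs, for illustration only.
	fmt.Println(pluginTwoScore(8, 2))
	fmt.Println(pluginTwoScore(4, 4))
}
```

Multiplying before dividing keeps precision under integer division; dividing first would truncate most results to 0.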

Why is this needed?

See the description above.

LY-today avatar Dec 24 '24 03:12 LY-today

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Dec 24 '24 03:12 k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: LY-today

Once this PR has been reviewed and has the lgtm label, please assign huang-wei for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.

Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Dec 24 '24 03:12 k8s-ci-robot

Hi @LY-today. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

k8s-ci-robot avatar Dec 24 '24 03:12 k8s-ci-robot

Deploy Preview for kubernetes-sigs-scheduler-plugins canceled.

Name Link
Latest commit 956d7f5cb3535a9b5c705ca5a97e706a382d5bfc
Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-scheduler-plugins/deploys/676a63e8cf3a17000811db0a

netlify[bot] avatar Dec 24 '24 03:12 netlify[bot]

@googs1025 KEP

LY-today avatar Dec 24 '24 06:12 LY-today

> @googs1025 KEP

@googs1025 Can you help advance this MR?

LY-today avatar Dec 25 '24 07:12 LY-today

> > @googs1025 KEP

> @googs1025 Can you help advance this MR?

Thanks for the invite, I'll handle this on weekend :)

googs1025 avatar Dec 25 '24 09:12 googs1025

> > > @googs1025 KEP

> > @googs1025 Can you help advance this MR?

> Thanks for the invite, I'll handle this on weekend :)

thank you for your time

LY-today avatar Dec 25 '24 09:12 LY-today

@googs1025 Hello, do you have any clear plans for these two plugins?

LY-today avatar Dec 31 '24 03:12 LY-today

@swatisehgal @zwpaper Please check

LY-today avatar Jan 02 '25 07:01 LY-today

Could someone take a look at this PR?

LY-today avatar Jan 06 '25 03:01 LY-today

@Huang-Wei Regarding plugin-two, how should I modify the KEP? Is there a reference I can follow? Or is anything in it unclear?

LY-today avatar Feb 08 '25 06:02 LY-today

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar May 09 '25 06:05 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Jun 08 '25 07:06 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Jul 08 '25 08:07 k8s-triage-robot

@k8s-triage-robot: Closed this PR.

k8s-ci-robot avatar Jul 08 '25 08:07 k8s-ci-robot