[Queue] guarantee.resource should be less than or equal to capability

Open wpeng102 opened this issue 3 years ago • 2 comments

What happened: When capability is set on a Queue, the queue's resource requests must not exceed it in any dimension. queue-resource-reservation-design.md introduces guarantee.resource, which supports resource reservation for a specified queue.

So guarantee.resource should be less than or equal to capability, but Volcano admission does not check this rule.

We can create the following queue successfully, and the resources used by its jobs can then exceed the Queue's capability.

apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: high-priority
spec:
  guarantee:
    resource:
      cpu: 4
      memory: 4096Mi
  capability:
    cpu: 1
    memory: 1024Mi
  reclaimable: true
  weight: 2

What you expected to happen: Admission should validate the capability and guarantee.resource configuration, and the behavior when the two features are used together should be clearly defined.
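A minimal sketch of the kind of check requested here, written in Python for illustration only. It is not Volcano's admission code: `parse_quantity` and `guarantee_violations` are hypothetical helper names, the suffix table covers only a subset of Kubernetes quantity suffixes, and treating an unset capability dimension as unlimited is an assumption.

```python
from decimal import Decimal

# Suffix multipliers for Kubernetes-style quantities (subset, for illustration only).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
}


def parse_quantity(value):
    """Convert a quantity such as 4, "500m" or "4096Mi" to a Decimal in base units."""
    text = str(value).strip()
    # Check two-character suffixes first so "Mi" is not mistaken for "M".
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def guarantee_violations(spec):
    """Return one message per dimension where guarantee.resource exceeds capability."""
    guarantee = spec.get("guarantee", {}).get("resource", {})
    capability = spec.get("capability", {})
    errors = []
    for name, guaranteed in guarantee.items():
        if name not in capability:
            continue  # assumption: an unset capability dimension is treated as unlimited
        if parse_quantity(guaranteed) > parse_quantity(capability[name]):
            errors.append(
                f"guarantee.resource[{name}]={guaranteed} exceeds "
                f"capability[{name}]={capability[name]}"
            )
    return errors


# The spec of the "high-priority" queue from this issue.
queue_spec = {
    "guarantee": {"resource": {"cpu": 4, "memory": "4096Mi"}},
    "capability": {"cpu": 1, "memory": "1024Mi"},
}
for message in guarantee_violations(queue_spec):
    print(message)
```

With the queue from this issue, both cpu and memory are reported, which is the case an admission check would presumably reject.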

How to reproduce it (as minimally and precisely as possible):

  1. Create a queue:
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: high-priority
spec:
  guarantee:
    resource:
      cpu: 4
      memory: 4096Mi
  capability:
    cpu: 1
    memory: 1024Mi
  reclaimable: true
  weight: 2
  2. Submit a vcjob:
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: test-job
spec:
  minAvailable: 1
  schedulerName: volcano
  policies:
    - event: PodEvicted
      action: RestartJob
  plugins:
    ssh: []
    env: []
    svc: []
  maxRetry: 5
  queue: high-priority
  tasks:
    - replicas: 3
      name: "default-nginx"
      template:
        metadata:
          name: web
        spec:
          containers:
            - image: nginx
              command: ['sh', '-c', 'echo "Hello, Kubernetes!" && sleep 36']
              imagePullPolicy: IfNotPresent
              name: nginx
              resources:
                requests:
                  cpu: 500m
          restartPolicy: OnFailure

The Queue's capability is cpu: 1, yet all 3 pods of the vcjob, which request 1500m CPU in total, are started.
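The arithmetic behind this observation can be sketched as follows. `cpu_to_millicores` is a hypothetical helper written for this example, and the count of pods that would fit assumes capability were enforced strictly per queue.

```python
from fractions import Fraction


def cpu_to_millicores(value):
    """Convert a CPU quantity such as 1 or "500m" to integer millicores."""
    text = str(value)
    if text.endswith("m"):
        return int(text[:-1])
    return int(Fraction(text) * 1000)


replicas = 3
per_pod_request = cpu_to_millicores("500m")  # requests.cpu of each nginx pod
queue_capability = cpu_to_millicores(1)      # capability.cpu of the queue

total_request = replicas * per_pod_request
pods_within_capability = queue_capability // per_pod_request

print(f"total request: {total_request}m, capability: {queue_capability}m")
print(f"pods that fit within capability: {pods_within_capability} of {replicas}")
```

The job requests 1500m against a capability of 1000m, so under strict enforcement only two of the three pods would fit.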

Anything else we need to know?:

Environment:

  • Volcano Version: master
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

wpeng102 avatar Apr 13 '22 11:04 wpeng102

/cc @qiankunli

Thor-wl avatar Apr 15 '22 09:04 Thor-wl

Hello 👋 Looks like there was no activity on this issue for last 90 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

stale[bot] avatar Jul 30 '22 18:07 stale[bot]

Closing for now as there was no activity for last 60 days after marked as stale, let us know if you need this to be reopened! 🤗

stale[bot] avatar Oct 01 '22 00:10 stale[bot]