
Handle coscheduling with cluster-autoscaler

Open groszewn opened this issue 4 years ago • 22 comments

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:

As is referenced in the docs, the scheduler doesn't currently work well when the cluster-autoscaler is used. The exception thrown is:

```
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.0.0.1/api/v1/namespaces/testnamespace/pods. Message: admission webhook "validatepod.volcano.sh" denied the request: failed to create pod <testnamespace/test-job-driver> as the podgroup phase is Pending. Received status: Status(apiVersion=v1, code=500, details=null, kind=Status, message=admission webhook "validatepod.volcano.sh" denied the request: failed to create pod <testnamespace/test-job-driver> as the podgroup phase is Pending, metadata=ListMeta(_continue=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=null, status=Failure, additionalProperties={}).
```

As can be seen in the exception, the cluster has not yet scaled to meet the resource requests to move the podgroup past the Pending phase.

What you expected to happen:

Ideally, the scheduler would be aware that the cluster is autoscaling and hold off on throwing an exception.

How to reproduce it (as minimally and precisely as possible):

Since this exception occurs when the cluster is autoscaling, the submission will be dependent on how many resources are currently available in your cluster and whether it needs to scale to handle the requested resources. Any manifest that leverages the volcano batch scheduler that requires a cluster-autoscaling event will likely throw the above error.

groszewn avatar Feb 06 '20 15:02 groszewn
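A minimal manifest that can trigger this looks roughly like the sketch below (the job name, image, and request sizes are illustrative; the per-task CPU request just needs to be large enough that the current nodes cannot hold all replicas, so an autoscaler scale-up is required while the PodGroup sits in Pending):

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: demo-gang-job            # illustrative name
spec:
  schedulerName: volcano
  minAvailable: 4                # gang constraint: all 4 replicas must fit at once
  tasks:
    - name: worker
      replicas: 4
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: worker
              image: busybox
              command: ["sleep", "300"]
              resources:
                requests:
                  cpu: "4"       # sized so the current nodes cannot hold all replicas
```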

/cc @thandayuthapani

k82cn avatar Feb 06 '20 17:02 k82cn

/assign

k82cn avatar Feb 08 '20 02:02 k82cn

Hey @k82cn, just wanted to circle back and see if there has been any shift in prioritization on this?

groszewn avatar Apr 29 '20 23:04 groszewn

Hello 👋 Looks like there was no activity on this issue for last 90 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

stale[bot] avatar Aug 18 '20 06:08 stale[bot]

Closing for now as there was no activity for last 60 days after marked as stale, let us know if you need this to be reopened! 🤗

stale[bot] avatar Oct 17 '20 07:10 stale[bot]

Hi, any update on this?

This would determine whether we use Volcano or not.

SkinyMonkey avatar Dec 02 '20 20:12 SkinyMonkey

> Hi, any update on this?
>
> This would determine whether we use Volcano or not.

what's your scenario?

k82cn avatar Dec 02 '20 23:12 k82cn

I would like to be able to use Volcano but allow the autoscaler to kick in when too many jobs are waiting.

I'm not sure that this feature exists yet or is even possible.

As this ticket was closed without a PR, I assume the original problem wasn't examined?

Also, what would happen if I had Volcano in place plus the autoscaler, with the node count set to 0?

We'd like to be able to shut down our machines when they're not used, on the weekend for example. The autoscaler can provide that, but Azure can't, since it does not allow scheduling timed shutdowns on k8s clusters.

So if we were to use Volcano but could not use the autoscaler, that might be a blocker.

SkinyMonkey avatar Dec 07 '20 19:12 SkinyMonkey

Hi @k82cn @kevin-wangzefeng , do you have any update on this?

bowenli86 avatar Feb 08 '21 19:02 bowenli86

I wonder, since Volcano doesn't support autoscaling, how does Huawei handle the autoscale requirements? any workaround you can share?

bowenli86 avatar Feb 08 '21 20:02 bowenli86

> Also, what would happen if I had Volcano in place + autoscaler and the node count set to 0?

@SkinyMonkey , Volcano can work with autoscaling. If the node count is set to 0, no pod will be scheduled :) This issue is about gang-scheduling with autoscaling, one feature of Volcano.

> …since Volcano doesn't support autoscaling

Volcano can work with autoscaling. For this issue, it's about how gang-scheduling/co-scheduling works with autoscaling :)

k82cn avatar Feb 08 '21 23:02 k82cn
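For context on how the gang constraint interacts with autoscaling: Volcano tracks each job through a PodGroup, and the group stays in the Pending phase until `minMember` pods (and any declared `minResources`) can all be placed at once. Until the autoscaler actually adds nodes, that condition never becomes true, so the admission webhook keeps rejecting pod creation as seen in the original report. A minimal sketch of the object involved (name and sizes are illustrative):

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: demo-group      # illustrative name
spec:
  minMember: 3          # all 3 pods must be placeable before any is bound
  minResources:         # optional: aggregate resources the whole gang needs
    cpu: "12"
    memory: 24Gi
```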

@k82cn Seems like a pretty common use-case to use cluster-autoscaler in cloud environments to minimize cloud cost waste and also take advantage of gang-scheduling/co-scheduling. I am also running into this error in the same way as the OP using Spark.

Do you have any update or workaround on this? Much appreciated.

brickyard avatar Feb 22 '21 18:02 brickyard

@brickyard , we're doing some investigation about this requirement; that's interesting & important for us.

k82cn avatar Mar 07 '21 09:03 k82cn

I had the idea to create 'over-provisioned' pods with a very low PriorityClass at the same time as submitting my Spark job, with the thought that Volcano would be smart enough about preemption to evict those pods and schedule the job, since it would technically have enough resources to fulfil it. That doesn't seem to be the case. Is it accurate that Volcano doesn't consider preemptable pods when deciding whether it can schedule a job? Has anyone gone down this road yet?

aleclerc-sonrai avatar Mar 16 '21 14:03 aleclerc-sonrai
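For reference, the approach described above is the standard cluster-autoscaler "overprovisioning" pattern: low-priority pause pods reserve headroom, and the default scheduler preempts them when real workloads arrive. A rough sketch, with illustrative names and sizes (per the comment above, Volcano's gang admission apparently does not count these preemptable pods as available capacity, which is why it didn't help here):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10                 # lower than any real workload
globalDefault: false
description: "Placeholder pods that reserve headroom for scale-up"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning   # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: overprovisioning
  template:
    metadata:
      labels:
        app: overprovisioning
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:        # sized to the headroom you want to keep warm
              cpu: "4"
              memory: 8Gi
```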


/cc @qiankunli

Thor-wl avatar Dec 02 '21 03:12 Thor-wl


edit: this is incorrect

Is there any update on this? It would be nice if the overcommit plugin could be extended to handle this case. I'm running into this issue trying to launch TPU-based jobs on GKE:

https://cloud.google.com/tpu/docs/kubernetes-engine-setup

```yaml
spec:
  containers:
    - name: example-container
      resources:
        limits:
          cloud-tpus.google.com/v2: 8
```

Volcano is refusing to schedule these jobs since there are no TPU hosts available, but there are no TPU hosts available because no pods are scheduled. It doesn't seem like there's any way to force GKE to allocate TPU hosts.

d4l3k avatar Apr 21 '22 00:04 d4l3k

Just spent some more time here, and this is actually incorrect. Volcano can schedule TPU jobs without issue; I was just missing an annotation.

Without the annotation the job just gets stuck without any errors:

```yaml
  annotations:
    tf-version.cloud-tpus.google.com: "2.6.0"
```

d4l3k avatar Apr 21 '22 17:04 d4l3k


@Thor-wl are there any plans to improve this?

d4l3k avatar Oct 31 '22 23:10 d4l3k

@Thor-wl @k82cn can you please give an update on this? Does gang-scheduling work with cluster-autoscaler? We are exploring if Volcano is a good solution for us

anovv avatar May 20 '23 08:05 anovv