Volcano job natively supports the Ray framework

Open Monokaix opened this issue 8 months ago • 7 comments

What is the problem you're trying to solve

Ray is a popular, widely used AI framework, and the Ray operator already supports Volcano as a batch scheduler; see https://github.com/ray-project/kuberay. On the other hand, many users still run their workloads as Volcano jobs instead of using the new Ray API, so it is reasonable for Volcano jobs to support Ray natively.

Describe the solution you'd like

Volcano currently supports distributed AI and HPC frameworks such as PyTorch, TensorFlow, and MPI. In the same way vcjobs handle those frameworks today, add a new plugin named ray under pkg/controllers/job/plugins/distributed-framework, so that users can submit a vcjob that actually runs a Ray job.
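
To make the proposal concrete, here is a minimal sketch of what such a plugin could do when a pod is created: give the head task a `ray start --head` command and point every other task at the head. This is an illustration under stated assumptions, not the implementation from any PR. The `Pod` type, the `rayPlugin` fields, and the `<job>-<task>-0.<job>` head address are hypothetical stand-ins; only the `ray start` flags (`--head`, `--port`, `--address`, `--block`) come from the Ray CLI.

```go
package main

import (
	"fmt"
	"strings"
)

// Pod is a simplified, hypothetical stand-in for a Kubernetes Pod that
// belongs to a Volcano job. It is NOT the real API type.
type Pod struct {
	JobName  string   // name of the owning vcjob
	TaskName string   // name of the vcjob task this pod was created from
	Command  []string // container command the plugin fills in
}

// rayPlugin is a hypothetical sketch of a "ray" job plugin.
// The field names are assumptions made for this example.
type rayPlugin struct {
	headTask string // task that runs the Ray head node
	port     int    // port the head node listens on
}

// Name returns the plugin name a vcjob would reference.
func (p *rayPlugin) Name() string { return "ray" }

// headAddress builds the address workers connect to. The
// "<job>-<task>-0.<job>" host name is an assumption about pod DNS naming.
func (p *rayPlugin) headAddress(jobName string) string {
	return fmt.Sprintf("%s-%s-0.%s:%d", jobName, p.headTask, jobName, p.port)
}

// OnPodCreate sets the Ray start command: the head task starts a head
// node, every other task joins the cluster at the head address.
func (p *rayPlugin) OnPodCreate(pod *Pod) {
	if pod.TaskName == p.headTask {
		pod.Command = []string{"ray", "start", "--head",
			fmt.Sprintf("--port=%d", p.port), "--block"}
		return
	}
	pod.Command = []string{"ray", "start",
		"--address=" + p.headAddress(pod.JobName), "--block"}
}

// commandLine renders the pod command as a single string for inspection.
func commandLine(pod *Pod) string {
	return strings.Join(pod.Command, " ")
}
```

Calling `OnPodCreate` on a pod of the head task and on a pod of a worker task, then printing `commandLine` for each, shows the two commands side by side. A real plugin would also have to handle job-level hooks and service discovery, which this sketch leaves out.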

Additional context

Monokaix avatar Apr 10 '25 06:04 Monokaix

/good-first-issue

Monokaix avatar Apr 10 '25 06:04 Monokaix

@Monokaix: This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

In response to this:

/good-first-issue

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

volcano-sh-bot avatar Apr 10 '25 06:04 volcano-sh-bot

@MondayCha: GitHub didn't allow me to assign the following users: me.

Note that only volcano-sh members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide

In response to this:

/assign me

volcano-sh-bot avatar Apr 10 '25 06:04 volcano-sh-bot

/assign

MondayCha avatar Apr 10 '25 07:04 MondayCha

/assign

de6p avatar Apr 16 '25 06:04 de6p

/assign

Sorry, this issue has already been assigned to @MondayCha. You can find other issues that need to be resolved :)

Monokaix avatar Apr 16 '25 06:04 Monokaix

I just created a PR. I noticed there was no progress, so I decided to start working on it.

de6p avatar Apr 16 '25 06:04 de6p

@JesseStutler @Monokaix can I work on this, please?

blazethunderstorm avatar Jul 02 '25 01:07 blazethunderstorm

/unassign

MondayCha avatar Jul 02 '25 11:07 MondayCha

/assign

blazethunderstorm avatar Jul 02 '25 12:07 blazethunderstorm

@JesseStutler I have solved the issue. Can I raise a PR?

blazethunderstorm avatar Jul 02 '25 12:07 blazethunderstorm

@de6p Have you done it?

JesseStutler avatar Jul 03 '25 03:07 JesseStutler

https://github.com/volcano-sh/volcano/pull/4193

de6p avatar Jul 04 '25 04:07 de6p

@de6p Excuse me, could you consider using my code for the ray plugin? For #4581, the code, architecture, docs, user guide, and tests (UT & E2E) have been prepared.

Wonki4 avatar Aug 31 '25 12:08 Wonki4

/assign

Wonki4 avatar Sep 13 '25 16:09 Wonki4