vllm [V1][PP] Fix & Pin Ray version in requirements-cuda.txt

Pipeline parallelism in V1 requires ray[adag] instead of ray[default]. Also, because of the API changes in 2.42.0, we have to pin the version to 2.41.0 (or 2.40.0).
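As a rough illustration of the pin described above, here is a minimal sketch of a runtime check. The helper names `installed_ray_version` and `is_pinned_ray` and the `PINNED_RAY_VERSIONS` set are hypothetical and not part of vLLM or Ray; the two versions are the ones named in this description.

```python
from importlib import metadata

# Assumption: the two versions named in the PR description are the only
# ones treated as compatible here; nothing is claimed about older releases.
PINNED_RAY_VERSIONS = {"2.40.0", "2.41.0"}


def installed_ray_version():
    """Return the installed Ray version string, or None if Ray is absent."""
    try:
        return metadata.version("ray")
    except metadata.PackageNotFoundError:
        return None


def is_pinned_ray(version):
    """Check whether a version string matches one of the pinned versions."""
    return version in PINNED_RAY_VERSIONS


if __name__ == "__main__":
    version = installed_ray_version()
    if version is None:
        print("Ray is not installed.")
    elif is_pinned_ray(version):
        print(f"Ray {version} matches the pin.")
    else:
        print(f"Ray {version} is outside the pinned versions.")
```

A check like this only reports a mismatch; the actual constraint would still live in the requirements file.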

Feb 17 '25 23:02 WoosukKwon

cc @ruisearch42 @richardliaw

Feb 18 '25 02:02 comaniac

I think it's fine, as long as vLLM does not directly use cupy. One thing about cupy is that it ships separate cupy-cuda11x and cupy-cuda12x packages, and I'm not sure how Ray deals with that. Will it break the CUDA 11.8 build of vLLM?

Feb 18 '25 03:02 youkaichao

@youkaichao It seems to use cupy-cuda12x. However, IIUC, it doesn't break anything on our cu11.8 build unless the user explicitly chooses Ray?

Feb 18 '25 03:02 WoosukKwon

Sounds good. Then whether to support CUDA 11.8 is a Ray-side issue. We can go ahead with ray[adag].

Feb 18 '25 03:02 youkaichao

ray[adag] uses cupy-cuda12x. BTW, there is an issue in Ray 2.42 that is being fixed. After that, we can upgrade to the latest version with a small API change.

Feb 18 '25 05:02 ruisearch42

The issue with pinning to a specific Ray version is that anyone with a long-running cluster will not be able to upgrade vLLM services unless they upgrade the entire Ray cluster.

Could we instead use a range specifier (e.g. ray[adag]>=2.43.0) once 2.43.0 comes out with the fix?
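To make the difference concrete, here is a small sketch comparing an exact pin with a range specifier. It assumes plain numeric `major.minor.patch` versions and handles only `==` and `>=`; the `satisfies` helper is hypothetical and is not a full PEP 440 implementation (the `packaging` library's `SpecifierSet` covers the real rules).

```python
import operator

# Only the two operators discussed in this thread are handled.
_OPERATORS = {"==": operator.eq, ">=": operator.ge}


def parse_version(version):
    """Parse a plain numeric version such as '2.43.0' into (2, 43, 0)."""
    return tuple(int(part) for part in version.strip().split("."))


def satisfies(version, specifier):
    """Return True if `version` meets a specifier like '==2.41.0' or '>=2.43.0'."""
    for symbol, compare in _OPERATORS.items():
        if specifier.startswith(symbol):
            required = specifier[len(symbol):]
            return compare(parse_version(version), parse_version(required))
    raise ValueError(f"unsupported specifier: {specifier!r}")


if __name__ == "__main__":
    # Hypothetical long-running cluster that is already on a newer Ray release.
    cluster_version = "2.44.1"
    print(satisfies(cluster_version, "==2.41.0"))  # exact pin rejects it
    print(satisfies(cluster_version, ">=2.43.0"))  # range accepts it
```

With an exact pin, the cluster's Ray version has to match the pinned release; with a lower-bound range, any release at or above the bound is accepted.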

Feb 25 '25 01:02 darthhexx

Hi @darthhexx, that makes sense. This is a short-term fix, and the plan is to support a range of Ray versions.

Feb 25 '25 01:02 ruisearch42