Update PyTorch to 2.7.0
Notable changes:
- PyTorch 2.7.0 has dropped CUDA 12.4, so the remaining options are 12.6 and 12.8
- CUDA 12.6 has a build issue (https://github.com/vllm-project/vllm/issues/15435#issuecomment-2775924628), so only 12.8 remains
- We need new xformers, flashinfer, and mamba-ssm packages, so let's build them from source for now. They can be installed from PyPI once they are published upstream with 2.7.0 support
- Leave XPU for later for the Intel folks to pick up, as it requires a newer version of intel-extension-for-pytorch
👋 Hi! Thank you for contributing to the vLLM project.
💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small but essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.
Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.
🚀
This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @huydhn.
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
Now that torch 2.7 has been released (https://pypi.org/project/torch/2.7.0/), can this be updated?
Yup, it can be updated now. Let me start working on that. On the other hand, I think I will keep the state of this PR as a reference, because we plan to do similar validation for the next PyTorch release. Ideally, that validation should be done against the PyTorch RC before it is published to PyPI.
@youkaichao @zou3519 I think some of the failed tests are pointing to a real torch.compile compatibility issue with 2.7.0 https://buildkite.com/vllm/ci/builds/18733#01967697-2cf2-469f-91a7-63e66f589d77/205-1262
I can reproduce it with pytest -v tests/models/test_transformers.py -k test_models[meta-llama/Llama-3.2-1B-Instruct-transformers]
https://paste.sh/qgshOYtT#6dx7jO1lRsFmoSN43N6aAXeE
Any thoughts?
vLLM monkeypatches Dynamo to provide a list of the file paths it traced over. Somehow the monkeypatched function returned the filename "<string>". I'll try to run this and see; my best guess right now is that we need to filter out entries that are not file paths.
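For illustration, a minimal sketch of the kind of filtering meant here (the helper name and call site are hypothetical, not vLLM's actual patch point):

import os

def filter_traced_files(traced_files):
    # Dynamo can report pseudo-filenames such as "<string>" or "<frozen ...>"
    # for dynamically generated code; keep only entries that exist on disk.
    return [f for f in traced_files if not f.startswith("<") and os.path.isfile(f)]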
After I worked around that (by filtering out "<string>"), there are more issues. Currently getting:
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: Constraints violated (L['input_ids'].size()[0], L['positions'].size()[0])! For more information, run with TORCH_LOGS="+dynamic".
- Not all values of RelaxedUnspecConstraint(L['input_ids'].size()[0]) are valid because L['input_ids'].size()[0] was inferred to be a constant (16384).
- Not all values of RelaxedUnspecConstraint(L['positions'].size()[0]) are valid because L['positions'].size()[0] was inferred to be a constant (16384).
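For context, a standalone sketch (not the vLLM code path) of how this class of error arises: when a dimension is requested as dynamic but the traced code specializes it to a single constant, recent torch versions raise a similar ConstraintViolationError.

import torch

def fn(x):
    # Branching on the concrete size forces Dynamo to specialize dim 0.
    if x.size(0) == 16:
        return x * 2
    return x + 1

x = torch.randn(16, 8)
torch._dynamo.mark_dynamic(x, 0)  # request a dynamic first dimension
try:
    torch.compile(fn)(x)          # specialization conflicts with that request
except Exception as e:
    print(type(e).__name__, ":", e)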
Synced with @zou3519 offline:
- There are at least two problems, as mentioned in the previous comments. @zou3519 is working on a fix, so the plan is to land that first and then rebase this change on top.
- The recent transformers bump (https://github.com/vllm-project/vllm/pull/17116) may also be related, as the failure seems to come from there.
@zou3519 Thank you for the fix! The failures with torch.compile have been resolved.
@simon-mo I think this is ready to land. The remaining failures on CI are:
- 2 failed speculative decode tests, which I think are existing failures from trunk and are not related to this update. I'm attempting to fix them separately in https://github.com/vllm-project/vllm/pull/17371.
- On the other hand, the Python-only installation test failure is legitimate, but I think it can only be fixed by updating the nightly binaries. I have a PR lined up to do that after this lands: https://github.com/vllm-project/vllm/pull/17224 (it needs a rebase). Let me know if that's the usual procedure.
If those are OK, please help land this PR so we can finally have vLLM on 2.7.0 :)
Btw, did you need https://github.com/vllm-project/vllm/pull/17338 or was that just because I'm on torch > 2.7?
A new release of xformers with torch 2.7 support has come out: https://github.com/facebookresearch/xformers/releases/tag/v0.0.30
Btw, did you need #17338 or was that just because I'm on torch > 2.7?
I don't see any failures due to #17338 on CI, so I guess it's only applicable to PyTorch nightly, not the 2.7.0 release. On the other hand, if you can land your PR before this one is merged, I can do a rebase just to be sure.
No need to wait for my PR if the signals on this PR are good
When will this be synced to the pip repository?
Does vLLM publish nightlies to some pip channel? Asking because I'd like to try out vLLM with PyTorch 2.7.0.
@vadimkantorov
pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
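As a quick sanity check after installing (a generic snippet, not from the thread), you can confirm which torch the nightly wheel pulled in:

import torch, vllm
print(vllm.__version__, torch.__version__)  # expect a .dev vllm version and torch 2.7.0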
I couldn't find this nightly instruction in the installation section of the README. It might be good to add it there too!
pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
For me, this again tries to fetch 2.6.0 :( despite having 2.7.0 installed:
vllm-0.8.5.post1-cp38-abi3-manylinux1_x86_64.whl does not look like a nightly wheel :( it looks like the release wheel. I think pip discovers this wheel in my local cache because it already exists and therefore does not attempt to install the nightly :( A bug in pip? I manually went to https://wheels.vllm.ai/nightly/vllm and found this wheel, https://wheels.vllm.ai/vllm-0.8.5.dev599%2Bg9fbf2bfbd-cp38-abi3-manylinux1_x86_64.whl, which indeed looks like a nightly, but it returns a 404 :(
$ pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://wheels.vllm.ai/nightly
Collecting vllm
Downloading vllm-0.8.5.post1-cp38-abi3-manylinux1_x86_64.whl (326.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 326.4/326.4 MB 10.2 MB/s eta 0:00:00
...
Collecting torch==2.6.0
Downloading torch-2.6.0-cp310-cp310-manylinux1_x86_64.whl (766.7 MB)
━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 221.3/766.7 MB 252.7 MB/s eta 0:00:03
ERROR: Operation cancelled by user
# second try:
$ pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly/vllm
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://wheels.vllm.ai/nightly/vllm
Collecting vllm
Using cached vllm-0.8.5.post1-cp38-abi3-manylinux1_x86_64.whl (326.4 MB)
Basically, I can't find direct URLs to the nightly wheels :( which might be needed to work around pip not wanting to install the nightly for some reason.
So far I managed to find a published commit via:
pip index versions vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
# and then
pip install -U vllm==0.8.5.dev600+g7ea6cb28b --pre --extra-index-url https://wheels.vllm.ai/nightly
Hi, why upgrade to PyTorch 2.7?
I found that after this commit, the TPOT/ITL of the Qwen/Qwen2.5-14B-Instruct model on an H20 dropped from 20 ms to 10 ms. I want to know which part of the code accounts for this improvement. Is it PyTorch 2.7?