Publish wheels with pre-built CUDA binaries
Currently, pip installing our package takes 5-10 minutes because our CUDA kernels are compiled on the user's machine. For a better UX, we should include pre-built CUDA binaries in our PyPI distribution, just like PyTorch and xformers do.
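For reference, here is a minimal sketch of how such a wheel could be produced with `torch.utils.cpp_extension`; the extension name, source file, and architecture list are illustrative assumptions, not vLLM's actual build configuration:

```python
# Hypothetical setup.py sketch: compile CUDA kernels once, on a build
# machine, and ship the resulting binaries inside the wheel (as PyTorch
# and xformers do). Names below are placeholders, not vLLM's real layout.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name="vllm",
    ext_modules=[
        CUDAExtension(
            name="vllm._C",                 # hypothetical extension module
            sources=["csrc/attention.cu"],  # illustrative source file
            extra_compile_args={
                "nvcc": [
                    # Target several architectures so one wheel runs on
                    # many GPUs instead of compiling on the user's machine.
                    "-gencode=arch=compute_70,code=sm_70",
                    "-gencode=arch=compute_80,code=sm_80",
                ],
            },
        ),
    ],
    cmdclass={"build_ext": BuildExtension},
)
```

Running `python setup.py bdist_wheel` on the build machine then yields a wheel with the compiled `.so` bundled, so `pip install` on user machines never invokes `nvcc`.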
Yes, agreed! I ran into the compatibility issue below when trying to install vllm: `RuntimeError: GPUs with compute capability less than 7.0 are not supported.` Full output:
```
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [15 lines of output]
    Traceback (most recent call last):
      File "/home/pai/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
        main()
      File "/home/pai/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
        json_out['return_val'] = hook(**hook_input['kwargs'])
      File "/home/pai/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
        return hook(config_settings)
      File "/tmp/pip-build-env-fa8_sj_e/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 341, in get_requires_for_build_wheel
        return self._get_build_requires(config_settings, requirements=['wheel'])
      File "/tmp/pip-build-env-fa8_sj_e/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 323, in _get_build_requires
        self.run_setup()
      File "/tmp/pip-build-env-fa8_sj_e/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 338, in run_setup
        exec(code, locals())
      File "<string>", line 48, in <module>
    RuntimeError: GPUs with compute capability less than 7.0 are not supported.
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
```
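For context, the error is raised by a guard in setup.py (the `File "<string>", line 48` frame above) that inspects the local GPU before compiling. A rough reconstruction of that kind of check (an assumption, not the exact source):

```python
# Assumed reconstruction of a build-time guard that rejects GPUs older
# than compute capability 7.0 (Volta); vLLM's real check may differ.
import torch

for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    if major < 7:
        raise RuntimeError(
            "GPUs with compute capability less than 7.0 are not supported.")
```

Because this runs inside setup.py, it fires during `pip install` on any machine whose GPU is too old to build for; a pre-built wheel skips the build step entirely (though the kernels still require a supported GPU at runtime).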
This would be super helpful. We are in a protected environment (thanks, IT!) where we can only install CUDA via conda. Conda's CUDA packages do not ship cuda.h because of NVIDIA licensing terms, so the vLLM installation fails. Having a pre-built wheel would let everyone who installs CUDA via conda use the library (e.g., when keeping two different CUDA versions in two virtual environments).
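To illustrate the failure mode: a source build needs the CUDA toolkit headers, and a check along these lines (an assumption about what the build effectively does, not vLLM's literal code) fails when only conda's runtime packages are installed:

```python
# Sketch (assumed logic): conda's CUDA runtime packages omit the toolkit
# headers, so a source build that looks for cuda.h under CUDA_HOME fails.
import os
from torch.utils.cpp_extension import CUDA_HOME

if CUDA_HOME is None or not os.path.exists(
        os.path.join(CUDA_HOME, "include", "cuda.h")):
    raise RuntimeError(
        "Cannot compile CUDA extensions: cuda.h not found. A full CUDA "
        "toolkit is required, not just the conda runtime libraries.")
```

A pre-built wheel removes this requirement, since no compilation happens at install time.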
@andreapiso As of the latest release, vLLM is published with pre-built CUDA binaries. Please try `pip install vllm` and let us know if it does not work for you.
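A quick smoke test after installing the wheel (the model name here is just an illustrative small model, downloaded on first run):

```python
# Verify the pre-built CUDA kernels load and run end to end.
from vllm import LLM

llm = LLM(model="facebook/opt-125m")  # illustrative small model
outputs = llm.generate(["Hello, my name is"])
print(outputs[0].outputs[0].text)
```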