Build fails on ARM aarch64 server (host OS: Ubuntu 22.04.3)

Open zhudy opened this issue 1 year ago • 20 comments

I followed the installation guide at https://docs.vllm.ai/en/latest/getting_started/installation.html:

  1. docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:23.10-py3
  2. git clone https://github.com/vllm-project/vllm.git
  3. cd vllm
  4. pip install -e .

Here are the details from inside the Docker container:

root@f8c2e06fbf8b:/mnt/vllm# pip install -e .
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Obtaining file:///mnt/vllm
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... error
  error: subprocess-exited-with-error

  × Getting requirements to build editable did not run successfully.
  │ exit code: 1
  ╰─> [22 lines of output]
      /tmp/pip-build-env-4xoxai9j/overlay/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
      No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
      <string>:142: UserWarning: Unsupported CUDA/ROCM architectures ({'6.1', '7.2', '8.7', '5.2', '6.0'}) are excluded from the TORCH_CUDA_ARCH_LIST env variable (5.2 6.0 6.1 7.0 7.2 7.5 8.0 8.6 8.7 9.0+PTX). Supported CUDA/ROCM architectures are: {'7.5', '8.0', '9.0', '7.0', '8.6+PTX', '9.0+PTX', '8.6', '8.0+PTX', '8.9+PTX', '8.9', '7.0+PTX', '7.5+PTX'}.
      Traceback (most recent call last):
        File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 132, in get_requires_for_build_editable
          return hook(config_settings)
        File "/tmp/pip-build-env-4xoxai9j/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 441, in get_requires_for_build_editable
          return self.get_requires_for_build_wheel(config_settings)
        File "/tmp/pip-build-env-4xoxai9j/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-4xoxai9j/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-4xoxai9j/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 297, in <module>
        File "<string>", line 267, in get_vllm_version
      NameError: name 'nvcc_cuda_version' is not defined. Did you mean: 'cuda_version'?
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: python -m pip install --upgrade pip

zhudy avatar Dec 11 '23 14:12 zhudy

nvcc itself runs fine:

root@f8c2e06fbf8b:/mnt/vllm# nvcc -v
nvcc fatal : No input files specified; use option --help for more information
root@f8c2e06fbf8b:/mnt/vllm# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:10:07_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

zhudy avatar Dec 11 '23 14:12 zhudy

CUDA is installed and visible to PyTorch:

root@f8c2e06fbf8b:/mnt/vllm# echo $CUDA_HOME
/usr/local/cuda

root@f8c2e06fbf8b:/mnt/vllm# type nvcc
nvcc is /usr/local/cuda/bin/nvcc

github.com/vllm# python3 -c "import torch; print(torch.cuda.is_available()); print(torch.__version__);"
True
2.1.0a0+32f93b1
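The manual checks above can be gathered into one script. This is a minimal sketch, not part of vLLM: the function name collect_cuda_diagnostics and the dictionary keys are made up for illustration. One thing to keep in mind when reading its output: the traceback shows torch being imported from pip's temporary build environment (/tmp/pip-build-env-...), so the torch seen during the build may differ from the one checked here. That is an inference from the log, not something confirmed in this thread.

```python
import os
import shutil
import subprocess


def collect_cuda_diagnostics() -> dict:
    """Gather the facts checked by hand above; anything missing stays None.

    Hypothetical helper for illustration only, not a vLLM function.
    """
    info = {
        "CUDA_HOME": os.environ.get("CUDA_HOME"),
        "nvcc_path": shutil.which("nvcc"),
        "nvcc_version_output": None,
        "torch_version": None,
        "torch_cuda_build": None,
        "torch_cuda_available": None,
    }

    # Run `nvcc -V` only if nvcc is actually on PATH.
    if info["nvcc_path"]:
        try:
            result = subprocess.run(
                [info["nvcc_path"], "-V"],
                capture_output=True,
                text=True,
                check=False,
            )
            info["nvcc_version_output"] = result.stdout.strip() or None
        except OSError:
            pass

    # torch may be absent in the interpreter running this script.
    try:
        import torch
    except ImportError:
        return info

    info["torch_version"] = torch.__version__
    # torch.version.cuda is None for builds without CUDA support.
    info["torch_cuda_build"] = torch.version.cuda
    info["torch_cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in collect_cuda_diagnostics().items():
        print(f"{key}: {value}")
```

Running it with the same interpreter that performs the build makes it easier to see whether the build sees a CUDA-enabled torch at all.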

zhudy avatar Dec 11 '23 14:12 zhudy

Add the following line to setup.py at line 268:

nvcc_cuda_version = get_nvcc_cuda_version(CUDA_HOME)
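The thread does not show the body of get_nvcc_cuda_version, so the following is only a hypothetical sketch of how a helper of that kind could derive the CUDA release from the output of nvcc -V. The names parse_nvcc_release and query_nvcc_release are invented for illustration and are not vLLM functions; the sample string is the nvcc output posted earlier in this issue.

```python
import os
import re
import subprocess
from typing import Optional, Tuple


def parse_nvcc_release(nvcc_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'release X.Y' token of `nvcc -V` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def query_nvcc_release(cuda_home: str) -> Optional[Tuple[int, int]]:
    """Run <cuda_home>/bin/nvcc -V and parse it; return None if nvcc cannot run."""
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    try:
        output = subprocess.check_output([nvcc, "-V"], text=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(output)


# Line taken from the nvcc -V output posted earlier in this issue.
SAMPLE = "Cuda compilation tools, release 12.2, V12.2.140"
print(parse_nvcc_release(SAMPLE))  # (12, 2)
```

Returning None instead of raising keeps the caller in control of what happens when nvcc is missing, which is the situation a build script has to handle explicitly rather than leaving a variable undefined.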

yexing avatar Dec 13 '23 09:12 yexing

@yexing @zhudy
Excuse me, I'm facing the same problem. I cloned vllm into my project and added nvcc_cuda_version = get_nvcc_cuda_version(CUDA_HOME) to setup.py at line 268, but I still get the same error. Did I miss something?

cyc00518 avatar Feb 22 '24 05:02 cyc00518

I have the same problem and would appreciate any help.

Setup: aarch64 GH200
OS: Ubuntu 22.04.3 LTS (Jammy Jellyfish)
nvcc:
  nvcc: NVIDIA (R) Cuda compiler driver
  Copyright (c) 2005-2023 NVIDIA Corporation
  Built on Wed_Nov_22_11:03:34_PST_2023
  Cuda compilation tools, release 12.3, V12.3.107
  Build cuda_12.3.r12.3/compiler.33567101_0
CUDA home: /usr/local/cuda
Torch: 2.2.0a0+81ea7a4

I am running inside the NVIDIA pytorch_23.12 container.

Wetzr avatar Mar 04 '24 08:03 Wetzr

Got it working with the changes in this branch: https://github.com/haileyschoelkopf/vllm/tree/aarm64-dockerfile

Built Docker images are here:
https://hub.docker.com/r/haileysch/vllm-aarch64-base
https://hub.docker.com/r/haileysch/vllm-aarch64-openai

Hopefully this will be helpful to others!

haileyschoelkopf avatar Mar 06 '24 13:03 haileyschoelkopf

Hi guys, have you solved the issue?

tuanhe avatar Mar 29 '24 02:03 tuanhe