vllm icon indicating copy to clipboard operation
vllm copied to clipboard

unable to run vllm model deployment

Open riyajatar37003 opened this issue 1 year ago • 18 comments

Your current environment

Failed to import from vllm._C with ImportError("/usr/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_C.abi3.so)")

INFO 07-16 09:29:50 custom_cache_manager.py:17] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager (VllmWorkerProcess pid=658) INFO 07-16 09:29:52 multiproc_worker_utils.py:215] Worker ready; awaiting tasks (VllmWorkerProcess pid=656) INFO 07-16 09:29:52 multiproc_worker_utils.py:215] Worker ready; awaiting tasks (VllmWorkerProcess pid=657) INFO 07-16 09:29:53 multiproc_worker_utils.py:215] Worker ready; awaiting tasks INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2 (VllmWorkerProcess pid=656) INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2 (VllmWorkerProcess pid=658) INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2 INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5 (VllmWorkerProcess pid=657) INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2 (VllmWorkerProcess pid=656) INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5 (VllmWorkerProcess pid=658) INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5 (VllmWorkerProcess pid=657) INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5 (VllmWorkerProcess pid=658) INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB (VllmWorkerProcess pid=656) INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB (VllmWorkerProcess pid=657) INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm' ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm' ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm. ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ . (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm. ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ . (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm. ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm. (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ . ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ . (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks: '_OpNamespace' '_C' object has no attribute 'rms_norm', Traceback (most recent call last): (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks: '_OpNamespace' '_C' object has no attribute 'rms_norm', Traceback (most recent call last): ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] output = executor(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks: '_OpNamespace' '_C' object has no attribute 'rms_norm', Traceback (most recent call last): (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] output = executor(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] output = executor(*args, **kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.model_runner.profile_run() ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.model_runner.profile_run() (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.model_runner.profile_run() (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.execute_model(model_input, kv_caches, intermediate_tensors) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.execute_model(model_input, kv_caches, intermediate_tensors) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.execute_model(model_input, kv_caches, intermediate_tensors) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_or_intermediate_states = model_executable( (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_or_intermediate_states = model_executable( ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_or_intermediate_states = model_executable( (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.model(input_ids, positions, kv_caches, ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.model(input_ids, positions, kv_caches, ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.model(input_ids, positions, kv_caches, ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states, residual = layer(positions, hidden_states, (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states, residual = layer(positions, hidden_states, (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states, residual = layer(positions, hidden_states, ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.input_layernorm(hidden_states) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.input_layernorm(hidden_states) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.input_layernorm(hidden_states) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._forward_method(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._forward_method(*args, **kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._forward_method(*args, **kwargs) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ops.rms_norm( (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ops.rms_norm( (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ops.rms_norm( ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise e (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise e ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise e (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return fn(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return fn(*args, **kwargs) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return fn(*args, **kwargs) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] torch.ops._C.rms_norm(out, input, weight, epsilon) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] torch.ops._C.rms_norm(out, input, weight, epsilon) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] torch.ops._C.rms_norm(out, input, weight, epsilon) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise AttributeError( ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise AttributeError( (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise AttributeError( ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm' (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] [rank0]: Traceback (most recent call last): [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/runpy.py", line 196, in _run_module_as_main [rank0]: return _run_code(code, main_globals, None, [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/runpy.py", line 86, in _run_code [rank0]: exec(code, run_globals) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 282, in [rank0]: run_server(args) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 224, in run_server [rank0]: if llm_engine is not None else AsyncLLMEngine.from_engine_args( [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 444, in from_engine_args [rank0]: engine = cls( [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 373, in init [rank0]: self.engine = self._init_engine(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 520, in _init_engine [rank0]: return engine_class(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 263, in init [rank0]: self._initialize_kv_caches() [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 362, in _initialize_kv_caches [rank0]: self.model_executor.determine_num_available_blocks()) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/distributed_gpu_executor.py", line 38, in determine_num_available_blocks [rank0]: num_blocks = self._run_workers("determine_num_available_blocks", ) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_gpu_executor.py", line 135, in _run_workers [rank0]: driver_worker_output = driver_worker_method(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context [rank0]: return func(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks [rank0]: self.model_runner.profile_run() [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context [rank0]: return func(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run [rank0]: self.execute_model(model_input, kv_caches, intermediate_tensors) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context [rank0]: return func(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model [rank0]: hidden_or_intermediate_states = model_executable( [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl [rank0]: return self._call_impl(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl [rank0]: return forward_call(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward [rank0]: hidden_states = self.model(input_ids, positions, kv_caches, [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl [rank0]: return self._call_impl(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl [rank0]: return forward_call(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward [rank0]: hidden_states, residual = layer(positions, hidden_states, [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl [rank0]: return self._call_impl(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl [rank0]: return forward_call(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward [rank0]: hidden_states = self.input_layernorm(hidden_states) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl [rank0]: return self._call_impl(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl [rank0]: return forward_call(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward [rank0]: return self._forward_method(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda [rank0]: ops.rms_norm( [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper [rank0]: raise e [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper [rank0]: return fn(*args, **kwargs) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm [rank0]: torch.ops._C.rms_norm(out, input, weight, epsilon) [rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr [rank0]: raise AttributeError( [rank0]: AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm' ERROR 07-16 09:31:46 multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 658 died, exit code: -15 INFO 07-16 09:31:46 multiproc_worker_utils.py:123] Killing local vLLM worker processes /tmp/.conda/envs/vllm_env/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown warnings.warn('resource_tracker: There appear to be %d '

🐛 Describe the bug

tried to install using pip install vllm

riyajatar37003 avatar Jul 16 '24 07:07 riyajatar37003

The same error when running any model. Install VLLM via pip directly.

yumaofan avatar Jul 16 '24 08:07 yumaofan

did the same only

riyajatar37003 avatar Jul 16 '24 08:07 riyajatar37003

+1, pip install vllm==0.5.0 solves the issue, not sure about other versions.

wheresmyhair avatar Jul 16 '24 09:07 wheresmyhair

i am trying graphrag with vllm deployed model but i am getting this error

ERROR 07-16 12:08:18 api_server.py:247] Error in applying chat template from request: Conversation roles must alternate user/assistant/user/assistant/...

riyajatar37003 avatar Jul 16 '24 10:07 riyajatar37003

Same issue. And vllm 0.5.1 works well.

JaheimLee avatar Jul 16 '24 12:07 JaheimLee

Same issue.

WMeng1 avatar Jul 18 '24 03:07 WMeng1

look at https://github.com/vllm-project/vllm/issues/6462#issuecomment-2234006925 it resolves my iisue

vlsav avatar Jul 18 '24 05:07 vlsav

+1, pip install vllm==0.5.0 solves the issue, not sure about other versions.

works well!!

rzes avatar Jul 18 '24 10:07 rzes

+1, pip install vllm==0.5.0 solves the issue, not sure about other versions.

encounter the same issue and can confirm this works for me, too

zichaow avatar Jul 19 '24 20:07 zichaow

hello , any one find any solution about this problem?

AlexBlack2202 avatar Jul 22 '24 04:07 AlexBlack2202

hello , any one find any solution about this problem?

https://github.com/vllm-project/vllm/issues/6464#issuecomment-2235595670

vlsav avatar Jul 22 '24 05:07 vlsav

Delete the directory named "vllm" resolves my issue. I find the method from this comment https://github.com/vllm-project/vllm/issues/1814#issuecomment-1837122930

heya5 avatar Jul 22 '24 07:07 heya5

0.5.4 same error

lonngxiang avatar Aug 12 '24 02:08 lonngxiang

why the source build have so many problem, i meet the same error.. Have it fix

DreamerZhang11 avatar Sep 05 '24 11:09 DreamerZhang11

how to change the version of vllm when we are installing it from source build

yashwanth125 avatar Oct 16 '24 03:10 yashwanth125

#!pip install vllm==0.5.4 #!pip install git+https://github.com/huggingface/transformers #!pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu121 #!pip uninstall pynvml -y #!pip install nvidia-ml-py

Works for me. You may have to re-compile flash_attn after all.

Steve

thusinh1969 avatar Nov 16 '24 16:11 thusinh1969

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions[bot] avatar Feb 15 '25 01:02 github-actions[bot]

This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!

github-actions[bot] avatar Mar 17 '25 02:03 github-actions[bot]