[Llama3] Error when multiple GPUs are used
The following error appears when running the LLM reference implementation with multiple GPUs:
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] Exception in worker VllmWorkerProcess while processing method init_device: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method, Traceback (most recent call last):
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] File "/home/zhihanj/.local/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 224, in _run_worker_process
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] output = executor(*args, **kwargs)
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] File "/home/zhihanj/.local/lib/python3.10/site-packages/vllm/worker/worker.py", line 166, in init_device
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] torch.cuda.set_device(self.device)
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] File "/home/zhihanj/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 420, in set_device
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] torch._C._cuda_setDevice(device)
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] File "/home/zhihanj/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 300, in _lazy_init
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] raise RuntimeError(
(VllmWorkerProcess pid=1795) ERROR 12-03 18:49:03 multiproc_worker_utils.py:231] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
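The traceback itself points at the cause: CUDA is already initialized in the parent process before vLLM forks its tensor-parallel worker processes, so the workers must be started with the `spawn` method instead of `fork`. Below is a minimal sketch of a workaround, assuming the installed vLLM version honours the `VLLM_WORKER_MULTIPROC_METHOD` environment variable; the model name and `tensor_parallel_size` are placeholders, not values prescribed by the reference implementation.

```python
import os

# Assumption: this vLLM build reads VLLM_WORKER_MULTIPROC_METHOD.
# It must be set before vllm is imported / CUDA is touched in this process.
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"

import multiprocessing as mp

if __name__ == "__main__":
    # Belt-and-braces: also ask Python's multiprocessing for the 'spawn'
    # start method, as the RuntimeError above suggests.
    mp.set_start_method("spawn", force=True)

    from vllm import LLM

    # tensor_parallel_size > 1 is what spawns the worker processes that
    # trigger the error; the model name here is only a placeholder.
    llm = LLM(model="meta-llama/Llama-3.1-70B-Instruct",
              tensor_parallel_size=2)
    print(llm.generate(["Hello"]))
```

Exporting `VLLM_WORKER_MULTIPROC_METHOD=spawn` in the shell before launching the benchmark should have the same effect, since the key point is that the variable is set before any CUDA initialization happens in the parent process.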