vllm
Do not initialize process group when using a single GPU
Currently we call torch.distributed.init_process_group even when only a single GPU is used. This is redundant, and it raises an error when the LLM object is created more than once in the same process.
For example, creating two LLMs in the same process fails:

from vllm import LLM

LLM(model="facebook/opt-125m")
LLM(model="facebook/opt-125m")  # RuntimeError: trying to initialize the default process group twice!
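This is not the actual vLLM patch, just a minimal sketch of the guard the issue is asking for: skip torch.distributed.init_process_group entirely in the single-GPU case, and make it idempotent otherwise. The function name, init_method, and default rank are placeholders, not vLLM internals.

import torch
import torch.distributed as dist

def init_distributed_if_needed(world_size: int, rank: int = 0) -> None:
    # Single-GPU case: nothing to synchronize, so skip torch.distributed
    # entirely; repeated engine construction in one process stays safe.
    if world_size == 1:
        return
    # Guard against "trying to initialize the default process group twice"
    # when more than one engine is constructed in the same process.
    if dist.is_initialized():
        return
    dist.init_process_group(
        backend="nccl" if torch.cuda.is_available() else "gloo",
        init_method="tcp://localhost:29500",  # placeholder rendezvous address
        world_size=world_size,
        rank=rank,
    )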
Just commenting to upvote this. I'm encountering the same issue when I run two experiments in a row within the same program (even if the old LLM object is out of scope by the time the second one is initialized).
I hit this problem even with a single call to LLM, and even when only one GPU is visible.
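Until a fix lands, one workaround for running several experiments in a row is to tear down the default process group between runs yourself. The snippet below is only an illustrative sketch (run_experiment, the model name, and the prompt handling are assumptions, not part of vLLM); it relies on the standard torch.distributed.destroy_process_group call.

import gc
import torch
import torch.distributed as dist
from vllm import LLM

def run_experiment(prompt: str) -> str:
    llm = LLM(model="facebook/opt-125m")
    output = llm.generate([prompt])[0].outputs[0].text
    # Drop our reference and tear down the default process group so that
    # the next LLM(...) in this process can initialize it again cleanly.
    del llm
    gc.collect()
    torch.cuda.empty_cache()
    if dist.is_initialized():
        dist.destroy_process_group()
    return output

print(run_experiment("Hello, my name is"))
print(run_experiment("The capital of France is"))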
Closing: a single worker will now only use Ray if the user specifies --worker-use-ray.