
Do not initialize process group when using a single GPU

Open WoosukKwon opened this issue 1 year ago • 2 comments

Currently we call torch.distributed.init_process_group even for a single GPU. This is redundant and causes errors when the LLM object is created multiple times.
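A minimal sketch of the fix being requested: initialize the process group only for multi-GPU runs, and guard against double initialization. The helper name `init_distributed_if_needed` is hypothetical (not vLLM's actual code), and the snippet stubs out the process-group state so the idempotent-initialization logic can be shown without PyTorch or a GPU; in real code the guard would be `torch.distributed.is_initialized()`.

```python
# Simulated process-group state standing in for torch.distributed
# (hypothetical stand-in; real code would use torch.distributed directly).
_initialized = False

def init_process_group() -> None:
    """Mimics torch.distributed.init_process_group's double-init error."""
    global _initialized
    if _initialized:
        raise RuntimeError(
            "trying to initialize the default process group twice!"
        )
    _initialized = True

def is_initialized() -> bool:
    """Mimics torch.distributed.is_initialized()."""
    return _initialized

def init_distributed_if_needed(world_size: int) -> bool:
    """Set up the process group only when it is actually needed.

    Returns True if distributed mode is active, False for single-GPU runs.
    """
    if world_size <= 1:
        return False  # single GPU: skip distributed setup entirely
    if not is_initialized():
        init_process_group()  # safe: guarded, so a second call is a no-op
    return True
```

With this guard, creating a second single-GPU `LLM` never touches the process group, and repeated multi-GPU initialization is a no-op instead of a `RuntimeError`.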

WoosukKwon avatar May 22 '23 01:05 WoosukKwon

The same error occurs when creating two LLMs:

```python
LLM(model="facebook/opt-125m")
LLM(model="facebook/opt-125m")  # RuntimeError: trying to initialize the default process group twice!
```

WoosukKwon avatar Jun 06 '23 00:06 WoosukKwon

Just commenting to upvote this. I'm encountering the same issue when I run two experiments in a row within the same program (even if the old LLM object is out of scope by the time the second one is initialized).

neubig avatar Jun 28 '23 14:06 neubig

I get this problem after a single call to LLM, and even if only one GPU is visible.

gburachas avatar Jul 12 '23 02:07 gburachas

Closing because a single worker will now only use Ray if the user specifies `--worker-use-ray`.

hmellor avatar Mar 08 '24 10:03 hmellor