
Failed to set multiple gpus

Open Halflifefa opened this issue 2 years ago • 1 comments

When running a model in parallel across multiple GPUs, if I set "--num-gpus 4", only three GPUs are actually activated. How can I solve this issue?

Halflifefa avatar May 23 '23 01:05 Halflifefa

@Halflifefa set --max-gpu-memory and give it the maximum memory you can allocate per GPU. Also, try another argument, --gpus 0,1,2,3.

I think the latter should work better.
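
For reference, a minimal sketch of how the flags mentioned above could be combined into one launch command. The helper name build_cli_args, the model path, and the 20GiB cap are placeholders chosen for illustration, not values from this thread; the sketch only assembles and prints the command, it does not start FastChat.

```python
def build_cli_args(model_path, gpu_ids, max_gpu_memory):
    """Assemble a FastChat CLI invocation (hypothetical helper, for illustration only)."""
    gpus = ",".join(str(i) for i in gpu_ids)
    return [
        "python3", "-m", "fastchat.serve.cli",
        "--model-path", model_path,
        "--num-gpus", str(len(gpu_ids)),     # number of GPUs to shard across
        "--gpus", gpus,                      # explicit device IDs
        "--max-gpu-memory", max_gpu_memory,  # per-GPU memory cap
    ]


# Placeholder model path and memory cap; substitute your own values.
args = build_cli_args("lmsys/vicuna-7b-v1.5", [0, 1, 2, 3], "20GiB")
print(" ".join(args))
```

Keeping --num-gpus equal to the number of IDs passed to --gpus avoids a mismatch between the two flags.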

iRanadheer avatar May 31 '23 19:05 iRanadheer

@Halflifefa this has to do with the model you are using. The model "spills" from one GPU to the next once that GPU's memory is full, so a model that fits in fewer GPUs than you requested leaves the rest idle. A LLaMa2-70B, for example, spills across seven 24 GB GPUs. This is not a bug, so I will close this one. If you feel that it's not solved for you, you are always welcome to reopen!
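
To make the "spilling" concrete, here is a rough sketch of a greedy fill under simplified assumptions: the weights are treated as one divisible blob, and each GPU has a fixed budget. The function name sequential_fill and the numbers (a 60 GiB model, 20 GiB or 15 GiB per GPU) are made up for illustration. Real loaders split at layer boundaries and reserve room for activations, so actual usage will differ.

```python
def sequential_fill(model_gib, per_gpu_budget_gib, num_gpus):
    """Greedy fill: load GPU 0 up to its budget, then GPU 1, and so on.

    Returns (GiB placed on each GPU, GiB that did not fit anywhere).
    """
    remaining = model_gib
    placed = []
    for _ in range(num_gpus):
        take = min(remaining, per_gpu_budget_gib)
        placed.append(take)
        remaining -= take
    return placed, remaining


# Hypothetical 60 GiB model on 4 GPUs with a 20 GiB budget each:
# the first three GPUs fill up and the fourth stays empty.
print(sequential_fill(60, 20, 4))

# Lowering the per-GPU cap to 15 GiB forces the model onto all four.
print(sequential_fill(60, 15, 4))
```

Under these assumptions, a lower per-GPU cap is what pushes weights onto the last GPU, which is consistent with the earlier suggestion to set a maximum GPU memory.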

surak avatar Oct 23 '23 08:10 surak