2DPASS

Multi-GPU training issue

Open · LiXiang0021 opened this issue 1 year ago • 1 comment

Have you ever run into this issue?

Training works well when I use one or two GPUs with batch_size = 1 or 2. However, the process gets killed when I use three or four GPUs with batch_size = 3 or 4. Each GPU has around 12 GB of memory.

I don't know whether I forgot to set any parameters.

Could anyone help me out if you have run into this before?

Thanks!

LiXiang0021 avatar Mar 26 '23 07:03 LiXiang0021

Maybe you can share which GPU you used and give the details of the error message.

fengjiang5 avatar May 19 '23 05:05 fengjiang5