e5-mistral-7b-instruct

OOM with 2 GPUs (48GB in total)

Open yurinoviello opened this issue 1 year ago • 0 comments

Hi, I do not understand why the run fails when I use 2 GPUs, while it works with a single one.

I tried the following setups:

  1. Default configuration (I only changed gradient_accumulation_steps to 1): it works.
  2. Default configuration (gradient_accumulation_steps=1 and num_processes=2): it fails with torch.cuda.OutOfMemoryError.

I am using Docker. Hardware: 2x L4 (24 GB each).

yurinoviello, Mar 06 '24 16:03