stanford_alpaca Is anyone using a single A100 80GB for training?


Open Ahtesham00 opened this issue 1 year ago • 2 comments

Apr 12 '23 20:04 Ahtesham00

I also tried to fine-tune this model on a single A100 GPU, but it failed with the error: "ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 0 (pid: 564788) of binary". I guess we may need to change the code from distributed data parallel mode to single-GPU mode.
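
For anyone reading the `exitcode: -9` part of that message: Python process launchers generally report a negative exit code when the child was terminated by a signal, with the absolute value being the signal number. The sketch below decodes such a code using only the standard library. `describe_exit_code` is a hypothetical helper written for illustration, not part of torch, and it assumes the launcher follows the usual negative-means-signal convention.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Hypothetical helper: turn a process exit code into a readable string.

    Assumes the convention that a negative code -N means the process
    was terminated by signal N.
    """
    if code < 0:
        try:
            # Look up the symbolic name of the signal, e.g. 9 -> SIGKILL on POSIX.
            return f"killed by signal {signal.Signals(-code).name}"
        except ValueError:
            # The signal number is not defined on this platform.
            return f"killed by unknown signal {-code}"
    if code == 0:
        return "exited normally"
    return f"exited with status {code}"


print(describe_exit_code(-9))
```

On Linux, signal 9 is SIGKILL, which is often what the kernel's out-of-memory killer sends when the host runs out of RAM, so it may be worth checking system memory as well as GPU memory.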

Apr 21 '23 06:04 Somezak1

Not enough GPU RAM.
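
A rough back-of-the-envelope check of that claim, as a sketch only. The thread does not state the model size, so the 7-billion-parameter figure is an assumption, as are full fine-tuning with fp32 weights, fp32 gradients, and an Adam-style optimizer that keeps two fp32 states per parameter. Activations and framework overhead are ignored, so real usage would be higher. `estimate_training_memory_gb` is a hypothetical helper.

```python
def estimate_training_memory_gb(
    num_params: int,
    bytes_weights: int = 4,    # assumption: fp32 weights
    bytes_grads: int = 4,      # assumption: fp32 gradients
    bytes_optimizer: int = 8,  # assumption: two fp32 Adam moments per parameter
) -> float:
    """Hypothetical helper: lower bound on training memory in GB (10^9 bytes).

    Counts only weights, gradients, and optimizer state; activations
    and framework overhead are not included.
    """
    bytes_per_param = bytes_weights + bytes_grads + bytes_optimizer
    return num_params * bytes_per_param / 1e9


# Assumed model size: 7 billion parameters (not stated in the thread).
needed_gb = estimate_training_memory_gb(7_000_000_000)
print(f"~{needed_gb:.0f} GB needed vs. 80 GB available")
```

Under these assumptions the model state alone exceeds 80 GB, which is consistent with the reply above; lower-precision weights, sharding, offloading, or parameter-efficient fine-tuning would change the numbers.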

Apr 25 '23 10:04 zhsu-private
