GaLore
How much GPU memory (in GB) is required to train a 7B model in DDP mode with GaLore?
In single-GPU mode I can run training successfully on an RTX 3090, but it takes too long. In DDP mode we get an OOM when wrapping the model:

```python
LlamaForCausalLM = torch.nn.parallel.DistributedDataParallel(
    model,
    device_ids=[local_rank],
    output_device=local_rank,
    broadcast_buffers=False,
)
```