What is the difference from official PyTorch DDP hooks?

Open · wangkuiyi opened this issue 1 year ago · 1 comment

Overlapping the backward pass with the optimization step is a classic idea, and PyTorch supports this overlap in DDP and FSDP. For example, here are the hooks in DDP: https://github.com/pytorch/pytorch/tree/main/torch/distributed/algorithms/ddp_comm_hooks

How does this project (https://arxiv.org/pdf/2306.09782.pdf) differ from these DDP hooks? Thanks.

wangkuiyi · Jun 20 '23 22:06

Thanks for the information; we will investigate it. In-place updating is a classical engineering trick, and our intention is to provide a solution for low-resource training. Our paper also discusses why SGD might be a good choice for LLM fine-tuning and how to stabilize the training process, along with other analyses.

QipengGuo · Jun 21 '23 01:06