Qingyang Wu issues

Results: 2 issues by Qingyang Wu

Repetitive processing in reddit extractor

https://github.com/microsoft/DialoGPT/blob/b85558dea5391f83b20120d6c93b9f79fcc72311/reddit_extractor/src/reddit.py#L108-L112

Cannot resume FSDP optimizer state

This line does not save the optimizer state correctly when using FSDP:

https://github.com/huggingface/transformers/blob/88399476c3892435395618ed37993176dbb0de73/src/transformers/trainer.py#L2383

It should use FSDP's `full_optim_state_dict` to collect the optimizer states from the different processes:

```python
FSDP.full_optim_state_dict(self.model, self.optimizer)
```
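
For illustration, here is a minimal sketch of how the save path could branch on FSDP. It is not the actual `Trainer` code. The helper names `collect_optimizer_state` and `save_optimizer_state` and the `fsdp_cls` and `save_fn` parameters are hypothetical; in real use `fsdp_cls` would presumably be `torch.distributed.fsdp.FullyShardedDataParallel` and `save_fn` would be `torch.save`. They are passed in so the sketch runs without a distributed setup.

```python
def collect_optimizer_state(model, optimizer, fsdp_cls=None):
    """Return an optimizer state dict suitable for checkpointing.

    Hypothetical helper. `fsdp_cls` is assumed to be
    torch.distributed.fsdp.FullyShardedDataParallel in real use.
    """
    if fsdp_cls is not None and isinstance(model, fsdp_cls):
        # Collective call: every rank must reach this line, otherwise
        # the gather across processes will hang.
        return fsdp_cls.full_optim_state_dict(model, optimizer)
    # Non-FSDP models: the local state dict is already complete.
    return optimizer.state_dict()


def save_optimizer_state(state, path, rank, save_fn):
    """Write the gathered state from rank 0 only; return True if written.

    Hypothetical helper. `save_fn` is assumed to be torch.save in real use.
    """
    if rank == 0:
        save_fn(state, path)
        return True
    return False
```

In a training loop this might look like `state = collect_optimizer_state(self.model, self.optimizer, fsdp_cls=FSDP)` on all ranks, followed by `save_optimizer_state(state, path, rank, torch.save)`. With the default `rank0_only=True`, only rank 0 receives the populated dict, which is why the write is guarded by rank. When resuming, the full state dict would need to be re-sharded before loading, for example with `FSDP.shard_full_optim_state_dict`.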