Yifu Wang
@pytorchbot merge
@ad8e just a heads-up: there was a regression in fused optimizers in the past few days. https://github.com/pytorch/pytorch/pull/123566 should fix it. > we only get 67% occupancy per fused adam...
@pytorchbot merge
@pytorchbot merge
@pytorchmergebot
@pytorchbot merge
Thanks for reporting @hbikki. You mentioned in the aio-libs issue that "when reading/writing to S3 with process count > 5 for versions 2.4.2". Curious if you had success with other...
> Is there a way to gather all the weights on CPU and then dump or extract embedding tables, sharded layer weights, gather those and then dump the tables on...