Chien-Chin Huang
During this step, only rank0 does some reduction work on the plans, but I'm surprised that it is slow enough to cause the NCCL timeout. Verifying with a large...
@pytorchbot merge
@pytorchbot merge -f "The failing tests are not related"
@pytorchbot merge -f "The failing tests are not related."
Will land this PR to unblock the `NamedOptimizer` use case. It also includes support for `NamedOptimizer` + `use_orig_params=False`. We can remove the `use_orig_params=False` support later...
The failing tests are not related.
@pytorchbot merge -f "The failing tests are not related"
@pytorchbot merge