Iris Z

34 comments by Iris Z

@pytorchmergebot merge -f "unrelated test failures https://github.com/pytorch/pytorch/actions/runs/4008358242/jobs/6882601831"

@pytorchmergebot merge -f "unrelated xla, functorch, dynamo, crossref test failures"

Closing as we cannot repro this issue.

> We discussed offline that when training 'for real' on a cluster, the auto-restart behavior would be messed up if the load path points to another folder, so we need...

> If we put in special logic for `numel() == 1`, what about the case of `numel() < nranks`? For this case specifically, there is no sharding involved. This is...