Ke Wen
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #138400 `all_gather_object` and `gather_object` have been tested in `test_c10d_nccl.py` and `test_c10d_object_collective.py`. Removing this third set. cc @XilunWu @H-Huang @awgu @wanchaol @fegin @fduwjj...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #137544 * __->__ #138384 * #138374 * #137855 Previously we only waited for comm to become ready after its initialization. But that's not...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #137544 * #138384 * #138374 * #137855 Resolves RFC https://github.com/pytorch/pytorch/issues/137007. Changelist: - Set default value of `nccl_use_nonblocking` to true (previous: false). cc...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #137544 * #138384 * __->__ #138374 * #137855 - Added default value for `nccl_nonblocking_timeout` (30 mins, previous: -1). - Reuse C10D_CHECK_TIMEOUT in other...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #141192 Adding a `destroy_pg_upon_exit` property to allow derived test classes to control whether auto-destroy is desired. (Otherwise, derived test classes will need...
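A minimal sketch of the override pattern the `destroy_pg_upon_exit` summary describes, not the PR's actual code. Only the property name comes from the summary; `FakeProcessGroup`, `BaseDistTest`, `ManualTeardownTest`, `run_test`, and the default of `True` are hypothetical names and assumptions made for illustration.

```python
class FakeProcessGroup:
    """Hypothetical stand-in for a process group (not a PyTorch class)."""

    def __init__(self) -> None:
        self.destroyed = False

    def destroy(self) -> None:
        self.destroyed = True


class BaseDistTest:
    """Hypothetical base test class that tears the group down on exit."""

    @property
    def destroy_pg_upon_exit(self) -> bool:
        # Assumed default: destroy the process group automatically.
        return True

    def body(self, pg: FakeProcessGroup) -> None:
        # Test logic would go here; empty in this sketch.
        pass

    def run_test(self, pg: FakeProcessGroup) -> None:
        try:
            self.body(pg)
        finally:
            # The base class consults the property, so a subclass can opt out
            # without re-implementing the teardown logic.
            if self.destroy_pg_upon_exit:
                pg.destroy()


class ManualTeardownTest(BaseDistTest):
    """Derived test that wants to manage the group's lifetime itself."""

    @property
    def destroy_pg_upon_exit(self) -> bool:
        return False
```

With this shape, a derived class changes one property instead of overriding the whole exit path.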
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #141168 Pulling a PR to test viability. Today's timeout is 300s, which could waste a lot of machine time if a hang happens...