
reduce_scatter raises a RuntimeError

Open garrett361 opened this issue 8 months ago • 0 comments

Hello, cross-posting from ipex #647: torch-ccl does not support torch.distributed.reduce_scatter, even though the docs claim it does.

For instance, in 2.1.300+xpu we have: https://github.com/intel/torch-ccl/blob/b9ce71371fdb11f980befaa9d49a36a3c2c6e82b/src/ProcessGroupCCL.cpp#L871-L877 where the TORCH_CHECK line raises a RuntimeError.

See the ipex ticket for more details and code to reproduce the error.
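
For readers who do not want to click through, a check along the following lines should show the behavior. This is only a rough sketch and not the code from the ipex ticket: the helper name `try_reduce_scatter` is made up for illustration, CPU tensors are used for simplicity even though the report concerns the 2.1.300+xpu build, and it assumes `oneccl_bindings_for_pytorch` is installed and the script is started by a launcher such as `torchrun` that sets `RANK` and `WORLD_SIZE`.

```python
import os


def try_reduce_scatter():
    """Call torch.distributed.reduce_scatter on the "ccl" backend and report what happens."""
    try:
        import torch
        import torch.distributed as dist
        # Importing the bindings registers the "ccl" backend with torch.distributed.
        import oneccl_bindings_for_pytorch  # noqa: F401
    except ImportError as err:
        return f"skipped: {err}"

    # Rank and world size normally come from the launcher; the defaults are
    # an assumption so that a single-process run is also possible.
    rank = int(os.environ.get("RANK", 0))
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    dist.init_process_group(backend="ccl", rank=rank, world_size=world_size)
    try:
        # One input chunk per rank; each rank should receive one reduced chunk.
        inputs = [torch.ones(4) for _ in range(world_size)]
        output = torch.empty(4)
        dist.reduce_scatter(output, inputs, op=dist.ReduceOp.SUM)
        return f"ok: {output.tolist()}"
    except RuntimeError as err:
        # On an affected build this branch is expected to be taken.
        return f"RuntimeError: {err}"
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    print(try_reduce_scatter())
```

Saved as, say, `repro.py` (a hypothetical file name), this could be launched with `torchrun --nproc_per_node=2 repro.py`. On an affected build I would expect each rank to print a line starting with `RuntimeError:`.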

garrett361, Jun 05 '24 14:06