[BUG]: AttributeError: module 'torch.distributed' has no attribute '_reduce_scatter_base'
🐛 Describe the bug
When I run the code from https://github.com/hpcaitech/ColossalAI-Examples/tree/main/image/resnet, I get this error:
AttributeError: module 'torch.distributed' has no attribute '_reduce_scatter_base'
Following advice I found online, I then commented out this code in colossalai/communication/collective.py:
_all_gather_func = dist._all_gather_base \
    if "all_gather_into_tensor" not in dir(dist) else dist.all_gather_into_tensor
_reduce_scatter_func = dist._reduce_scatter_base \
    if "reduce_scatter_tensor" not in dir(dist) else dist.reduce_scatter_tensor
After that, I got a different error:
ModuleNotFoundError: No module named 'torch.fx._compatibility'
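For reference, the fallback selection in collective.py can be written so that a missing attribute produces a readable message instead of a bare AttributeError. The following is only a minimal sketch: the helper name resolve_collective is hypothetical (it is not part of ColossalAI or PyTorch), and SimpleNamespace objects stand in for torch.distributed so the snippet runs without torch installed.

```python
from types import SimpleNamespace


def resolve_collective(module, preferred, fallback):
    """Return module.<preferred> if it exists, otherwise module.<fallback>.

    Hypothetical helper: raises RuntimeError with both names when neither
    attribute is present, which points the user at a version mismatch.
    """
    for name in (preferred, fallback):
        func = getattr(module, name, None)
        if func is not None:
            return func
    raise RuntimeError(
        "neither '{}' nor '{}' found on {!r}; the installed torch is "
        "probably too old".format(preferred, fallback, module)
    )


# Stand-ins for torch.distributed at different versions (assumptions for the demo).
newer_dist = SimpleNamespace(
    reduce_scatter_tensor=lambda: "public",
    _reduce_scatter_base=lambda: "private",
)
older_dist = SimpleNamespace(_reduce_scatter_base=lambda: "private")
too_old_dist = SimpleNamespace()

picked_newer = resolve_collective(
    newer_dist, "reduce_scatter_tensor", "_reduce_scatter_base")()
picked_older = resolve_collective(
    older_dist, "reduce_scatter_tensor", "_reduce_scatter_base")()

try:
    resolve_collective(
        too_old_dist, "reduce_scatter_tensor", "_reduce_scatter_base")
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
```

With a real install, the call would look like resolve_collective(dist, "reduce_scatter_tensor", "_reduce_scatter_base") after import torch.distributed as dist. This does not make an old torch work; it only makes the failure easier to read.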
Environment
Python 3.6, torch 1.9.1+cu102, RTX 3090
Thank you for the report. This attribute is supposed to be supported by torch>=1.10. We will improve the compatibility as soon as possible.
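A quick way to compare an installed version string against the torch>=1.10 requirement mentioned above is sketched below. The helper names parse_release and meets_minimum are hypothetical, and the parsing is deliberately simple: it drops the local suffix (such as +cu102) and compares only the leading numeric components.

```python
import re


def parse_release(version):
    """Turn a string like '1.9.1+cu102' into a tuple of ints, e.g. (1, 9, 1)."""
    public = version.split("+", 1)[0]  # drop the local part such as '+cu102'
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum=(1, 10)):
    """Return True if the parsed version is at least `minimum`."""
    return parse_release(version) >= minimum


reported = "1.9.1+cu102"  # the version from the Environment section
reported_ok = meets_minimum(reported)
```

In a real environment the string would come from torch.__version__. For the version reported in this issue, reported_ok is False, which matches the missing attribute.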
Hi @CreamyLong, please check the environment requirements: https://github.com/hpcaitech/ColossalAI#installation This issue was closed due to inactivity. Thanks.