Yuanheng Zhao
For any further reviewer comments, please make the fixes in a separate PR and merge that PR into your feature branch `feat/online-serving`.
Hey @puppet101, this looks like an NCCL communication error. Could you first try setting `NCCL_SOCKET_IFNAME` to select which network interface NCCL uses? For example:

```bash
export NCCL_SOCKET_IFNAME=eth
```
...
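If it is unclear which interface name to pass, a sketch along these lines may help narrow it down. It assumes a Linux host where interfaces are listed under `/sys/class/net`; `eth0` is only a placeholder and should be replaced with an interface that actually exists on your machines. `NCCL_DEBUG=INFO` makes NCCL log its initialization, which typically shows the interface it selected.

```bash
# List the network interfaces on this node (Linux only; skipped if the path is absent).
[ -d /sys/class/net ] && ls /sys/class/net

# Pin NCCL to a single interface. "eth0" is a placeholder, not a recommendation.
export NCCL_SOCKET_IFNAME=eth0

# Ask NCCL to log its initialization so the chosen interface can be checked.
export NCCL_DEBUG=INFO
```

Both variables need to be set on every node before the training job is launched.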
Closing, as this issue has been inactive for over a month. Please let us know if you run into any further problems.
Hey @hejianle , From the error log, it seems that you were using `torch>=2.0` and `colossalai