ColossalAI
[DOC]: Why do all examples run on a single node?
📚 The doc issue
Are there any examples that run on multiple nodes?
This is mostly a limitation of our dev environment, but we definitely support multi-node training. You need to set `--nnodes=x` when launching (as with `torchrun`), where x is the number of nodes, and make sure your nodes can communicate with each other through the NCCL backend.
Actually, there was an issue regarding this: #2921.
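For anyone landing here, a rough sketch of what the per-node launch lines could look like with plain `torchrun`. This only builds the command strings, it does not launch anything. The script name `train.py`, the address `10.0.0.1`, the port, and the helper name `torchrun_command` are all made up for illustration; only the `torchrun` flags themselves (`--nnodes`, `--node_rank`, `--nproc_per_node`, `--master_addr`, `--master_port`) are real.

```python
import shlex


def torchrun_command(nnodes, node_rank, nproc_per_node, master_addr,
                     master_port=29500, script="train.py"):
    """Build the torchrun command line for one node of a multi-node job.

    Hypothetical helper: `script`, the default port, and the address passed
    in are placeholders, not values taken from the ColossalAI examples.
    """
    if not 0 <= node_rank < nnodes:
        raise ValueError(f"node_rank must be in [0, {nnodes}), got {node_rank}")
    args = [
        "torchrun",
        f"--nnodes={nnodes}",                  # total number of machines
        f"--node_rank={node_rank}",            # index of this machine, 0-based
        f"--nproc_per_node={nproc_per_node}",  # usually one process per GPU
        f"--master_addr={master_addr}",        # address of the rank-0 node
        f"--master_port={master_port}",        # port open on the rank-0 node
        script,
    ]
    # shlex.join quotes any argument that would need it in a shell
    return shlex.join(args)


if __name__ == "__main__":
    # Print the command to run on each of two nodes with 8 GPUs each
    for rank in range(2):
        print(torchrun_command(nnodes=2, node_rank=rank,
                               nproc_per_node=8, master_addr="10.0.0.1"))
```

Each node runs its own line, with the same `--master_addr` and `--master_port` everywhere and a different `--node_rank` per machine. Setting the environment variable `NCCL_DEBUG=INFO` before launching can help confirm that NCCL is actually communicating across nodes.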