
[DOC]: Why do all examples run on a single node?

Open LarryZhangy opened this issue 2 years ago • 2 comments

📚 The doc issue

Are there any examples that run on multiple nodes?

LarryZhangy avatar Apr 24 '23 08:04 LarryZhangy

The single-node examples are a limitation of our dev environment, but we definitely support multi-node training. You need to set --nnodes=x when launching, and make sure your nodes communicate through the NCCL backend.
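As a rough illustration of the --nnodes setting, here is a minimal sketch that builds the launch command for each node. It assumes the job is started with PyTorch's torchrun launcher, which the reply does not name explicitly. The helper `build_torchrun_command`, the script name `train.py`, the address `10.0.0.1`, the port, and the GPU count are all hypothetical placeholders.

```python
import shlex


def build_torchrun_command(nnodes, node_rank, nproc_per_node,
                           master_addr, master_port, script):
    """Return the torchrun argument list for one node of a multi-node job.

    Hypothetical helper for illustration; it only assembles standard
    torchrun flags and does not start any process.
    """
    if not 0 <= node_rank < nnodes:
        raise ValueError("node_rank must be in [0, nnodes)")
    return [
        "torchrun",
        f"--nnodes={nnodes}",                  # total number of nodes (the "x" above)
        f"--node_rank={node_rank}",            # index of this node, 0-based
        f"--nproc_per_node={nproc_per_node}",  # processes (usually GPUs) per node
        f"--master_addr={master_addr}",        # address of the rank-0 node (placeholder)
        f"--master_port={master_port}",        # free port on the rank-0 node (placeholder)
        script,                                # training script (placeholder name)
    ]


if __name__ == "__main__":
    # Print the command to run on each of two nodes with 8 GPUs each.
    for rank in range(2):
        cmd = build_torchrun_command(2, rank, 8, "10.0.0.1", 29500, "train.py")
        print(f"node {rank}: {shlex.join(cmd)}")
```

Each node runs its own command with a different `--node_rank`, while the other values stay the same on every node. The training script itself is still responsible for initializing distributed training with the NCCL backend.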

JThh avatar Apr 24 '23 15:04 JThh

Actually, there was an issue regarding this: #2921.

JThh avatar Apr 24 '23 15:04 JThh