[BUG]: ChatGPT: Incorrect description of multi-GPU support in DDP and ColossalAI strategies

Open snowyday opened this issue 1 year ago • 0 comments

🐛 Describe the bug

In the Examples for ChatGPT, the description of multi-GPU training with the DDP and ColossalAI strategies is incorrect. The current description shows the following command:

# run ColossalAI on 2 GPUs
torchrun --standalone --nproc_per_node=2 train_dummy.py --strategy colossalai

However, colossalai is not an accepted value for --strategy, as the argument definition in train_dummy.py shows:

https://github.com/hpcaitech/ColossalAI/blob/35c8f4ce479e7dc7aab59e03bf00cba2d777ddb0/applications/ChatGPT/examples/train_dummy.py#L27
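To illustrate why the documented command fails, here is a minimal sketch of how an argparse choices list rejects an unlisted value. The choices below are an assumption: this issue only confirms that colossalai_gemini and colossalai_zero2 are accepted and that colossalai is not; naive and ddp are included for illustration, and the helper name is_accepted is hypothetical.

```python
import argparse

# Assumed list of strategies. Only colossalai_gemini and colossalai_zero2
# are confirmed by this issue; naive and ddp are illustrative guesses.
STRATEGY_CHOICES = ["naive", "ddp", "colossalai_gemini", "colossalai_zero2"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that restricts --strategy to a fixed set of values."""
    parser = argparse.ArgumentParser(prog="train_dummy.py")
    parser.add_argument("--strategy", choices=STRATEGY_CHOICES, default="naive")
    return parser


def is_accepted(strategy: str) -> bool:
    """Return True if argparse accepts the given --strategy value."""
    try:
        build_parser().parse_args(["--strategy", strategy])
    except SystemExit:
        # argparse prints an "invalid choice" error and exits on bad values.
        return False
    return True
```

With a parser like this, passing --strategy colossalai stops the script with an "invalid choice" error before any training starts, which matches the behavior reported here.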

The correct commands for running ColossalAI on 2 GPUs should be:

# run ColossalAI on 2 GPUs
torchrun --standalone --nproc_per_node=2 train_dummy.py --strategy colossalai_gemini

or

torchrun --standalone --nproc_per_node=2 train_dummy.py --strategy colossalai_zero2

Please update the Examples accordingly.

Environment

ColossalAI main branch

snowyday, Mar 06 '23 00:03