Tron1994
Config: https://github.com/hpcaitech/ColossalAI-Examples/blob/main/language/gpt/gpt2_configs/gpt2_zero3.py

Error:

    Traceback (most recent call last):
      File "train_gpt.py", line 149, in <module>
        main()
      File "train_gpt.py", line 140, in main
        trainer.fit(train_dataloader=train_dataloader,
      File "/home/jovyan/miniconda3/envs/colossalai-test/lib/python3.8/site-packages/colossalai/trainer/_trainer.py", line 321, in fit
        self._train_epoch(
      File "/home/jovyan/miniconda3/envs/colossalai-test/lib/python3.8/site-packages/colossalai/trainer/_trainer.py", line...
### 🐛 Describe the bug

Config used: gpt2_configs/gpt2_zero3.py

Run output:

    Error: failed to run torchrun --nproc_per_node=8 --nnodes=1 --node_rank=0 --rdzv_backend=c10d --rdzv_endpoint=127.0.0.1:29500 --rdzv_id=colossalai-default-job train_gpt.py --config=gpt2_configs/gpt2_zero3.py --from_torch on 127.0.0.1

Bug log:

    WARNING colossalai - ShardedOptimizerV2 -...
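For anyone trying to reproduce this, the failing launch command can be rebuilt as an argument list so that individual flags (process count, rendezvous endpoint) are easy to vary. This is only a sketch: the helper name `build_torchrun_command` and its parameters are hypothetical, and the flag values are copied from the error message in this report. It assembles the command but does not run it.

```python
import shlex


def build_torchrun_command(
    config_path,
    nproc_per_node=8,
    rdzv_endpoint="127.0.0.1:29500",
    script="train_gpt.py",
):
    """Rebuild the torchrun invocation quoted in the report as an argv list.

    Hypothetical helper for reproduction; defaults mirror the reported command.
    """
    return [
        "torchrun",
        f"--nproc_per_node={nproc_per_node}",
        "--nnodes=1",
        "--node_rank=0",
        "--rdzv_backend=c10d",
        f"--rdzv_endpoint={rdzv_endpoint}",
        "--rdzv_id=colossalai-default-job",
        script,
        f"--config={config_path}",
        "--from_torch",
    ]


cmd = build_torchrun_command("gpt2_configs/gpt2_zero3.py")
# Print a shell-ready string; pass `cmd` to subprocess.run to actually launch.
print(shlex.join(cmd))
```

Lowering `nproc_per_node` to 1 is one possible way to check whether the failure depends on the number of processes, though the report does not say whether it does.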