Open-Sora

torchrun --nnodes=1 --nproc_per_node=1 scripts/train.py fails with an error

Open • oceanogeology opened this issue 10 months ago • 1 comment

Single-GPU training fails with the following error:

    booster.backward(loss=loss, optimizer=optimizer)
  File "/root/anaconda3/envs/opensora/lib/python3.10/site-packages/colossalai/booster/booster.py", line 167, in backward
    optimizer.backward(loss)
  File "/root/anaconda3/envs/opensora/lib/python3.10/site-packages/colossalai/zero/low_level/low_level_optim.py", line 487, in backward
    loss.backward(retain_graph=retain_graph)
  File "/root/anaconda3/envs/opensora/lib/python3.10/site-packages/torch/_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "/root/anaconda3/envs/opensora/lib/python3.10/site-packages/torch/autograd/__init__.py", line 266, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: GET was unable to find an engine to execute this computation

oceanogeology, Apr 03 '24 13:04