ColossalAI
Problem with stable diffusion training
Hello! Thanks for your cool repo!
When I tried stable diffusion training with
python3 main.py --logdir /tmp -t --postfix test -b configs/train_colossalai_cifar10.yaml
I found it failed just after starting the training loop:
I'm not sure whether I did something wrong.
Any suggestions will be highly appreciated!
Try python3 main.py --logdir /tmp -t --postfix test -b configs/train_colossalai_cifar10.yaml --placement_policy cuda
The failure was likely caused by an earlier version defaulting to the auto placement policy, which often led to tensor device errors.
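If a device error still shows up, a quick check like the one below might help narrow it down by listing which parameters sit on which device before the training loop starts. This is only a rough sketch and not part of ColossalAI: `group_params_by_device` and `FakeModel` are hypothetical names made up for illustration, and the helper only assumes an object with `named_parameters()` whose tensors have a `.device` attribute, as `torch.nn.Module` does.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_params_by_device(model):
    """Map each device (as a string) to the names of the parameters stored on it.

    Assumes `model` exposes named_parameters() yielding (name, tensor) pairs
    whose tensors have a .device attribute, as torch.nn.Module does.
    """
    groups = defaultdict(list)
    for name, param in model.named_parameters():
        groups[str(param.device)].append(name)
    return dict(groups)


# Stand-in model so the sketch runs without a GPU or PyTorch installed.
# The parameter names and devices here are invented for illustration.
class FakeModel:
    def named_parameters(self):
        yield "encoder.weight", SimpleNamespace(device="cuda:0")
        yield "decoder.weight", SimpleNamespace(device="cpu")


devices = group_params_by_device(FakeModel())
if len(devices) > 1:
    # More than one device usually means some tensors were placed elsewhere.
    print("parameters are split across devices:", devices)
```

With a real model, passing it in place of `FakeModel()` should work the same way, although sharded or offloaded parameters may report a device that does not reflect where the data lives at compute time.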
The code has been updated substantially since then. This issue was closed due to inactivity. Thanks.