
Problem with stable diffusion training

Open pinxuezhao opened this issue 3 years ago • 1 comments

Hello! Thanks for your cool repo! When I tried Stable Diffusion training with

python3 main.py --logdir /tmp -t --postfix test -b configs/train_colossalai_cifar10.yaml

it failed right after the training loop started: [three screenshots of the error, not preserved]

I'm not sure whether I did something wrong. Any suggestions would be highly appreciated!

pinxuezhao avatar Nov 30 '22 16:11 pinxuezhao

Try adding --placement_policy cuda to the command:

python3 main.py --logdir /tmp -t --postfix test -b configs/train_colossalai_cifar10.yaml --placement_policy cuda

The failure was likely caused by an earlier version defaulting to the auto placement policy, which often led to tensor device errors.

JThh avatar Feb 01 '23 13:02 JThh

The project has been updated substantially since this was reported. This issue was closed due to inactivity. Thanks.

binmakeswell avatar Apr 14 '23 08:04 binmakeswell