ColossalAI
[BUG]: name 'ckpt' is not defined && name 'trainer' is not defined
🐛 Describe the bug
Run: python main.py --logdir /tmp/ --train --base configs/train_colossalai.yaml
Errors:
name 'ckpt' is not defined
name 'trainer' is not defined
Environment
conda3
Hi @winktool, which example are you using? We need more details to track down the problem. Thanks.
https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion
Thanks~
1. Adding --ckpt 512-base-ema.ckpt to the command resolves name 'ckpt' is not defined.
2. Deleting https://github.com/hpcaitech/ColossalAI/blob/main/examples/images/diffusion/main.py#L823 resolves name 'trainer' is not defined.
3. But I then got a new error: RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
Also, the opencv-python version in requirements.txt should be updated, because installation fails with: No matching distribution found for opencv-python==4.6.0
Hi @winktool, sorry for the late reply. We have updated the example a lot since then. Please check the latest code: https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion This issue was closed due to inactivity. Thanks.
Thanks for your question. The earlier error about the opencv-python version has been fixed. As for your new error, the message means that PyTorch could not read a zip archive (here, most likely the checkpoint file passed via --ckpt), usually because the file is corrupted or the path is wrong. First check that the path is spelled correctly and that the file exists at that location. If that does not solve the problem, make sure the file is not corrupted (for example, from an incomplete download) and that your PyTorch version is up to date. Since the information you provided is limited, the error could have several causes.
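The checks above can be sketched as a small script that inspects the checkpoint without loading it. This is only a rough sketch: it assumes the checkpoint was saved in PyTorch's zip-based format, and the function name describe_checkpoint is made up for illustration, not part of ColossalAI or PyTorch. It uses only the Python standard library.

```python
import os
import zipfile


def describe_checkpoint(path: str) -> str:
    """Return a short diagnosis of a checkpoint file without loading it.

    Hypothetical helper; assumes the file should be a zip archive.
    """
    if not os.path.isfile(path):
        return "missing"  # wrong path or file never downloaded
    if os.path.getsize(path) == 0:
        return "empty"  # download created the file but wrote no data
    if not zipfile.is_zipfile(path):
        # No end-of-central-directory record was found, which is what a
        # truncated download (or a non-zip file) looks like.
        return "not a zip archive"
    with zipfile.ZipFile(path) as archive:
        bad_member = archive.testzip()  # name of first corrupt member, or None
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return "ok"


if __name__ == "__main__":
    # File name taken from the command in this thread; adjust the path as needed.
    print(describe_checkpoint("512-base-ema.ckpt"))
```

If the result is anything other than "ok", re-downloading the checkpoint and comparing its size against the published one is probably the quickest next step. A result of "not a zip archive" can also mean the file was saved in an older, non-zip format, so treat it as a hint rather than proof of corruption.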