ColossalAI
[BUG]: Image Sampling not working.
🐛 Describe the bug
Training ran without problems, and I got through installing bitsandbytes, but when I try to sample I get an error message. I have also tried using 512-base-ema.ckpt with the v2-v inference config, and it still doesn't work. Here's the error log:
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /opt/conda/envs/pytorch/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda113.so
/opt/conda/envs/pytorch/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/nvidia/lib64'), PosixPath('/usr/local/nvidia/lib'), PosixPath('/opt/conda/envs/pytorch/lib/python3.8/site-packages/cv2/../../lib64')}
warn(msg)
/opt/conda/envs/pytorch/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: /opt/conda/envs/pytorch/lib/python3.8/site-packages/cv2/../../lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
/opt/conda/envs/pytorch/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0'), PosixPath('/usr/local/cuda/lib64/libcudart.so')}.. We'll flip a coin and try one of these, in order to fail forward.
Either way, this might cause trouble in the future:
If you get CUDA error: invalid device function
errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 113
CUDA SETUP: Loading binary /opt/conda/envs/pytorch/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda113.so...
Global seed set to 42
Loading model from 512-base-ema.ckpt
Global Step: 875000
Traceback (most recent call last):
File "scripts/txt2img.py", line 305, in [...] to instantiate.")
KeyError: 'Expected key `target` to instantiate.'
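For readers unfamiliar with this message: Stable Diffusion style codebases usually build the model from a config node through a small helper that imports the class named under `target`. The following is a minimal sketch of that pattern, not the exact code in this repository; the function name and behavior are assumptions based on the error text above. It shows that the error is raised whenever the node handed to the helper has no `target` key.

```python
import importlib


def instantiate_from_config(config):
    """Simplified, hypothetical version of a config-driven factory.

    `config` is a dict-like node with a dotted class path under "target"
    and optional keyword arguments under "params".
    """
    if "target" not in config:
        # Same message as in the traceback above.
        raise KeyError("Expected key `target` to instantiate.")
    module_name, class_name = config["target"].rsplit(".", 1)
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**config.get("params", {}))


# A node with a `target` works; a standard-library class is used for illustration.
obj = instantiate_from_config(
    {"target": "collections.OrderedDict", "params": {"a": 1}}
)

# A node without `target` fails the same way the sampling script does.
try:
    instantiate_from_config({"params": {"a": 1}})
except KeyError as err:
    print("KeyError:", err)
```

If this is what happens here, the `model` node of the YAML passed via `--config` (or one of its nested sub-configs) would be the first place to look.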
Environment
No response
Can you show us your script in more detail?
Hello, thanks for your reply. After starting the Docker container, I ran two commands before trying to sample:
pip install -e .
pip install bitsandbytes
Here's my command (txt2img.sh):
python scripts/txt2img.py --prompt "Teyvat, Medium Female, a woman in a blue outfit holding a sword" --plms \
    --outdir ./output \
    --ckpt 512-base-ema.ckpt \
    --config configs/Inference/v2-inference-v.yaml \
    --n_samples 4
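One way to narrow this down is to check whether the file passed to `--config` actually contains a `target` key wherever the loader expects one. The sketch below is an assumption-laden diagnostic, not part of the repository: it takes a config that has already been loaded into plain dicts and lists, and reports every node that has `params` but no `target`. The helper name `missing_targets` and the sample config are hypothetical.

```python
def missing_targets(node, path="root"):
    """Return dotted paths of nodes that define `params` but lack `target`."""
    found = []
    if isinstance(node, dict):
        if "params" in node and "target" not in node:
            found.append(path)
        for key, value in node.items():
            found.extend(missing_targets(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(missing_targets(value, f"{path}[{index}]"))
    return found


# Hypothetical config: the top-level model node and one sub-config lack `target`.
example_config = {
    "model": {
        "params": {
            "unet_config": {"target": "a.B", "params": {}},
            "first_stage_config": {"params": {"x": 1}},
        }
    }
}

problems = missing_targets(example_config)
print(problems)
```

An empty result would suggest the config structure is fine and that the script is reading a different file or a different node than expected.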
Here's the output of pip freeze:
absl-py==1.3.0
aiohttp==3.8.3
aiosignal==1.3.1
albumentations==1.3.0
altair==4.2.0
antlr4-python3-runtime==4.8
anyio==3.6.2
apex==0.1
async-timeout==4.0.2
attrs==22.2.0
backports.zoneinfo==0.2.1
bcrypt==4.0.1
bitsandbytes==0.39.1
blinker==1.5
braceexpand==0.1.7
brotlipy==0.7.0
cachetools==5.2.0
certifi @ file:///croot/certifi_1665076670883/work/certifi
cffi @ file:///tmp/abs_98z5h56wf8/croots/recipe/cffi_1659598650955/work
cfgv==3.3.1
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
click==8.1.3
coloredlogs==15.0.1
colossalai==0.1.12+torch1.12cu11.3
commonmark==0.9.1
contexttimer==0.3.3
contourpy==1.0.6
cryptography @ file:///tmp/build/80754af9/cryptography_1652083738073/work
cycler==0.11.0
datasets==2.8.0
decorator==5.1.1
diffusers==0.11.1
dill==0.3.6
distlib==0.3.6
einops==0.3.0
entrypoints==0.4
fabric==2.7.1
fastapi==0.88.0
ffmpy==0.3.0
filelock==3.9.0
flatbuffers==22.12.6
fonttools==4.38.0
frozenlist==1.3.3
fsspec==2022.11.0
ftfy==6.1.1
future==0.18.2
gitdb==4.0.10
GitPython==3.1.30
google-auth==2.15.0
google-auth-oauthlib==0.4.6
gradio==3.11.0
grpcio==1.51.1
h11==0.12.0
httpcore==0.15.0
httpx==0.23.1
huggingface-hub==0.11.1
humanfriendly==10.0
identify==2.5.11
idna @ file:///tmp/build/80754af9/idna_1637925883363/work
imageio==2.9.0
imageio-ffmpeg==0.4.2
importlib-metadata==5.2.0
importlib-resources==5.10.2
invisible-watermark==0.1.5
invoke==1.7.3
Jinja2==3.1.2
joblib==1.2.0
jsonschema==4.17.3
kiwisolver==1.4.4
# Editable install with no version control (latent-diffusion==0.0.1)
-e /workspace/examples/images/diffusion
lightning-utilities==0.5.0
linkify-it-py==1.0.3
Markdown==3.4.1
markdown-it-py==2.1.0
MarkupSafe==2.1.1
matplotlib==3.6.2
mdit-py-plugins==0.3.3
mdurl==0.1.2
mkl-fft==1.3.1
mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work
mkl-service==2.4.0
mpmath==1.2.1
multidict==6.0.4
multiprocess==0.70.14
networkx==2.8.8
nodeenv==1.7.0
numpy @ file:///tmp/abs_653_j00fmm/croots/recipe/numpy_and_numpy_base_1659432701727/work
oauthlib==3.2.2
omegaconf==2.1.1
onnx==1.13.0
onnxruntime==1.13.1
open-clip-torch==2.7.0
opencv-python==4.7.0.68
opencv-python-headless==4.7.0.68
orjson==3.8.3
packaging==21.3
pandas==1.5.2
paramiko==2.12.0
pathlib2==2.3.7.post1
Pillow==9.2.0
pkgutil_resolve_name==1.3.10
platformdirs==2.6.2
pre-commit==2.21.0
prefetch-generator==1.0.3
protobuf==3.20.3
psutil==5.9.4
pudb==2019.2
pyarrow==10.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pycryptodome==3.16.0
pydantic==1.10.3
pydeck==0.8.0
pydub==0.25.1
Pygments==2.13.0
Pympler==1.0.1
PyNaCl==1.5.0
pyOpenSSL @ file:///opt/conda/conda-bld/pyopenssl_1643788558760/work
pyparsing==3.0.9
pyrsistent==0.19.3
PySocks @ file:///tmp/build/80754af9/pysocks_1605305779399/work
python-dateutil==2.8.2
python-multipart==0.0.5
pytorch-lightning @ file:///workspace/lightning
pytz==2022.7
pytz-deprecation-shim==0.1.0.post0
PyWavelets==1.4.1
PyYAML==6.0
qudida==0.0.4
regex==2022.10.31
requests @ file:///opt/conda/conda-bld/requests_1657734628632/work
requests-oauthlib==1.3.1
responses==0.18.0
rfc3986==1.5.0
rich==12.6.0
rsa==4.9
scikit-image==0.19.3
scikit-learn==1.2.0
scipy==1.9.3
semver==2.13.0
six @ file:///tmp/build/80754af9/six_1644875935023/work
smmap==5.0.0
sniffio==1.3.0
starlette==0.22.0
streamlit==1.16.0
sympy==1.11.1
tensorboard==2.11.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorboardX==2.5.1
test-tube==0.7.5
threadpoolctl==3.1.0
tifffile==2022.10.10
titans==0.0.7
tokenizers==0.12.1
toml==0.10.2
toolz==0.12.0
torch==1.12.0
torchaudio==0.12.0
torchmetrics==0.6.0
torchvision==0.13.0
tornado==6.2
tqdm==4.64.1
transformers==4.19.2
typing_extensions @ file:///tmp/abs_ben9emwtky/croots/recipe/typing_extensions_1659638822008/work
tzdata==2022.7
tzlocal==4.2
uc-micro-py==1.0.1
urllib3 @ file:///tmp/abs_5dhwnz6atv/croots/recipe/urllib3_1659110457909/work
urwid==2.1.2
uvicorn==0.20.0
validators==0.20.0
virtualenv==20.17.1
watchdog==2.2.0
wcwidth==0.2.5
webdataset==0.2.5
websockets==10.4
Werkzeug==2.2.2
xxhash==3.2.0
yarl==1.8.2
zipp==3.11.0
Has this issue been solved? Please reply as soon as possible.