CUDA OOM
python run_ootd.py --model_path ../assets/m1.png --cloth_path ../assets/t1.png --scale 2.0 --sample 4
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.75 GiB (GPU 0; 23.68 GiB total capacity; 20.97 GiB already allocated; 315.50 MiB free; 21.17 GiB reserved in total by PyTorch)
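As a rough way to read the error message above, the sizes can be pulled out and compared. This is only an illustrative sketch: `parse_oom_message` is a hypothetical helper, not part of OOTDiffusion or PyTorch, and it assumes the message format printed by torch 2.0.1 as quoted above. It treats reserved-minus-allocated memory as cache that PyTorch could still reuse, which ignores fragmentation.

```python
import re

# Multipliers for converting the units PyTorch prints into GiB.
_UNIT_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def parse_oom_message(message):
    """Extract the sizes (in GiB) found in a CUDA out-of-memory message."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:  # fields missing from the message are simply skipped
            sizes[name] = float(match.group(1)) * _UNIT_TO_GIB[match.group(2)]
    return sizes


message = (
    "CUDA out of memory. Tried to allocate 3.75 GiB (GPU 0; 23.68 GiB total "
    "capacity; 20.97 GiB already allocated; 315.50 MiB free; 21.17 GiB "
    "reserved in total by PyTorch)"
)
sizes = parse_oom_message(message)

# Memory PyTorch holds but is not currently using for tensors.
cached_slack = sizes["reserved"] - sizes["allocated"]
# How far the request exceeds what is free plus what is cached.
shortfall = sizes["requested"] - (sizes["free"] + cached_slack)
print(f"free: {sizes['free']:.2f} GiB, cached slack: {cached_slack:.2f} GiB")
print(f"shortfall: {shortfall:.2f} GiB")
```

With the message above, roughly 0.31 GiB is free and about 0.20 GiB is cached, so the 3.75 GiB request falls short by a bit over 3 GiB. That suggests the run needs noticeably less peak memory, not just a cleared cache.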
Can't it run on a device with 24 GB of memory?
It uses less than 20 GB of memory on my RTX 4090 for 4 samples. Could you try again?
I am using the same device as you and running the command shown above, but the problem persists no matter how many times I try.
Please check your conda environment, like torch and diffusers versions. Different versions may cause different memory usage.
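One way to collect the versions worth comparing is sketched below. It uses only the standard library, so it runs even when a package is missing. The helper name `installed_versions` and the package list are assumptions chosen for illustration, not something the repo provides.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages that come up in this thread; extend the list as needed.
for name, version in installed_versions(
    ["torch", "diffusers", "transformers", "accelerate", "xformers"]
).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Run it inside the same conda environment that is used for run_ootd.py, so the report reflects what the script actually imports.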
torch 2.0.1, diffusers 0.24.0
Can I ask which version of diffusers you're using?
Would you mind posting your pip requirements.txt file to this repo?
Same. Are you running on Linux? I have listed the required environment in the README and will add a requirements.txt later.
Yep, I am on Linux.
My pip list is shown below:
absl-py 2.1.0
accelerate 0.26.1
addict 2.4.0
aiofiles 23.2.1
altair 5.2.0
annotated-types 0.6.0
anyio 4.3.0
attrs 23.2.0
basicsr 1.4.2
cachetools 5.3.2
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
cmake 3.28.3
colorama 0.4.6
coloredlogs 15.0.1
contourpy 1.2.0
cycler 0.12.1
diffusers 0.24.0
exceptiongroup 1.2.0
fastapi 0.109.2
ffmpy 0.3.2
filelock 3.13.1
flatbuffers 23.5.26
fonttools 4.49.0
fsspec 2024.2.0
future 0.18.3
google-auth 2.28.0
google-auth-oauthlib 1.0.0
gradio 4.16.0
gradio_client 0.8.1
grpcio 1.60.1
h11 0.14.0
httpcore 1.0.3
httpx 0.26.0
huggingface-hub 0.20.3
humanfriendly 10.0
idna 3.6
imageio 2.34.0
importlib-metadata 7.0.1
importlib-resources 6.1.1
Jinja2 3.1.3
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lazy_loader 0.3
lit 17.0.6
lmdb 1.4.1
Markdown 3.5.2
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.7.4
mdurl 0.1.2
mpmath 1.3.0
networkx 3.2.1
ninja 1.11.1.1
numpy 1.24.4
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
oauthlib 3.2.2
onnxruntime 1.17.0
opencv-python 4.7.0.72
orjson 3.9.14
packaging 23.2
pandas 2.2.0
Pillow 9.4.0
pip 23.3.1
platformdirs 4.2.0
protobuf 4.25.3
psutil 5.9.8
pyasn1 0.5.1
pyasn1-modules 0.3.0
pydantic 2.6.1
pydantic_core 2.16.2
pydub 0.25.1
Pygments 2.17.2
pyparsing 3.1.1
python-dateutil 2.8.2
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
referencing 0.33.0
regex 2023.12.25
requests 2.31.0
requests-oauthlib 1.3.1
rich 13.7.0
rpds-py 0.18.0
rsa 4.9
ruff 0.2.2
safetensors 0.4.2
scikit-image 0.22.0
scipy 1.12.0
semantic-version 2.10.0
setuptools 68.2.2
shellingham 1.5.4
six 1.16.0
sniffio 1.3.0
starlette 0.36.3
sympy 1.12
tb-nightly 2.14.0a20230808
tensorboard-data-server 0.7.2
tifffile 2024.2.12
tokenizers 0.15.2
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.0.1
torchvision 0.15.2
tqdm 4.64.1
transformers 4.36.2
triton 2.0.0
typer 0.9.0
typing_extensions 4.9.0
tzdata 2024.1
urllib3 2.2.1
uvicorn 0.27.1
websockets 11.0.3
Werkzeug 3.0.1
wheel 0.41.2
yapf 0.40.2
zipp 3.17.0
Maybe you can try this pip list, though some versions differ from those in my repo:
absl-py 2.0.0
accelerate 0.24.1
aiofiles 23.2.1
altair 5.2.0
annotated-types 0.6.0
antlr4-python3-runtime 4.9.3
anyio 4.0.0
attrs 23.2.0
av 11.0.0
black 23.11.0
cachetools 5.3.2
carvekit 4.5.2
certifi 2023.7.22
charset-normalizer 3.3.2
clean-fid 0.1.35
click 8.1.7
cloudpickle 3.0.0
cmake 3.27.7
colorama 0.4.6
coloredlogs 15.0.1
config 0.5.1
contourpy 1.1.1
cupy-cuda11x 11.1.0
cycler 0.12.1
detectron2 0.6
detectron2-densepose 0.6
diffusers 0.23.0
einops 0.7.0
exceptiongroup 1.1.3
fastapi 0.100.1
fastrlock 0.8.2
ffmpy 0.3.1
filelock 3.13.1
flatbuffers 23.5.26
fonttools 4.44.0
fsspec 2023.10.0
fvcore 0.1.5.post20221221
google-auth 2.23.4
google-auth-oauthlib 1.0.0
gradio 4.16.0
gradio-client 0.8.1
grpcio 1.59.2
h11 0.14.0
httpcore 1.0.2
httpx 0.26.0
huggingface-hub 0.20.3
humanfriendly 10.0
hydra-core 1.3.2
idna 3.4
imageio 2.32.0
importlib-metadata 6.8.0
importlib-resources 6.1.1
iopath 0.1.9
Jinja2 3.1.2
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lazy-loader 0.3
lightning-utilities 0.10.1
lit 17.0.4
loguru 0.6.0
lpips 0.1.4
Markdown 3.5.1
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.7.3
mdurl 0.1.2
mkl-fft 1.2.0
mkl-random 1.1.1
mkl-service 2.3.0
mpmath 1.3.0
mypy-extensions 1.0.0
networkx 3.1
numpy 1.24.4
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
oauthlib 3.2.2
olefile 0.46
omegaconf 2.3.0
onnxruntime 1.16.2
opencv-python 4.7.0.72
opencv-python-headless 4.8.1.78
orjson 3.9.12
packaging 23.2
pandas 2.0.3
pathspec 0.11.2
Pillow 9.4.0
pip 20.3.3
pkgutil-resolve-name 1.3.10
platformdirs 4.0.0
portalocker 2.8.2
protobuf 4.25.0
psutil 5.9.6
pyasn1 0.5.0
pyasn1-modules 0.3.0
pycocotools 2.0.7
pydantic 2.1.1
pydantic-core 2.4.0
pydub 0.25.1
pygments 2.17.2
pyparsing 3.1.1
python-dateutil 2.8.2
python-multipart 0.0.6
pytz 2023.4
PyWavelets 1.4.1
PyYAML 6.0.1
referencing 0.33.0
regex 2023.10.3
requests 2.31.0
requests-oauthlib 1.3.1
rich 13.7.0
rpds-py 0.17.1
rsa 4.9
ruff 0.1.15
safetensors 0.4.0
scikit-image 0.21.0
scipy 1.10.1
semantic-version 2.10.0
setuptools 65.5.1
shellingham 1.5.4
six 1.15.0
sniffio 1.3.0
starlette 0.27.0
sympy 1.12
tabulate 0.9.0
tensorboard 2.14.0
tensorboard-data-server 0.7.2
termcolor 2.3.0
tifffile 2023.7.10
timm 0.9.10
tokenizers 0.15.0
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.0.1
torch-fidelity 0.3.0
torchaudio 2.0.2
torchmetrics 0.11.4
torchvision 0.15.2
tqdm 4.64.1
transformers 4.35.2
triton 2.0.0
typer 0.9.0
typing 3.7.4.3
typing-extensions 4.8.0
tzdata 2023.4
urllib3 2.1.0
uvicorn 0.23.2
websockets 11.0.3
werkzeug 3.0.1
wheel 0.36.2
xformers 0.0.21
yacs 0.1.8
zipp 3.17.0
I installed the packages strictly following the list you provided, but the same problem persists :(
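To check that two environments really match, the two pip lists can be compared mechanically instead of by eye. The following is a minimal sketch. `parse_pip_list` and `diff_pip_lists` are hypothetical helpers, and the short inputs are excerpts from the two lists above, not the full lists.

```python
def parse_pip_list(text):
    """Parse 'name version' lines (as printed by `pip list`) into a dict."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the 'Package Version' header and the dashed rule.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        # Normalise names so 'gradio_client' and 'gradio-client' compare equal.
        name = parts[0].lower().replace("_", "-")
        packages[name] = parts[1]
    return packages


def diff_pip_lists(mine, theirs):
    """Return differing packages as name -> (my version, their version)."""
    a, b = parse_pip_list(mine), parse_pip_list(theirs)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Excerpts from the first and second pip lists in this thread.
mine = """accelerate 0.26.1
diffusers 0.24.0
torch 2.0.1"""
theirs = """accelerate 0.24.1
diffusers 0.23.0
torch 2.0.1
xformers 0.0.21"""

for name, (my_version, their_version) in diff_pip_lists(mine, theirs).items():
    print(f"{name}: mine={my_version}, theirs={their_version}")
```

A version of None means the package is absent from that list. On these excerpts, xformers appears only in the second list.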
T_T weird. Only costs 20 GB for 6 samples for me...
Is it possible that the CUDA version and the cuDNN version are the problem?
I'm using CUDA 11.8 and cuDNN 8.6.
Same problem. OOM even with only 1 sample.
Solved by installing xformers :-)
@levihsu dear author, you may want to add the pip install list to the README. It is a great project, thanks for making it open source.
I have installed xformers, but the issue still exists. Can I know which version of xformers you're using?
How did you solve this specifically?
@ChengsongLu just pip install xformers and rerun. My xformers version is 0.0.24+cu118.
My pip list is shown below. Some of the packages may be unnecessary, since I copied this conda environment from another one rather than creating a new environment:
Package Version
---------------------------- --------------------
absl-py 2.0.0
accelerate 0.21.0
addict 2.4.0
aenum 3.1.15
aiofiles 23.2.1
aiohttp 3.9.1
aiosignal 1.3.1
altair 5.2.0
annotated-types 0.5.0
antlr4-python3-runtime 4.9.3
anyio 3.7.1
async-timeout 4.0.3
attrs 23.1.0
basicsr 1.4.2
beautifulsoup4 4.12.2
bidict 0.22.1
blendmodes 2022
boltons 23.0.0
cachetools 5.3.2
certifi 2023.11.17
cffi 1.16.0
chardet 5.2.0
charset-normalizer 3.3.2
clean-fid 0.1.35
click 8.1.7
clip 1.0
cmake 3.28.1
colorama 0.4.6
coloredlogs 15.0.1
contourpy 1.2.0
controlnet-aux 0.0.3
coremltools 6.3.0
cssselect2 0.7.0
cycler 0.12.1
dctorch 0.1.2
deprecation 2.1.0
diffusers 0.24.0
einops 0.4.1
entrypoints 0.4
exceptiongroup 1.2.0
facexlib 0.3.0
fastapi 0.94.0
ffmpy 0.3.1
filelock 3.13.1
filterpy 1.4.5
Flask 2.2.3
Flask-Cors 4.0.0
Flask-SocketIO 5.3.6
flaskwebgui 0.3.5
flatbuffers 23.5.26
font-roboto 0.0.1
fonts 0.0.3
fonttools 4.47.0
frozenlist 1.4.1
fsspec 2023.12.2
ftfy 6.1.3
future 0.18.3
fvcore 0.1.5.post20221221
gdown 4.7.1
gfpgan 1.3.8
gitdb 4.0.11
GitPython 3.1.32
google-ai-generativelanguage 0.4.0
google-api-core 2.16.2
google-auth 2.25.2
google-auth-oauthlib 1.2.0
google-generativeai 0.3.2
googleapis-common-protos 1.62.0
gradio 3.39.0
gradio_client 0.3.0
grpcio 1.60.1
grpcio-status 1.60.1
h11 0.12.0
h5py 3.8.0
hickle 5.0.2
httpcore 0.15.0
httpx 0.24.1
huggingface-hub 0.19.4
humanfriendly 10.0
idna 3.6
imageio 2.33.1
importlib-metadata 7.0.0
importlib-resources 6.1.1
imutils 0.5.4
inflection 0.5.1
iopath 0.1.9
itsdangerous 2.1.2
Jinja2 3.1.2
jsonmerge 1.8.0
jsonschema 4.20.0
jsonschema-specifications 2023.11.2
kiwisolver 1.4.5
kornia 0.6.7
lark 1.1.2
lazy_loader 0.3
lightning-utilities 0.10.0
linkify-it-py 2.0.0
lit 17.0.6
llvmlite 0.41.1
lmdb 1.4.1
loguru 0.7.2
lora 0.3.0
lpips 0.1.4
lxml 4.9.4
Markdown 3.5.1
markdown-it-py 2.2.0
MarkupSafe 2.1.3
matplotlib 3.8.2
mdit-py-plugins 0.3.3
mdurl 0.1.2
mediapipe 0.10.9
mmengine 0.7.2
model-index 0.1.11
mpmath 1.3.0
multidict 6.0.4
mypy-extensions 1.0.0
networkx 3.2.1
ninja 1.11.1.1
numba 0.58.1
numpy 1.23.5
nvidia-cublas-cu11 11.11.3.6
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu11 11.8.87
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu11 11.8.89
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu11 11.8.89
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu11 8.7.0.84
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu11 10.9.0.58
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu11 10.3.0.86
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu11 11.4.1.48
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu11 11.7.5.86
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu11 2.19.3
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.3.101
nvidia-nvtx-cu11 11.8.86
nvidia-nvtx-cu12 12.1.105
oauthlib 3.2.2
omegaconf 2.2.3
open-clip-torch 2.20.0
openai-clip 1.0.1
opencv-contrib-python 4.8.1.78
opencv-python 4.8.1.78
opencv-python-headless 4.8.1.78
ordered-set 4.1.0
orjson 3.9.10
packaging 23.2
pandas 2.1.4
piexif 1.1.3
Pillow 9.5.0
pip 23.0.1
platformdirs 4.1.0
portalocker 2.8.2
progressbar2 4.2.0
proto-plus 1.23.0
protobuf 4.25.2
psutil 5.9.5
py-cpuinfo 9.0.0
pyasn1 0.5.1
pyasn1-modules 0.3.0
pycocotools 2.0.6
pycparser 2.21
pydantic 1.10.13
pydantic_core 2.1.2
pydub 0.25.1
Pygments 2.17.2
pyparsing 3.1.1
pyre-extensions 0.0.23
pyrsistent 0.19.3
PySocks 1.7.1
python-dateutil 2.8.2
python-engineio 4.8.0
python-multipart 0.0.6
python-socketio 5.10.0
python-utils 3.5.2
pytorch-lightning 1.9.4
pytz 2023.3.post1
PyWavelets 1.5.0
PyYAML 6.0.1
realesrgan 0.3.0
referencing 0.32.0
regex 2023.10.3
reportlab 4.0.8
requests 2.31.0
requests-oauthlib 1.3.1
resize-right 0.0.2
rich 13.7.0
rpds-py 0.15.2
rsa 4.9
safetensors 0.4.1
scikit-image 0.21.0
scipy 1.11.4
seaborn 0.13.0
segment-anything 1.0
segmentation-refinement 0.6
semantic-version 2.10.0
sentencepiece 0.1.99
setuptools 66.0.0
simple-websocket 1.0.0
six 1.16.0
smmap 5.0.1
sniffio 1.3.0
sounddevice 0.4.6
soupsieve 2.5
starlette 0.26.1
supervision 0.17.1
svglib 1.5.1
sympy 1.12
tabulate 0.9.0
tb-nightly 2.16.0a20231219
tensorboard-data-server 0.7.2
tensorboard-plugin-wit 1.8.1
termcolor 2.4.0
tf-keras-nightly 2.16.0.dev2023121910
thinplate 1.0.0
thop 0.1.1.post2209072238
threadpoolctl 3.1.0
tifffile 2023.12.9
timm 0.9.2
tinycss2 1.2.1
tokenizers 0.15.1
tomesd 0.1.3
tomli 2.0.1
toolz 0.12.0
torch 2.2.0+cu118
torchdiffeq 0.2.3
torchmetrics 1.2.1
torchsde 0.2.6
torchvision 0.17.0+cu118
tqdm 4.66.1
trampoline 0.1.2
transformers 4.38.0.dev0
triton 2.2.0
typing_extensions 4.9.0
typing-inspect 0.8.0
tzdata 2023.3
uc-micro-py 1.0.1
ultralytics 8.0.228
urllib3 2.1.0
uvicorn 0.24.0.post1
wcwidth 0.2.12
webencodings 0.5.1
websockets 11.0.3
Werkzeug 2.2.2
wheel 0.38.4
whichcraft 0.6.1
wsproto 1.2.0
xformers 0.0.24+cu118
yacs 0.1.8
yapf 0.40.2
yarl 1.9.4
zipp 3.17.0
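The fix reported above was simply installing xformers. Whether the OOTDiffusion pipeline picks it up automatically is not stated in this thread. For a pipeline built on diffusers, one option is to request it explicitly through `enable_xformers_memory_efficient_attention()`, which diffusers pipelines expose. The wrapper `try_enable_xformers` below is a hypothetical helper, and `_FakePipeline` is a stand-in so the sketch runs without loading a model.

```python
def try_enable_xformers(pipe):
    """Try to switch a diffusers-style pipeline to xformers attention.

    Returns True on success, False if the method is missing or raises
    (for example because xformers is not installed).
    """
    enable = getattr(pipe, "enable_xformers_memory_efficient_attention", None)
    if enable is None:
        return False
    try:
        enable()
    except Exception as exc:
        print(f"xformers attention not enabled: {exc}")
        return False
    return True


class _FakePipeline:
    """Stand-in used only so this sketch runs without loading a model."""

    def enable_xformers_memory_efficient_attention(self):
        pass


print(try_enable_xformers(_FakePipeline()))
```

In real use, the object passed in would be the loaded pipeline. A False return means the run falls back to whatever attention implementation was already active.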
I have everything installed, but I still get a CUDA OOM error with a 24 GB GPU.
(ootd) boris_martirosyan@boris-martirosyan:/home/jupyter/trials/OOTDiffusion$ python run/run_ootd.py --model_path "/home/jupyter/trials/OOTDiffusion/data/models/Screenshot 2024-06-17 at 21.58.53.png" --cloth_path "/home/jupyter/trials/OOTDiffusion/data/clothes/Screenshot 2024-06-17 at 22.00.50.png" --scale 2.0 --sample 4
/home/boris_martirosyan/.conda/envs/ootd/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/home/boris_martirosyan/.conda/envs/ootd/lib/python3.10/site-packages/torchvision/image.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev'If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
/home/boris_martirosyan/.conda/envs/ootd/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
/home/boris_martirosyan/.conda/envs/ootd/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
/home/boris_martirosyan/.conda/envs/ootd/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
/home/boris_martirosyan/.conda/envs/ootd/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 7.77it/s]
/home/boris_martirosyan/.conda/envs/ootd/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["id2label"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["bos_token_id"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["eos_token_id"] will be overriden.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.78it/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.07it/s]
Initial seed: 1060560947
/home/boris_martirosyan/.conda/envs/ootd/lib/python3.10/site-packages/diffusers/models/lora.py:358: UserWarning: Plan failed with an OutOfMemoryError: CUDA out of memory. Tried to allocate 7.50 GiB. GPU (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:924.)
return F.conv2d(
/home/boris_martirosyan/.conda/envs/ootd/lib/python3.10/site-packages/diffusers/models/lora.py:358: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv2d(
Traceback (most recent call last):
File "/home/jupyter/trials/OOTDiffusion/run/run_ootd.py", line 71, in