
torch.cuda.OutOfMemoryError: CUDA out of memory.

Open georgegeorgevan opened this issue 1 year ago • 9 comments

Hi everyone. I ran this project on both a 4090 and an A100 (40 GB) and hit the same error on both, saying there is not enough GPU memory:

```
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["id2label"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["bos_token_id"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["eos_token_id"] will be overriden.
100%|██████████| 1/1 [00:04<00:00, 4.62s/it]
100%|██████████| 1/1 [00:06<00:00, 6.23s/it]
Initial seed: 1536610237
  0%|          | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/autodl-tmp/.tr/ot/run/run_ootd.py", line 71, in <module>
    images = model(
  File "/root/autodl-tmp/.tr/ot/ootd/inference_ootd_hd.py", line 121, in __call__
    images = self.pipe(prompt_embeds=prompt_embeds,
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/pipeline_ootd.py", line 373, in __call__
    noise_pred = self.unet_vton(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/unet_vton_2d_condition.py", line 1080, in forward
    sample, res_samples, spatial_attn_inputs, spatial_attn_idx = downsample_block(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/unet_vton_2d_blocks.py", line 1177, in forward
    hidden_states, spatial_attn_inputs, spatial_attn_idx = attn(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/transformer_vton_2d.py", line 383, in forward
    hidden_states, spatial_attn_inputs, spatial_attn_idx = block(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/attention_vton.py", line 266, in forward
    attn_output = self.attn1(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 522, in forward
    return self.processor(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 1231, in __call__
    hidden_states = F.scaled_dot_product_attention(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.81 GiB (GPU 0; 39.59 GiB total capacity; 36.51 GiB already allocated; 1.81 GiB free; 37.26 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```

What could be causing this? Also, I'd like to ask everyone: how much VRAM do the GPUs you run this on have?

georgegeorgevan avatar Apr 30 '24 09:04 georgegeorgevan
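The last line of the traceback itself suggests a first thing to try: tuning the allocator via `PYTORCH_CUDA_ALLOC_CONF` to reduce fragmentation. A minimal sketch (the value 128 is an illustrative assumption, not a recommendation from the OOTDiffusion authors):

```python
import os

# Must be set before the first `import torch` / first CUDA allocation,
# so put it at the very top of run_ootd.py. It caps the size of cached
# blocks the allocator will split, which mitigates the fragmentation
# the error message mentions ("reserved memory >> allocated memory").
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

This only helps when the OOM is fragmentation-related; it does not reduce the total memory the model actually needs.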

same problem

BigManstrj avatar May 03 '24 08:05 BigManstrj

I used to encounter this issue, and it seems that loading and processing PNG files is expensive. Try using JPG/JPEG files instead and it should be sorted.

PraNavKumAr01 avatar May 03 '24 15:05 PraNavKumAr01

Despite the resizing in run_ootd.py, the model and cloth images need to be 768x1024. Arbitrary image resolutions cause CUDA out-of-memory errors.

ixarchakos avatar May 05 '24 07:05 ixarchakos
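If the input resolution is the culprit, a hedged pre-processing sketch with Pillow (`prepare_image` is a hypothetical helper, not part of the repo) that forces both inputs to 768x1024 before they reach the pipeline:

```python
from PIL import Image

TARGET_SIZE = (768, 1024)  # width x height the model expects

def prepare_image(path: str) -> Image.Image:
    # Hypothetical helper: convert to RGB (also drops a PNG alpha
    # channel, see the PNG comment above) and force the expected
    # resolution; larger inputs inflate the attention maps and OOM.
    img = Image.open(path).convert("RGB")
    return img.resize(TARGET_SIZE, Image.LANCZOS)
```

Note this plain resize ignores aspect ratio; cropping or padding first may give better-looking inputs.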

Same problem with an A100 40G. I tried using xformers to solve it, and it works. But my question is why the authors can run it successfully without xformers.

XinZhang0526 avatar May 15 '24 03:05 XinZhang0526

XinZhang0526

Can you please share your Python version and requirements.txt? I installed xformers but I encounter the same error. Thanks.

Borismartirosyan avatar Jun 17 '24 21:06 Borismartirosyan

> Same problem with A100 -40G. I try to use Xformers to solve it. It works. But my question is why the authors can run it successfully without Xformers

How do you add xformers in the code? Could you show more specific code?

Nomination-NRB avatar Jun 27 '24 12:06 Nomination-NRB

@XinZhang0526 Could you please share your `pip list`? I need to see which version of xformers works.

nitinmukesh avatar Jul 17 '24 15:07 nitinmukesh

@nitinmukesh @Borismartirosyan In fact, here is my version of xformers:

```
xformers == 0.0.22
torch == 1.13.1+cu116
```

```python
unet_vton.enable_xformers_memory_efficient_attention()
unet_garm.enable_xformers_memory_efficient_attention()
```

By the way, torch >= 2.0 is recommended.

XinZhang0526 avatar Jul 18 '24 02:07 XinZhang0526
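On where those two calls go: only a hedged sketch is possible from this thread. Judging from the traceback above, the UNets are set up by ootd/inference_ootd_hd.py, so presumably the calls belong right after the models are loaded there (`self.unet_garm` / `self.unet_vton` are assumed attribute names). A defensive wrapper around the diffusers method:

```python
def try_enable_xformers(*models):
    """Call diffusers' enable_xformers_memory_efficient_attention on
    each model that exposes it; return the models that were switched."""
    enabled = []
    for model in models:
        hook = getattr(model, "enable_xformers_memory_efficient_attention", None)
        if callable(hook):
            hook()  # swap attention to the xformers kernel
            enabled.append(model)
    return enabled

# Hypothetical usage inside the inference wrapper, right after loading:
# try_enable_xformers(self.unet_garm, self.unet_vton)
```

The `getattr` guard keeps the script running on diffusers objects (or versions) that lack the method, instead of crashing.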

> In fact, here is my version of xformers: xformers == 0.0.22, torch == 1.13.1+cu116
>
> unet_vton.enable_xformers_memory_efficient_attention()
> unet_garm.enable_xformers_memory_efficient_attention()
>
> By the way, torch >= 2.0 is recommended.

Which file should these two lines of code be added to?

2681248863 avatar Oct 18 '24 15:10 2681248863