
torch.cuda.OutOfMemoryError: CUDA out of memory.

Open georgegeorgevan opened this issue 1 year ago • 9 comments

Hi everyone. I ran this project on both a 4090 and an A100 (40 GB) and hit the same error on both, saying there is not enough GPU memory:

```
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["id2label"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["bos_token_id"] will be overriden.
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["eos_token_id"] will be overriden.
100%|██████████| 1/1 [00:04<00:00, 4.62s/it]
100%|██████████| 1/1 [00:06<00:00, 6.23s/it]
Initial seed: 1536610237
  0%|          | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/autodl-tmp/.tr/ot/run/run_ootd.py", line 71, in <module>
    images = model(
  File "/root/autodl-tmp/.tr/ot/ootd/inference_ootd_hd.py", line 121, in __call__
    images = self.pipe(prompt_embeds=prompt_embeds,
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/pipeline_ootd.py", line 373, in __call__
    noise_pred = self.unet_vton(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/unet_vton_2d_condition.py", line 1080, in forward
    sample, res_samples, spatial_attn_inputs, spatial_attn_idx = downsample_block(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/unet_vton_2d_blocks.py", line 1177, in forward
    hidden_states, spatial_attn_inputs, spatial_attn_idx = attn(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/transformer_vton_2d.py", line 383, in forward
    hidden_states, spatial_attn_inputs, spatial_attn_idx = block(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/.tr/ot/ootd/pipelines_ootd/attention_vton.py", line 266, in forward
    attn_output = self.attn1(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 522, in forward
    return self.processor(
  File "/root/autodl-tmp/conda3/ot/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 1231, in __call__
    hidden_states = F.scaled_dot_product_attention(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.81 GiB (GPU 0; 39.59 GiB total capacity; 36.51 GiB already allocated; 1.81 GiB free; 37.26 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```

What could be causing this? Also, I'd like to ask everyone: how much VRAM do the GPUs you run this on have?

georgegeorgevan avatar Apr 30 '24 09:04 georgegeorgevan
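The last line of the traceback itself suggests a first thing to try: tuning the allocator via `PYTORCH_CUDA_ALLOC_CONF` to reduce fragmentation. A minimal sketch (the value 128 is an illustrative assumption, not a recommendation from the OOTDiffusion authors):

```python
import os

# Must be set before the first `import torch` / first CUDA allocation,
# so put it at the very top of run_ootd.py. It caps the size of cached
# blocks the allocator will split, which mitigates the fragmentation
# the error message mentions ("reserved memory >> allocated memory").
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

This only helps when the OOM is fragmentation-related; it does not reduce the total memory the model actually needs.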

same problem

BigManstrj avatar May 03 '24 08:05 BigManstrj

I used to encounter this issue, and it seems that loading and processing PNG files is expensive. Try using JPG/JPEG files instead and it should be sorted.

PraNavKumAr01 avatar May 03 '24 15:05 PraNavKumAr01

Despite the resizing in run_ootd.py, the model and cloth images need to be 768x1024. Arbitrary image resolutions cause CUDA out-of-memory errors.

ixarchakos avatar May 05 '24 07:05 ixarchakos
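If the input resolution is the culprit, a hedged pre-processing sketch with Pillow (`prepare_image` is a hypothetical helper, not part of the repo) that forces both inputs to 768x1024 before they reach the pipeline:

```python
from PIL import Image

TARGET_SIZE = (768, 1024)  # width x height the model expects

def prepare_image(path: str) -> Image.Image:
    # Hypothetical helper: convert to RGB (also drops a PNG alpha
    # channel, see the PNG comment above) and force the expected
    # resolution; larger inputs inflate the attention maps and OOM.
    img = Image.open(path).convert("RGB")
    return img.resize(TARGET_SIZE, Image.LANCZOS)
```

Note this plain resize ignores aspect ratio; cropping or padding first may give better-looking inputs.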

Same problem with an A100 40G. I tried using xformers to solve it, and it works. But my question is why the authors can run it successfully without xformers.

XinZhang0526 avatar May 15 '24 03:05 XinZhang0526

XinZhang0526

Can you please share your Python version and requirements.txt? I installed xformers but I encounter the same error. Thanks.

Borismartirosyan avatar Jun 17 '24 21:06 Borismartirosyan

> Same problem with A100 -40G. I try to use Xformers to solve it. It works. But my question is why the authors can run it successfully without Xformers

How do you add xformers in the code? Could you show more specific code?

Nomination-NRB avatar Jun 27 '24 12:06 Nomination-NRB

@XinZhang0526 Could you please share your `pip list`? I need to see which version of xformers works.

nitinmukesh avatar Jul 17 '24 15:07 nitinmukesh

@nitinmukesh @Borismartirosyan In fact, here is my version of xformers:

```
xformers == 0.0.22
torch == 1.13.1+cu116
```

```python
unet_vton.enable_xformers_memory_efficient_attention()
unet_garm.enable_xformers_memory_efficient_attention()
```

By the way, torch >= 2.0 is recommended.

XinZhang0526 avatar Jul 18 '24 02:07 XinZhang0526
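On where those two calls go: only a hedged sketch is possible from this thread. Judging from the traceback above, the UNets are set up by ootd/inference_ootd_hd.py, so presumably the calls belong right after the models are loaded there (`self.unet_garm` / `self.unet_vton` are assumed attribute names). A defensive wrapper around the diffusers method:

```python
def try_enable_xformers(*models):
    """Call diffusers' enable_xformers_memory_efficient_attention on
    each model that exposes it; return the models that were switched."""
    enabled = []
    for model in models:
        hook = getattr(model, "enable_xformers_memory_efficient_attention", None)
        if callable(hook):
            hook()  # swap attention to the xformers kernel
            enabled.append(model)
    return enabled

# Hypothetical usage inside the inference wrapper, right after loading:
# try_enable_xformers(self.unet_garm, self.unet_vton)
```

The `getattr` guard keeps the script running on diffusers objects (or versions) that lack the method, instead of crashing.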

> In fact, here is my version of xformers: xformers == 0.0.22, torch == 1.13.1+cu116
>
> unet_vton.enable_xformers_memory_efficient_attention()
> unet_garm.enable_xformers_memory_efficient_attention()
>
> By the way, torch >= 2.0 is recommended.

Which file should these two lines of code be added to?

2681248863 avatar Oct 18 '24 15:10 2681248863