
RuntimeError: CUDA out of memory.

sab148 opened this issue 1 year ago · 1 comment

Hello,

When I run

python main.py --function test --config configs/cub_stage2.yml --opt "{'test': {'load_token_path': 'ckpts/cub983/tokens/', 'load_unet_path': 'ckpts/cub983/unet/', 'save_log_path': 'ckpts/cub983/log.txt'}}"

I am encountering this error:
Traceback (most recent call last):
  File "/p/project/atmlaml/benassou1/ega/GenPromp/main.py", line 646, in <module>
    eval(args.function)(config)
  File "/p/project/atmlaml/benassou1/ega/GenPromp/main.py", line 300, in test
    noise_pred = unet(noisy_latents, timesteps, combine_embeddings).sample
  File "/p/software/juwelsbooster/stages/2023/software/PyTorch/1.12.0-foss-2022a-CUDA-11.7/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/p/project/atmlaml/benassou1/ega/GenPromp/sc_venv_template/venv/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 615, in forward
    sample = upsample_block(
  File "/p/software/juwelsbooster/stages/2023/software/PyTorch/1.12.0-foss-2022a-CUDA-11.7/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/p/project/atmlaml/benassou1/ega/GenPromp/sc_venv_template/venv/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 1813, in forward
    hidden_states = attn(
  File "/p/software/juwelsbooster/stages/2023/software/PyTorch/1.12.0-foss-2022a-CUDA-11.7/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/p/project/atmlaml/benassou1/ega/GenPromp/sc_venv_template/venv/lib/python3.10/site-packages/diffusers/models/transformer_2d.py", line 265, in forward
    hidden_states = block(
  File "/p/software/juwelsbooster/stages/2023/software/PyTorch/1.12.0-foss-2022a-CUDA-11.7/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/p/project/atmlaml/benassou1/ega/GenPromp/sc_venv_template/venv/lib/python3.10/site-packages/diffusers/models/attention.py", line 321, in forward
    ff_output = self.ff(norm_hidden_states)
  File "/p/software/juwelsbooster/stages/2023/software/PyTorch/1.12.0-foss-2022a-CUDA-11.7/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/p/project/atmlaml/benassou1/ega/GenPromp/sc_venv_template/venv/lib/python3.10/site-packages/diffusers/models/attention.py", line 379, in forward
    hidden_states = module(hidden_states)
  File "/p/software/juwelsbooster/stages/2023/software/PyTorch/1.12.0-foss-2022a-CUDA-11.7/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/p/software/juwelsbooster/stages/2023/software/PyTorch/1.12.0-foss-2022a-CUDA-11.7/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 39.56 GiB total capacity; 7.06 GiB already allocated; 1.94 MiB free; 17.07 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I changed the batch size to 1, reduced the image size, and set max_split_size_mb, but it still does not work. Could you please help me fix this problem?
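For reference, this is roughly how I set max_split_size_mb (an illustrative sketch; 128 is an example value, and the option must be set before the first CUDA allocation, so before importing torch):

```python
import os

# The allocator reads PYTORCH_CUDA_ALLOC_CONF at initialization, so this must
# run before `import torch` (or before any CUDA tensor is created).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"  # example value

# import torch  # the caching allocator now picks up the option
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```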

sab148 avatar Dec 12 '23 23:12 sab148

The code can run on an RTX 3090 with 24 GB of memory, so capacity is not the issue. In your log only 7.06 GiB is allocated by PyTorch, yet just 1.94 MiB is free, which suggests that other processes on the same GPU are holding the remaining memory. You can free up that memory and try again.
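The numbers in the error message can be tallied to see where the memory went (a rough sketch; values are copied from the traceback above, and only unit conversions are involved, so no GPU is needed):

```python
# Figures reported in the CUDA OOM message, converted to bytes.
GIB = 1024 ** 3
MIB = 1024 ** 2

total     = 39.56 * GIB  # GPU 0 total capacity
allocated = 7.06 * GIB   # held by live PyTorch tensors
reserved  = 17.07 * GIB  # reserved by PyTorch's caching allocator
free      = 1.94 * MIB   # free on the device

# Memory that is neither reserved by this PyTorch process nor free:
# most likely held by other processes on the same GPU.
other = total - reserved - free
print(f"unaccounted-for memory: {other / GIB:.2f} GiB")
```

Roughly 22.5 GiB is neither reserved by this process nor free, which is consistent with another process occupying the GPU (`nvidia-smi` should show it).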

callsys avatar Dec 12 '23 23:12 callsys