
CUDA out of memory

Open 8600862 opened this issue 1 year ago • 8 comments

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 880.00 MiB (GPU 0; 23.70 GiB total capacity; 20.08 GiB already allocated; 602.56 MiB free; 21.64 GiB reserved in total by PyTorch)

I'm using a 3090 with 24 GB of memory, so how much memory does it need?

8600862 avatar Dec 18 '23 06:12 8600862

Currently, we have only developed and validated on A100. At the moment, we are preparing machines with V100 and A10, and we hope to eventually be compatible with these three types of GPU cards.

Steven-SWZhang avatar Dec 18 '23 07:12 Steven-SWZhang

Try model.half() before loading the model to the GPU, especially the UNet. That worked for me (took about 17 GB).

ryanmafrty avatar Dec 26 '23 06:12 ryanmafrty

Where should I write this code?

JS-an avatar Dec 29 '23 09:12 JS-an

In my version I modified tools/inferences/inference_i2vgen_entrance.py at line 138:

  • before: model = model.to(gpu)
  • after: model = model.half().to(gpu)
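
For context, a minimal runnable sketch of the same fix; the Conv3d layer and tensor shapes below are stand-ins for VGen's actual UNet, and the variable names are assumptions:

```python
import torch
import torch.nn as nn

# Minimal sketch of the fp16 fix (requires a CUDA GPU); the Conv3d below is a
# stand-in for the real UNet, not VGen's actual architecture.
gpu = torch.device("cuda")
model = nn.Conv3d(4, 4, kernel_size=3, padding=1)

# Before: model = model.to(gpu)   -> fp32 weights
# After:  cast weights to fp16 first, roughly halving their VRAM footprint.
model = model.half().to(gpu)

# Inputs must also be fp16, or PyTorch raises a dtype-mismatch error.
x = torch.randn(1, 4, 8, 32, 32, device=gpu).half()
with torch.no_grad():
    y = model(x)
print(y.dtype)  # torch.float16
```

Note that casting the inputs means the activations are fp16 too, so memory used during sampling also roughly halves, which is where much of the saving comes from.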

ryanmafrty avatar Dec 29 '23 09:12 ryanmafrty


Have you tried the new gradio_app.py? How can we save GPU VRAM there?

frankchieng avatar Jan 11 '24 10:01 frankchieng


I'm not following this anymore, but it should be the same: just find the actual model and call .half() on it before loading it to the GPU. Honestly, this should have been fixed by Ali long ago.
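
If you want to verify how much .half() actually saves, here is a quick sketch; the Linear layer is just a stand-in for the real model:

```python
import torch

# Print current CUDA memory usage; assumes a CUDA GPU is present.
def report(tag: str) -> None:
    alloc = torch.cuda.memory_allocated() / 2**30
    resv = torch.cuda.memory_reserved() / 2**30
    print(f"{tag}: allocated={alloc:.2f} GiB, reserved={resv:.2f} GiB")

model = torch.nn.Linear(8192, 8192).to("cuda")  # stand-in, ~256 MiB in fp32
report("fp32")
model = model.half()                            # weight memory roughly halves
report("fp16")
```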

ryanmafrty avatar Jan 15 '24 06:01 ryanmafrty


Thanks, problem solved. I set up a Cog environment (similar to Docker), then ran cog predict from Gradio, and I can now run this project successfully on a 4090 locally with the Gradio interface.
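
For anyone reproducing this setup, a hypothetical skeleton of the Cog predictor (predict.py) that would wrap the inference entry point; the input names and method bodies are assumptions, not the actual configuration used above:

```python
# Hypothetical Cog predictor skeleton; the VGen-specific details are assumptions.
from cog import BasePredictor, Input, Path

class Predictor(BasePredictor):
    def setup(self):
        # Load the weights once per container start; apply the .half() fix here.
        ...

    def predict(self, image: Path = Input(description="Conditioning image")) -> Path:
        # Run i2vgen inference on the image and return the generated video file.
        ...
```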

frankchieng avatar Jan 17 '24 10:01 frankchieng

I copied the code from the inference script and replaced ModelScope's model loading in gradio.py, so just adjust it yourself.

DoubleCake avatar Feb 01 '24 03:02 DoubleCake