
Minimum GPU memory requirements (inference)


Hi. I was running the inference code on a 12 GB RTX 3060 GPU, but I always ran into a CUDA out-of-memory error no matter what I tried. The following is my inference command:

```
accelerate launch inference.py --width 384 --height 512 --num_inference_steps 30 --output_dir "result" --unpaired \
    --data_dir datasets --test_batch_size 1 --guidance_scale 2.0 --mixed_precision bf16 --enable_xformers_memory_efficient_attention
```

I would like to know the minimum GPU requirements for running the training and inference code. Also, is there any other way to reduce memory usage besides enabling mixed precision, lowering the resolution, and using xFormers?

tonySSAI avatar Apr 17 '24 09:04 tonySSAI

I'm running the inference code on a 24 GB RTX 4090 GPU and I'm also getting the CUDA out-of-memory error.

GeneralWhite avatar Apr 19 '24 06:04 GeneralWhite

Judging from the weight sizes and the code I have been reading, it needs about 30 GB to load the models. The denoiser UNet (~10 GB) and the condition UNet (~12 GB) are the two biggest models, so you could CPU-offload them alternately, keeping only the condition attention maps on the GPU while the denoiser runs. That should make inference fit on ~16 GB VRAM GPUs.
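
The per-step shape of that idea, as a minimal sketch: I'm assuming plain diffusers-style `unet(sample, timestep, encoder_hidden_states=...)` calls here, so the function and argument names are hypothetical; the real IDM-VTON UNets take extra conditioning inputs and share attention features between each other rather than returning them as a plain output.

```python
import torch

def offloaded_step(garment_unet, denoising_unet, garment_latents,
                   latents, timestep, text_embeds, device="cuda"):
    # Run the ~12 GB condition (garment) UNet first and cache its output;
    # in the real pipeline this would be its attention feature maps.
    garment_unet.to(device)
    with torch.no_grad():
        garment_feats = garment_unet(
            garment_latents, timestep, encoder_hidden_states=text_embeds
        ).sample
    garment_unet.to("cpu")      # free its ~12 GB of VRAM
    torch.cuda.empty_cache()

    # Only the cached features stay on the GPU, so the ~10 GB denoiser fits.
    denoising_unet.to(device)
    with torch.no_grad():
        noise_pred = denoising_unet(
            latents, timestep, encoder_hidden_states=text_embeds
        ).sample
    denoising_unet.to("cpu")
    torch.cuda.empty_cache()
    return noise_pred, garment_feats
```

Swapping weights between CPU and GPU every step is slow over PCIe, so if the garment features can be computed once per image, caching them and keeping only the denoiser resident would avoid most of the transfer cost. For standard pipelines, diffusers also ships `pipe.enable_model_cpu_offload()`, which automates this pattern, though I'm not sure the custom IDM-VTON pipeline supports it out of the box.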

vcadillog avatar Apr 22 '24 17:04 vcadillog

Hello, we used mixed precision (fp16) for inference. It should work with less than 18 GB of VRAM in fp16. You can also try further optimizations, like the offloading @vcadillog mentioned, to reduce VRAM usage even more.
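
For reference, half precision roughly halves the ~30 GB fp32 footprint, which is consistent with the < 18 GB figure. Loading weights in fp16 with diffusers looks like this; the checkpoint id and subfolder below are illustrative, and the actual IDM-VTON checkpoints use custom UNet classes, but the `torch_dtype` mechanism is the same:

```python
import torch
from diffusers import UNet2DConditionModel

# Illustrative checkpoint/subfolder; load the weights directly in fp16
# so they never occupy fp32-sized memory.
unet = UNet2DConditionModel.from_pretrained(
    "yisol/IDM-VTON", subfolder="unet", torch_dtype=torch.float16
).to("cuda")

# Inputs must match the weight dtype, or PyTorch raises a dtype mismatch.
latents = torch.randn(1, 4, 64, 48, dtype=torch.float16, device="cuda")
```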

yisol avatar Apr 22 '24 19:04 yisol