
CUDA out of memory

Open xiaohui09 opened this issue 1 year ago • 7 comments

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB (GPU 0; 6.00 GiB total capacity; 5.07 GiB already allocated; 0 bytes free; 5.33 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

What can I do? I tried the code below to set the GPU and the allocator config, and to clear the cache, but I still get this error:

os.environ["CUDA_VISIABLE_DEVICES"] = "0"
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
if hasattr(torch.cuda, "empty_cache"):
    torch.cuda.empty_cache()
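
One likely reason the snippet above has no effect: both environment variables are read when CUDA is first initialized, so they must be set before `torch` is imported, and the variable name is `CUDA_VISIBLE_DEVICES` (the snippet's `CUDA_VISIABLE_DEVICES` spelling is silently ignored). A minimal sketch of the intended ordering:

```python
import os

# Set these BEFORE importing torch; they are read at CUDA initialization.
# Note the spelling: it is CUDA_VISIBLE_DEVICES, not CUDA_VISIABLE_DEVICES.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

try:
    import torch

    if torch.cuda.is_available():
        # Releases cached-but-unused allocator blocks back to the driver;
        # it does not free memory that live tensors still occupy.
        torch.cuda.empty_cache()
except ImportError:
    pass  # torch not installed in this environment
```

Even with correct ordering, a 6 GB card may simply be too small for SDXL; the allocator config only reduces fragmentation, not the model's footprint.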

xiaohui09 avatar May 06 '24 11:05 xiaohui09

We have now added a low-GPU-memory version. It was tested on a machine with 24 GB of GPU memory (Tesla A10) and 30 GB of RAM, and is expected to work well with more than 20 GB of GPU memory.

python gradio_app_sdxl_specific_id_low_vram.py
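
The repo's low-VRAM script is not reproduced here, but a minimal sketch of the kind of diffusers switches such a version typically relies on looks like this (`sd_model_path` is a placeholder, not a path from the repo):

```python
# Hedged sketch, not the repo's actual low-VRAM implementation: common
# diffusers options for cutting peak SDXL VRAM use.

def load_low_vram_pipeline(sd_model_path: str):
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        sd_model_path, torch_dtype=torch.float16
    )
    # Keep submodules on the CPU and move each one to the GPU only
    # while it is actually running.
    pipe.enable_model_cpu_offload()
    # Decode latents in slices to reduce the VAE's peak memory.
    pipe.enable_vae_slicing()
    return pipe
```

CPU offload trades speed for memory, which is why the low-VRAM script still needs a healthy amount of system RAM (30 GB in the test machine above).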

Z-YuPeng avatar May 06 '24 16:05 Z-YuPeng

The low-VRAM version is working very well on an RTX 4090 with 24 GB of VRAM. Very good work, team!

Speedway1 avatar May 06 '24 22:05 Speedway1

This version is incredible, and so much faster. On a 4090, your first example (the 8-panel comic) now finishes in only 59 seconds.

SoftologyPro avatar May 07 '24 00:05 SoftologyPro

Thanks for your help. Very good work!

xiaohui09 avatar May 07 '24 00:05 xiaohui09

One quick issue with the low-VRAM script: use_safetensors is False, so it does not auto-download the models.

pipe = StableDiffusionXLPipeline.from_pretrained(sd_model_path, torch_dtype=torch.float16, use_safetensors=False)

SoftologyPro avatar May 07 '24 03:05 SoftologyPro

Thanks for your feedback. I set use_safetensors = False because that repo does not have safetensors files; I believe it will still download automatically. Have you confirmed that it does not auto-download?

https://huggingface.co/stablediffusionapi/sdxl-unstable-diffusers-y/tree/main/unet
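
Since some checkpoints in the model list ship only `.safetensors` files and others (like the repo linked above) ship only `.bin` files, one hedged way to handle both without a hard-coded flag is a fallback; this is a sketch, not the repo's actual code:

```python
# Hedged sketch: prefer safetensors when the repo provides them, fall back
# to the legacy .bin format otherwise. `model_path` is a placeholder.

def load_pipeline(model_path: str):
    import torch
    from diffusers import StableDiffusionXLPipeline

    try:
        # Works for repos that publish .safetensors weights.
        return StableDiffusionXLPipeline.from_pretrained(
            model_path, torch_dtype=torch.float16, use_safetensors=True
        )
    except OSError:
        # Repos that only publish .bin weights raise OSError above;
        # retry with safetensors disabled.
        return StableDiffusionXLPipeline.from_pretrained(
            model_path, torch_dtype=torch.float16, use_safetensors=False
        )
```

The `OSError: Could not find the necessary safetensors weights ...` message reported below is exactly the failure the `except` branch would catch.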

Z-YuPeng avatar May 07 '24 03:05 Z-YuPeng

I load the Gradio app (low VRAM), click the first example, change sd_type to Juggernaut, and click Generate. It gives this error:

OSError: Could not find the necessary safetensors weights in {'text_encoder_2/pytorch_model.bin', 'text_encoder/pytorch_model.bin', 'unet/diffusion_pytorch_model.bin', 'vae/diffusion_pytorch_model.bin'} (variant=None)

Just to be sure, I then deleted the hub\models--RunDiffusion--Juggernaut-XL-v8 directory from my cache and tried again. Same error. Then I changed line 450 to

pipe = StableDiffusionXLPipeline.from_pretrained(sd_model_path, torch_dtype=torch.float16, use_safetensors=True)

and tried again. Same error:

OSError: Could not find the necessary safetensors weights in {'vae/diffusion_pytorch_model.bin', 'text_encoder/pytorch_model.bin', 'text_encoder_2/pytorch_model.bin', 'unet/diffusion_pytorch_model.bin'} (variant=None)

So it does not seem to work with Juggernaut whichever safetensors option is used.

SoftologyPro avatar May 07 '24 03:05 SoftologyPro

@SoftologyPro Hi, the Juggernaut repo does not have safetensors files. I have changed the code back to use_safetensors = False, and it works now.

Z-YuPeng avatar May 08 '24 06:05 Z-YuPeng

The latest code has now been tested on all four models and no longer reports errors. Please pull the latest code and run "python gradio_app_sdxl_specific_id_low_vram.py".

Z-YuPeng avatar May 08 '24 06:05 Z-YuPeng

It seems there is an issue with the Juggernaut weights and it cannot generate normally, so it has been removed from the model list for now.

Z-YuPeng avatar May 08 '24 06:05 Z-YuPeng

OK, Juggernaut did download fine when I just tested. The first example did get a poor result with Juggernaut, though... (image attached)

SoftologyPro avatar May 08 '24 06:05 SoftologyPro

@SoftologyPro You are right. Perhaps there was an error in the fp16 precision conversion when loading Juggernaut... I have found a solution and will update the code soon. https://huggingface.co/RunDiffusion/Juggernaut-XL-v9/discussions/6
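
One common workaround for fp16 artifacts with SDXL checkpoints is to swap in a VAE finetuned to be numerically stable in float16. This is a hedged sketch of that general technique, not the fix actually committed to the repo:

```python
# Hedged sketch: pair an SDXL checkpoint with the fp16-stable VAE from
# madebyollin/sdxl-vae-fp16-fix, a common remedy when half-precision
# decoding produces garbled images. `model_path` is a placeholder.

def load_with_fp16_safe_vae(model_path: str):
    import torch
    from diffusers import AutoencoderKL, StableDiffusionXLPipeline

    # VAE trained to avoid NaN/overflow issues in float16.
    vae = AutoencoderKL.from_pretrained(
        "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
    )
    return StableDiffusionXLPipeline.from_pretrained(
        model_path, torch_dtype=torch.float16, vae=vae, use_safetensors=False
    )
```

Whether this matches the solution in the linked Juggernaut discussion is an assumption; the thread above only confirms a precision-conversion problem was involved.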

Z-YuPeng avatar May 08 '24 06:05 Z-YuPeng

Juggernaut works now. Thanks.

(image attached)

One minor issue I suggested before: add a 4th tip explaining that adding * to the end of a prompt hides the caption. Otherwise nobody knows how to stop the caption from appearing if they want to.

Also add a tip about setting your own caption after a # symbol in each prompt.
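
As a sketch only (this is not the repo's actual parser), the two prompt conventions described above could be handled like this:

```python
# Hedged sketch of the prompt markers mentioned above: a trailing "*"
# hides the caption, and text after "#" replaces the auto caption.

def parse_prompt(prompt: str):
    caption = None
    show_caption = True
    if "#" in prompt:
        # Everything after "#" becomes a user-supplied caption.
        prompt, caption = (s.strip() for s in prompt.split("#", 1))
    if prompt.endswith("*"):
        # A trailing "*" suppresses the caption entirely.
        prompt = prompt[:-1].rstrip()
        show_caption = False
    return prompt, caption if show_caption else None

# parse_prompt("walking in the park #A sunny day")
#   -> ("walking in the park", "A sunny day")
# parse_prompt("at home, reading *") -> ("at home, reading", None)
```

Documenting both markers in the UI tips would make this behavior discoverable without reading the source.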

SoftologyPro avatar May 08 '24 23:05 SoftologyPro