OpenMusic

A30 with 24G Memory is not enough?

Open 8600862 opened this issue 1 year ago • 2 comments

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 23.53 GiB of which 6.06 MiB is free.

8600862 avatar Sep 26 '24 07:09 8600862

Yeah, the memory needed is very close to 24GB. You can try moving the VAE, HiFi-GAN, and Flan-T5 parts to the CPU and leaving the MDT part on the GPU. If the out-of-memory error still occurs, I will update my code for 24GB inference this weekend~

ivcylc avatar Sep 26 '24 07:09 ivcylc

thanks

8600862 avatar Sep 26 '24 08:09 8600862

Will I be able to run python gradio/gradio_app.py on an NVIDIA RTX 4080 Super? No extra training, just the bare minimum.

For me, GPU usage continued until the sampler reached 20%, then stopped (no more GPU usage), but one CPU core has stayed at 100% for quite a while now. No error is thrown. In Task Manager, GPU memory is at 15.7/16 GB and RAM is at 54/128 GB; inside WSL, 34 GB is free out of 64 GB.

....
Running DDIM Sampling with 200 timesteps
DDIM Sampler:   0%|                                                                                                                                                                                        | 0/200 [00:00<?, ?it/s]The input shape to the diffusion model is as follows:
xc torch.Size([3, 8, 256, 16])
t torch.Size([3])
context_0 torch.Size([3, 1, 1024]) torch.Size([3, 1])
DDIM Sampler:  20%|███████████████████████████████████▍   

Running in a Python 3.10 conda env with all requirements installed and pip conflicts resolved, in WSL (Fedora Remix) on Windows 11.

"You can try to move the VAE, Hifi-gan, Flan-t5 part on CPU, leave MDT part on GPU." How can I do this? Thank you.

In Event Viewer I found: nvlddmkm \Device\00000148 Error occurred on GPUID: 900

But no error was thrown while the script was running.
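
On the question of how to move the VAE, HiFi-GAN and Flan-T5 parts to the CPU while keeping the MDT part on the GPU: a minimal sketch of the general idea is below. It is not the repository's own code. The helper names (`resolve`, `split_across_devices`) and any attribute names passed to them are hypothetical; the real submodule names have to be looked up in the OpenMusic model class. The only assumption about the model is that each submodule has a `.to(device)` method, as every `torch.nn.Module` does.

```python
# Sketch only: helper names are illustrative, and the attribute names you
# pass in must be replaced with the real submodule names from the model.

def resolve(root, dotted_name):
    """Follow a dotted attribute path such as 'vae.decoder' starting at root."""
    obj = root
    for part in dotted_name.split("."):
        obj = getattr(obj, part)
    return obj


def split_across_devices(model, cpu_parts, gpu_parts, gpu_device="cuda:0"):
    """Move the named submodules to the CPU or the GPU in place.

    Works with any object whose submodules expose .to(device), which is
    true of torch.nn.Module. Returns a {name: device} map for logging.
    """
    placement = {}
    for name in cpu_parts:
        resolve(model, name).to("cpu")
        placement[name] = "cpu"
    for name in gpu_parts:
        resolve(model, name).to(gpu_device)
        placement[name] = gpu_device
    return placement
```

With a loaded model this might be called as `split_across_devices(model, ["vae", "vocoder", "text_encoder"], ["mdt"])`, where all four names are placeholders. Tensors that cross the boundary also have to be moved explicitly, for example with `latents.to("cpu")` before decoding on the CPU, and the CPU-side parts will run more slowly than they would on the GPU.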