
Got CUDA error while creating new avatar from MP4 video

Open shekharmeena2896 opened this issue 8 months ago • 2 comments

Hi, I tried creating a new avatar using my own video and audio. I changed the paths in realtime.yaml; below is the section I modified:

    avator_1:
      preparation: True  # you can set it to False if you want to reuse an existing avatar, it will save time
      bbox_shift: 5
      video_path: "data/video/preview_video_talk_3.mp4"
      audio_clips:
        audio_0: "data/audio/yongen.wav"
        audio_1: "data/audio/ash_eng.mp3"
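As an aside, a quick way to catch path mistakes before a long preparation run is to check that every file the avatar entry references actually exists. This snippet is not part of MuseTalk; it just transcribes the entry above as a dict (keeping the repo's `avator_1` spelling) and reports missing files up front:

```python
from pathlib import Path

# The avatar entry from realtime.yaml, transcribed as a dict for illustration.
avator_1 = {
    "preparation": True,
    "bbox_shift": 5,
    "video_path": "data/video/preview_video_talk_3.mp4",
    "audio_clips": {
        "audio_0": "data/audio/yongen.wav",
        "audio_1": "data/audio/ash_eng.mp3",
    },
}

# Collect every media path the entry points at.
paths = [avator_1["video_path"], *avator_1["audio_clips"].values()]

# Report anything missing instead of failing mid-inference.
missing = [p for p in paths if not Path(p).exists()]
print("missing files:", missing)
```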

This is the error I got:

    100%|███████████████████████████████████████████████████████████████████████| 162/162 [00:01<00:00, 81.40it/s]
    get key_landmark and face bounding boxes with the default value
    100%|███████████████████████████████████████████████████████████████████████| 162/162 [00:19<00:00,  8.13it/s]
    bbox_shift parameter adjustment**************
    Total frame:「162」 Manually adjust range : [ -10~9 ] , the current value: 0


    100%|███████████████████████████████████████████████████████████████████████| 324/324 [00:30<00:00, 10.76it/s]
    Inferring using: data/audio/yongen.wav
    start inference
    processing audio:data/audio/yongen.wav costs 1942.0125484466553ms
    200
      0%|                                                                       | 0/10 [00:03<?, ?it/s]
    Traceback (most recent call last):
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/runpy.py", line 196, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/runpy.py", line 86, in _run_code
        exec(code, run_globals)
      File "/home/ubuntu/musetalker/MuseTalk/scripts/realtime_inference.py", line 405, in <module>
        avatar.inference(audio_path,
      File "/home/ubuntu/musetalker/MuseTalk/scripts/realtime_inference.py", line 278, in inference
        recon = vae.decode_latents(pred_latents)
      File "/home/ubuntu/musetalker/MuseTalk/musetalk/models/vae.py", line 103, in decode_latents
        image = self.vae.decode(latents.to(self.vae.dtype)).sample
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
        return method(self, *args, **kwargs)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/diffusers/models/autoencoders/autoencoder_kl.py", line 321, in decode
        decoded = self._decode(z).sample
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/diffusers/models/autoencoders/autoencoder_kl.py", line 292, in _decode
        dec = self.decoder(z)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/diffusers/models/autoencoders/vae.py", line 337, in forward
        sample = up_block(sample, latent_embeds)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/diffusers/models/unets/unet_2d_blocks.py", line 2750, in forward
        hidden_states = upsampler(hidden_states)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/diffusers/models/upsampling.py", line 180, in forward
        hidden_states = self.conv(hidden_states)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "/root/anaconda3/envs/MuseTalk/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 640.00 MiB (GPU 0; 15.77 GiB total capacity; 13.41 GiB already allocated; 628.19 MiB free; 13.77 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
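For what it's worth, the error message itself suggests one low-effort mitigation: setting `PYTORCH_CUDA_ALLOC_CONF` to cap the allocator's block size and reduce fragmentation. A minimal sketch (the value 128 MiB is just an example, not a recommendation from the MuseTalk maintainers, and the variable must be set before the first CUDA allocation):

```python
import os

# Must be set before torch touches the GPU for the first time,
# e.g. at the very top of the script or in the shell environment.
# 128 is an example value in MiB.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Note the caveat in the message: this helps most when reserved memory is much larger than allocated memory. Here 13.41 GiB is genuinely allocated on a 15.77 GiB GPU, so shrinking the workload (e.g. a smaller inference batch) is likely the more effective fix.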

shekharmeena2896, Apr 15 '25 09:04

Hi @shekharmeena2896,

You can try setting this argument to a lower value :)
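For context on why a lower batch value helps: the crash occurs in `vae.decode_latents`, where a whole batch of latents is decoded at once, so peak activation memory scales roughly with the batch size. The same idea can be sketched generically as decoding in smaller chunks; `decode_fn` and the helper below are hypothetical illustrations, not MuseTalk's actual API:

```python
def decode_in_chunks(decode_fn, latents, chunk_size=2):
    """Run decode_fn over small slices of `latents` to cap peak memory.

    decode_fn: callable mapping a list of latents to a list of frames.
    """
    frames = []
    for i in range(0, len(latents), chunk_size):
        # Each call only holds chunk_size items' worth of activations.
        frames.extend(decode_fn(latents[i:i + chunk_size]))
    return frames

# Toy stand-in for a VAE decoder: "decodes" a latent by doubling it.
fake_decode = lambda batch: [x * 2 for x in batch]
print(decode_in_chunks(fake_decode, [1, 2, 3, 4, 5]))  # [2, 4, 6, 8, 10]
```

The trade-off is wall-clock time: more, smaller decode calls are slower than one large one, but each call needs proportionally less GPU memory.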

zzzweakman, Apr 17 '25 09:04