
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.53 GiB

Open cosmiclantern opened this issue 1 year ago • 2 comments

Hi, I'm running the extension in Auto1111 from Colab, so I can't change any startup flags.

It runs through all the frames for half an hour, then says out of memory and doesn't write the video file.

Can you tell me how I can fix this?

/content/gdrive/MyDrive/sd/stable-diffusion-webui/extensions/SadTalker/checkpoints/auido2pose_00140-model.pth
/content/gdrive/MyDrive/sd/stable-diffusion-webui/extensions/SadTalker/checkpoints/shape_predictor_68_face_landmarks.dat
Downloading: "https://github.com/xinntao/facexlib/releases/download/v0.1.0/alignment_WFLW_4HG.pth" to /usr/local/lib/python3.9/dist-packages/facexlib/weights/alignment_WFLW_4HG.pth

100% 185M/185M [00:02<00:00, 82.5MB/s]
/content/gdrive/MyDrive/sd/stable-diffusion-webui/extensions/SadTalker/checkpoints/facevid2vid_00189-model.pth.tar
/tmp/tmpsrsgb4d_.png
landmark Det: 100% 1/1 [00:10<00:00, 10.31s/it]
3DMM Extraction In Video: 100% 1/1 [00:00<00:00, 2.44it/s]
mel: 100% 6177/6177 [00:00<00:00, 57243.45it/s]
audio2exp: 100% 618/618 [00:01<00:00, 447.95it/s]
Face Renderer: 100% 3089/3089 [27:43<00:00, 1.86it/s]

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1075, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 884, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/call_queue.py", line 15, in f
    res = func(*args, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/extensions/SadTalker/src/gradio_demo.py", line 122, in test
    return_path = self.animate_from_coeff.generate(data, save_dir, pic_path, crop_info, enhancer='gfpgan' if use_enhancer else None, preprocess=preprocess)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/extensions/SadTalker/src/facerender/animate.py", line 146, in generate
    predictions_video = make_animation(source_image, source_semantics, target_semantics,
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/extensions/SadTalker/src/facerender/modules/make_animation.py", line 138, in make_animation
    predictions_ts = torch.stack(predictions, dim=1)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.53 GiB (GPU 0; 14.75 GiB total capacity; 8.96 GiB already allocated; 4.22 GiB free; 9.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
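The traceback itself suggests one mitigation: reserved memory (9.21 GiB) exceeds allocated memory (8.96 GiB), so setting max_split_size_mb may reduce fragmentation. On Colab, where startup flags can't be changed, the environment variable can be exported from a notebook cell before the WebUI launches (the value 128 below is just a common starting point, not a tuned recommendation):

```python
import os

# Must run before PyTorch makes its first CUDA allocation,
# i.e. in the cell that launches the WebUI, not afterwards.
# 128 MB is an illustrative starting value; tune as needed.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

Note this only helps with fragmentation; it won't save you if the total allocation genuinely exceeds GPU memory.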

cosmiclantern avatar Apr 13 '23 00:04 cosmiclantern

Hi, what is the size of your image?

vinthony avatar Apr 14 '23 05:04 vinthony

Hi, sorry it's taken so long to reply. The image I was using was a 1511 x 2467, 6.66 MB PNG file. Is that too large?
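If the source image is the problem, downscaling it before running SadTalker is an easy thing to try. A minimal helper for computing a size that fits within a maximum side length while preserving aspect ratio (the 512 px default is an assumed target, not a documented SadTalker limit):

```python
def downscale_size(width, height, max_side=512):
    """Return (w, h) fitting within max_side on the longest edge,
    preserving aspect ratio; never upscales."""
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale)
```

For the 1511 x 2467 image above this gives 314 x 512; the actual resize can then be done with any image tool, e.g. Pillow's Image.resize.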

Thanks for replying.

cosmiclantern avatar Apr 17 '23 17:04 cosmiclantern

I encountered the same error; the OOM is caused by torch.stack in make_animation.py. A workaround is to split the audio file into chunks, run inference on each one, then merge all the results.

hemslo avatar May 11 '23 13:05 hemslo

You may refer to the OOM errors section of this document: https://github.com/OpenTalker/SadTalker/blob/d3b2727b4afb4944b9185715d868d01e200e831a/docs/FAQ.md

vinthony avatar May 11 '23 17:05 vinthony

@vinthony that one doesn't help with the OOM at this step; too many tensors are stacked at once. New allocations grow at roughly 1 GB per minute of audio.

hemslo avatar May 11 '23 17:05 hemslo