
Out-of-memory error during execution of SVD workflow

Open · rybandrei2014 opened this issue 10 months ago · 1 comment

Hello,

I am getting an out-of-memory exception when trying to run the following SVD workflow:

(screenshot of the workflow)

workflow(5).json

Here is the output from the command line:

got prompt
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Building a Downsample layer with 2 dims.
  --> settings are:
 in-chn: 320, out-chn: 320, kernel-size: 3, stride: 2, padding: 1
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Building a Downsample layer with 2 dims.
  --> settings are:
 in-chn: 640, out-chn: 640, kernel-size: 3, stride: 2, padding: 1
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Building a Downsample layer with 2 dims.
  --> settings are:
 in-chn: 1280, out-chn: 1280, kernel-size: 3, stride: 2, padding: 1
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Loaded ViT-H-14 model config.
Loading pretrained ViT-H-14 weights (laion2b_s32b_b79k).
Initialized embedder #0: FrozenOpenCLIPImagePredictionEmbedder with 683800065 params. Trainable: False
Initialized embedder #1: ConcatTimestepEmbedderND with 0 params. Trainable: False
Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Initialized embedder #3: VideoPredictionEmbedderWithEncoder with 83653863 params. Trainable: False
Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Skipping timestep embedding in ResBlock
making attention of type 'vanilla' with 512 in_channels
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Restored from C:\Users\ryban\Documents\stable-diffusion-ui\webui\models\svd\svd_xt.safetensors with 0 missing and 0 unexpected keys
WARNING: The conditioning frame you provided is not 576x1024. This leads to suboptimal performance as model was only trained on 576x1024. Consider increasing `cond_aug`.
C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\utils\checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\utils\checkpoint.py:90: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
!!! Exception during processing !!!
Traceback (most recent call last):
  File "C:\Users\ryban\Documents\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "C:\Users\ryban\Documents\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "C:\Users\ryban\Documents\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\nodes.py", line 190, in sample_video
    samples_z = model.sampler(denoiser, randn, cond=c, uc=uc)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\sampling.py", line 120, in __call__
    x = self.sampler_step(
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\sampling.py", line 99, in sampler_step
    denoised = self.denoise(x, denoiser, sigma_hat, cond, uc)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\sampling.py", line 55, in denoise
    denoised = denoiser(*self.guider.prepare_inputs(x, sigma, cond, uc))
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\nodes.py", line 186, in denoiser
    return model.denoiser(
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\denoiser.py", line 37, in forward
    network(input * c_in, c_noise, cond, **additional_model_inputs) * c_out
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\wrappers.py", line 28, in forward
    return self.diffusion_model(
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\video_model.py", line 484, in forward
    h = module(
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\openaimodel.py", line 91, in forward
    x = layer(x, emb, num_video_frames, image_only_indicator)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\video_model.py", line 69, in forward
    x = super().forward(x, emb)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\openaimodel.py", line 324, in forward
    return checkpoint(self._forward, x, emb)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\_dynamo\eval_frame.py", line 489, in _fn
    return fn(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\_dynamo\external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\utils\checkpoint.py", line 482, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\autograd\function.py", line 553, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\utils\checkpoint.py", line 261, in forward
    outputs = run_function(*args)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\openaimodel.py", line 336, in _forward
    h = self.in_layers(x)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
    input = module(input)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\util.py", line 276, in forward
    return super().forward(x.float()).type(x.dtype)
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\normalization.py", line 287, in forward
    return F.group_norm(
  File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\functional.py", line 2561, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
torch.cuda.OutOfMemoryError: Allocation on device 0 would exceed allowed memory. (out of memory)
Currently allocated     : 16.42 GiB
Requested               : 1.65 GiB
Device limit            : 11.00 GiB
Free (according to CUDA): 0 bytes
PyTorch limit (set by user-supplied memory fraction)
                        : 17179869184.00 GiB

Prompt executed in 206.97 seconds

Here are my PC specs:

  • Intel(R) Core(TM) i9-9900KF CPU @ 3.60GHz, 3600 Mhz, 8 Core(s), 16 Logical Processor(s)
  • RAM 32 GB
  • NVIDIA GeForce RTX 2080 Ti 11 GB
  • Microsoft Windows 10 Home 10.0.19045 Build 19045

Could you help me solve this problem?

Thank you in advance

rybandrei2014 · Apr 20 '24

Use this workflow instead: https://comfyanonymous.github.io/ComfyUI_examples/video/

comfyanonymous · Apr 21 '24

Thank you, it helped 👍

rybandrei2014 · May 03 '24
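
For anyone hitting the same error: the linked example uses ComfyUI's built-in SVD nodes (e.g. ImageOnlyCheckpointLoader and SVD_img2vid_Conditioning), which go through ComfyUI's own model management and weight offloading rather than the sgm code bundled with the ComfyUI-Stable-Video-Diffusion custom node, so peak VRAM use on an 11 GB card like the RTX 2080 Ti is much lower. If memory is still tight, ComfyUI can also be started in low-VRAM mode; a minimal sketch, assuming a standard source install:

python main.py --lowvram

--lowvram keeps only part of the model weights on the GPU at a time; --novram goes further and holds them in system RAM. Both trade speed for memory.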