Out of memory during execution of SVD workflow
Hello,

I am getting an out-of-memory exception when trying to run the following SVD workflow:
Here is the output from the command line:
got prompt
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Building a Downsample layer with 2 dims.
--> settings are:
in-chn: 320, out-chn: 320, kernel-size: 3, stride: 2, padding: 1
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Building a Downsample layer with 2 dims.
--> settings are:
in-chn: 640, out-chn: 640, kernel-size: 3, stride: 2, padding: 1
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Building a Downsample layer with 2 dims.
--> settings are:
in-chn: 1280, out-chn: 1280, kernel-size: 3, stride: 2, padding: 1
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Loaded ViT-H-14 model config.
Loading pretrained ViT-H-14 weights (laion2b_s32b_b79k).
Initialized embedder #0: FrozenOpenCLIPImagePredictionEmbedder with 683800065 params. Trainable: False
Initialized embedder #1: ConcatTimestepEmbedderND with 0 params. Trainable: False
Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Initialized embedder #3: VideoPredictionEmbedderWithEncoder with 83653863 params. Trainable: False
Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Skipping timestep embedding in ResBlock
making attention of type 'vanilla' with 512 in_channels
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Skipping timestep embedding in ResBlock
Restored from C:\Users\ryban\Documents\stable-diffusion-ui\webui\models\svd\svd_xt.safetensors with 0 missing and 0 unexpected keys
WARNING: The conditioning frame you provided is not 576x1024. This leads to suboptimal performance as model was only trained on 576x1024. Consider increasing `cond_aug`.
C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\utils\checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\utils\checkpoint.py:90: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn(
!!! Exception during processing !!!
Traceback (most recent call last):
File "C:\Users\ryban\Documents\ComfyUI\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "C:\Users\ryban\Documents\ComfyUI\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "C:\Users\ryban\Documents\ComfyUI\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\nodes.py", line 190, in sample_video
samples_z = model.sampler(denoiser, randn, cond=c, uc=uc)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\sampling.py", line 120, in __call__
x = self.sampler_step(
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\sampling.py", line 99, in sampler_step
denoised = self.denoise(x, denoiser, sigma_hat, cond, uc)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\sampling.py", line 55, in denoise
denoised = denoiser(*self.guider.prepare_inputs(x, sigma, cond, uc))
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\nodes.py", line 186, in denoiser
return model.denoiser(
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\denoiser.py", line 37, in forward
network(input * c_in, c_noise, cond, **additional_model_inputs) * c_out
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\wrappers.py", line 28, in forward
return self.diffusion_model(
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\video_model.py", line 484, in forward
h = module(
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\openaimodel.py", line 91, in forward
x = layer(x, emb, num_video_frames, image_only_indicator)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\video_model.py", line 69, in forward
x = super().forward(x, emb)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\openaimodel.py", line 324, in forward
return checkpoint(self._forward, x, emb)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\_compile.py", line 24, in inner
return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\_dynamo\eval_frame.py", line 489, in _fn
return fn(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\_dynamo\external_utils.py", line 17, in inner
return fn(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\utils\checkpoint.py", line 482, in checkpoint
return CheckpointFunction.apply(function, preserve, *args)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\autograd\function.py", line 553, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\utils\checkpoint.py", line 261, in forward
outputs = run_function(*args)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\openaimodel.py", line 336, in _forward
h = self.in_layers(x)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
input = module(input)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\ryban\Documents\ComfyUI\custom_nodes\ComfyUI-Stable-Video-Diffusion\libs\sgm\modules\diffusionmodules\util.py", line 276, in forward
return super().forward(x.float()).type(x.dtype)
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\modules\normalization.py", line 287, in forward
return F.group_norm(
File "C:\Users\ryban\Documents\ComfyUI\venv\lib\site-packages\torch\nn\functional.py", line 2561, in group_norm
return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
torch.cuda.OutOfMemoryError: Allocation on device 0 would exceed allowed memory. (out of memory)
Currently allocated : 16.42 GiB
Requested : 1.65 GiB
Device limit : 11.00 GiB
Free (according to CUDA): 0 bytes
PyTorch limit (set by user-supplied memory fraction): 17179869184.00 GiB
Prompt executed in 206.97 seconds
Here are my PC specs:
- Intel(R) Core(TM) i9-9900KF CPU @ 3.60GHz, 3600 MHz, 8 cores, 16 logical processors
- RAM 32 GB
- NVIDIA GeForce RTX 2080 Ti 11 GB
- Microsoft Windows 10 Home 10.0.19045 Build 19045
Could you help me solve this problem?

Thank you in advance.
Use this workflow instead: https://comfyanonymous.github.io/ComfyUI_examples/video/

The built-in implementation uses ComfyUI's own model management (fp16 weights and automatic offloading), so it is far less likely to run out of memory than this custom node, which loads the full sgm pipeline on its own.
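If the official workflow still runs out of memory on the 2080 Ti, it can help to see what the CUDA allocator is actually holding before you queue a prompt. Below is a minimal diagnostic sketch, not part of ComfyUI itself, assuming a single GPU at `cuda:0` and the same `torch` install as your venv; it reports the same quantities the `OutOfMemoryError` above is built from:

```python
# Minimal VRAM diagnostic (assumption: a single CUDA GPU at index 0).
import torch

device = torch.device("cuda:0")

free, total = torch.cuda.mem_get_info(device)    # free/total bytes per the driver
allocated = torch.cuda.memory_allocated(device)  # bytes held by live tensors
reserved = torch.cuda.memory_reserved(device)    # bytes cached by PyTorch's allocator

gib = 2**30
print(f"total     : {total / gib:6.2f} GiB")
print(f"free      : {free / gib:6.2f} GiB")
print(f"allocated : {allocated / gib:6.2f} GiB")
print(f"reserved  : {reserved / gib:6.2f} GiB")

# empty_cache() returns cached-but-unused blocks to the driver; it cannot
# shrink what loaded model weights themselves occupy.
torch.cuda.empty_cache()
```

ComfyUI also ships offloading flags (e.g. `python main.py --lowvram`, or `--novram` as a last resort) that trade speed for VRAM; combined with the official workflow and fewer or smaller frames, that is usually enough to run svd_xt on an 11 GB card.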
Thank you, it helped 👍