What are the GPU requirements for the VACE 14B model?
I'm currently working with the 14B version of the VACE model on an NVIDIA L20 GPU with 48GB of VRAM. I've already enabled model offloading and moved the T5 text encoder to the CPU, but I'm still hitting out-of-memory (OOM) errors.
What are the specific GPU requirements for the 14B version of the VACE model? Also, if there are any GPU-related configurations or additional optimizations that can help mitigate the OOM problem, please share them. Thank you!
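For context on why even 48GB can OOM, here is a back-of-envelope estimate (my own illustrative arithmetic, not an official figure): the 14B DiT's weights alone in bf16 take roughly 28 GB, before the T5 encoder, VAE, activations, and CUDA workspace are counted.

```python
def weight_gb(num_params: float, bytes_per_param: float = 2) -> float:
    """VRAM (in decimal GB) needed just to hold the parameters.

    bf16/fp16 use 2 bytes per parameter; activations, the T5 encoder,
    the VAE, and CUDA overhead all come on top of this.
    """
    return num_params * bytes_per_param / 1e9

# The 14B DiT alone in bf16:
print(weight_gb(14e9))  # 28.0
```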
Me too. I'm using an A800 with 80GB. Another program on the card is using 24GB, which still leaves 57GB free, yet I also get a CUDA OOM error.
I now run VACE with DiffSynth. By modifying and adding a few configurations, you can run the 14B model successfully in under 40GB of VRAM.
- First, install diffsynth:
git clone https://github.com/modelscope/DiffSynth-Studio.git
cd DiffSynth-Studio
pip install -e .
- In [diffsynth/configs/model_config.py]:
- modify line 63 to:
from ..models.wan_video_vace import VaceWanModel, VaceWanModel14B
- add at line 130:
(None, "7a513e1f257a861512b1afd387a8ecd9", ["wan_video_dit", "wan_video_vace"], [WanModel, VaceWanModel14B], "civitai"),
- In [diffsynth/models/wan_video_vace.py], add at the end of the file:
class VaceWanModel14B(VaceWanModel):
    def __init__(
        self,
        vace_layers=(0, 5, 10, 15, 20, 25, 30, 35),
        vace_in_dim=96,
        patch_size=(1, 2, 2),
        has_image_input=False,
        dim=5120,
        num_heads=40,
        ffn_dim=13824,
        eps=1e-6,
    ):
        super().__init__(vace_layers, vace_in_dim, patch_size, has_image_input, dim, num_heads, ffn_dim, eps)
- Then you can run [DiffSynth-Studio\examples\wanvideo\wan_1.3b_vace.py] with these changes:
- point the model_manager paths at VACE-14B (use snapshot_download to fetch Wan2.1-VACE-14B and put it in the appropriate location);
- in pipe.enable_vram_management(num_persistent_param_in_dit=None), change None to 7*10**9;
- cd into wanvideo and run wan_1.3b_vace.py.
It really wants an RTX PRO 6000; all the GPUs listed here are dated by comparison, and China would kill for them (anyone would). Sadly, while they're affordable and accessible in America and Taiwan, getting one elsewhere will be nearly impossible. I know a few people who have one on campus and one testing at home, and they say it puts the fear of God in NVIDIA.
You can run almost any model on 20GB of VRAM as long as it's quantized, but that depends on someone generous producing the quantized weights.
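As a rough sanity check on that claim (my own arithmetic, ignoring quantization scales, zero-points, and runtime overhead), the weight footprint scales linearly with bit width:

```python
def quantized_weight_gb(num_params: float, bits: int) -> float:
    """Approximate weight size in decimal GB at a given bit width.

    Ignores per-group scales/zero-points and runtime overhead, so real
    quantized checkpoints are slightly larger than this estimate.
    """
    return num_params * bits / 8 / 1e9

for bits in (16, 8, 4):
    print(bits, quantized_weight_gb(14e9, bits))
# For a 14B model: 16-bit = 28 GB, 8-bit = 14 GB, 4-bit = 7 GB
```

At 4 bits, the 14B weights fit comfortably in 20GB, which is why quantized releases matter so much for this model.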
Great job on your progress so far! However, an issue comes up when following the steps to modify the model_manager for the VACE-14B model: "VACE-Wan2.1-1.3B-Preview" ships as a single file, "diffusion_pytorch_model.safetensors", while the downloaded "Wan2.1-VACE-14B" consists of seven sharded files. The code was deployed as follows:
model_manager.load_models(
    [
        # "/media/yons/WIN10/prog/VACE_models/diffusion_pytorch_model.safetensors.index.json",
        "models_weight/diffusion_pytorch_model-00001-of-00007.safetensors",
        "models_weight/diffusion_pytorch_model-00002-of-00007.safetensors",
        "models_weight/diffusion_pytorch_model-00003-of-00007.safetensors",
        "models_weight/diffusion_pytorch_model-00004-of-00007.safetensors",
        "models_weight/diffusion_pytorch_model-00005-of-00007.safetensors",
        "models_weight/diffusion_pytorch_model-00006-of-00007.safetensors",
        "models_weight/diffusion_pytorch_model-00007-of-00007.safetensors",
        "/media/yons/WIN10/prog/VACE_models/models_t5_umt5-xxl-enc-bf16.pth",
        "/media/yons/WIN10/prog/VACE_models/Wan2.1_VAE.pth",
    ],
    torch_dtype=torch.bfloat16,
)
But the execution result shows:
Loading models from: /media/yons/WIN10/prog/VACE_models/diffusion_pytorch_model-00001-of-00007.safetensors
We cannot detect the model type. No models are loaded.
Loading models from: /media/yons/WIN10/prog/VACE_models/diffusion_pytorch_model-00002-of-00007.safetensors
We cannot detect the model type. No models are loaded.
Loading models from: /media/yons/WIN10/prog/VACE_models/diffusion_pytorch_model-00003-of-00007.safetensors
We cannot detect the model type. No models are loaded.
Loading models from: /media/yons/WIN10/prog/VACE_models/diffusion_pytorch_model-00004-of-00007.safetensors
We cannot detect the model type. No models are loaded.
Loading models from: /media/yons/WIN10/prog/VACE_models/diffusion_pytorch_model-00005-of-00007.safetensors
We cannot detect the model type. No models are loaded.
Loading models from: /media/yons/WIN10/prog/VACE_models/diffusion_pytorch_model-00006-of-00007.safetensors
We cannot detect the model type. No models are loaded.
Loading models from: /media/yons/WIN10/prog/VACE_models/diffusion_pytorch_model-00007-of-00007.safetensors
We cannot detect the model type. No models are loaded.
Loading models from: /media/yons/WIN10/prog/VACE_models/models_t5_umt5-xxl-enc-bf16.pth
model_name: wan_video_text_encoder model_class: WanTextEncoder
The following models are loaded: ['wan_video_text_encoder'].
Loading models from: /media/yons/WIN10/prog/VACE_models/Wan2.1_VAE.pth
model_name: wan_video_vae model_class: WanVideoVAE
The following models are loaded: ['wan_video_vae'].
Using wan_video_text_encoder from /media/yons/WIN10/prog/VACE_models/models_t5_umt5-xxl-enc-bf16.pth.
No wan_video_dit models available.
Using wan_video_vae from /media/yons/WIN10/prog/VACE_models/Wan2.1_VAE.pth.
No wan_video_image_encoder models available.
No wan_video_motion_controller models available.
No wan_video_vace models available.
Traceback (most recent call last):
File "/media/yons/WIN10/prog/seamless_communication/VACE/DiffSynth_Studio/examples/wanvideo/wan_14b_vace.py", line 29, in <module>
pipe.enable_vram_management(num_persistent_param_in_dit=7*10**9)
File "/media/yons/WIN10/prog/seamless_communication/VACE/DiffSynth_Studio/diffsynth/pipelines/wan_video.py", line 63, in enable_vram_management
dtype = next(iter(self.dit.parameters())).dtype
AttributeError: 'NoneType' object has no attribute 'parameters'
The question is whether the 14B model files need to be merged into a single file or if there's a specific download source that should be used. Thank you very much for your help! @leiwang1023
After steps 0, 1, and 2 are finished: per DiffSynth's loading rules, the sharded DiT files need to be nested in their own list, like this:
model_manager.load_models(
    [
        # "/media/yons/WIN10/prog/VACE_models/diffusion_pytorch_model.safetensors.index.json",
        [
            "models_weight/diffusion_pytorch_model-00001-of-00007.safetensors",
            "models_weight/diffusion_pytorch_model-00002-of-00007.safetensors",
            "models_weight/diffusion_pytorch_model-00003-of-00007.safetensors",
            "models_weight/diffusion_pytorch_model-00004-of-00007.safetensors",
            "models_weight/diffusion_pytorch_model-00005-of-00007.safetensors",
            "models_weight/diffusion_pytorch_model-00006-of-00007.safetensors",
            "models_weight/diffusion_pytorch_model-00007-of-00007.safetensors",
        ],
        "/media/yons/WIN10/prog/VACE_models/models_t5_umt5-xxl-enc-bf16.pth",
        "/media/yons/WIN10/prog/VACE_models/Wan2.1_VAE.pth",
    ],
    torch_dtype=torch.bfloat16,
)
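A small helper can build that nested shard list instead of typing seven paths by hand. This is my own convenience sketch (the shard_paths name is mine, not a DiffSynth API); sorting the glob results keeps the -0000N-of-00007 order intact:

```python
from pathlib import Path

def shard_paths(model_dir: str,
                pattern: str = "diffusion_pytorch_model-*-of-*.safetensors") -> list:
    """Collect sharded safetensors files in index order.

    Returns a sorted list of path strings; pass it as ONE nested entry
    inside model_manager.load_models([...]) so the loader treats the
    shards as a single model rather than seven separate ones.
    """
    return sorted(str(p) for p in Path(model_dir).glob(pattern))
```

With it, the call above becomes model_manager.load_models([shard_paths("models_weight"), ...], torch_dtype=torch.bfloat16).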
I'd like to ask how long it takes to start an inference with your approach 🤔
Thanks for your help. I tried this but still got an error. Any suggestions? Thanks in advance.
model_manager = ModelManager(device="cpu")
model_manager.load_models(
    [
        # "models/Wan2.1-VACE-14B/diffusion_pytorch_model.safetensors.index.json",
        [
            "models/Wan2.1-VACE-14B/diffusion_pytorch_model-00001-of-00007.safetensors",
            "models/Wan2.1-VACE-14B/diffusion_pytorch_model-00002-of-00007.safetensors",
            "models/Wan2.1-VACE-14B/diffusion_pytorch_model-00003-of-00007.safetensors",
            "models/Wan2.1-VACE-14B/diffusion_pytorch_model-00004-of-00007.safetensors",
            "models/Wan2.1-VACE-14B/diffusion_pytorch_model-00005-of-00007.safetensors",
            "models/Wan2.1-VACE-14B/diffusion_pytorch_model-00006-of-00007.safetensors",
            "models/Wan2.1-VACE-14B/diffusion_pytorch_model-00007-of-00007.safetensors",
        ],
        "models/Wan2.1-VACE-14B/models_t5_umt5-xxl-enc-bf16.pth",
        "models/Wan2.1-VACE-14B/Wan2.1_VAE.pth",
    ],
    torch_dtype=torch.bfloat16,
)
I successfully ran the model on two H20 GPUs. Both cards were already running other programs using over 16GB of memory each, yet my peak memory usage reached 79,460 MB and 79,360 MB, respectively. Generation speed was approximately 28–40 seconds per iteration.
The command I used was: CUDA_VISIBLE_DEVICES=2,3 torchrun --nproc_per_node=2 vace/vace_wan_inference.py --dit_fsdp --t5_fsdp --ulysses_size 2 --ring_size 1 --ckpt_dir models/ --model_name vace-14B --src_video "benchmarks/assets/examples/gray/src_video.mp4" --prompt "镜头缓缓向右平移,身穿淡黄色坎肩长裙的长发女孩面对镜头露出灿烂的漏齿微笑。她的长发随风轻扬,眼神明亮而充满活力。背景是秋天红色和黄色的树叶,阳光透过树叶的缝隙洒下斑驳光影,营造出温馨自然的氛围。画面风格清新自然,仿佛夏日午后的一抹清凉。中景人像,强调自然光效和细腻的皮肤质感。"
(The prompt, in English: the camera slowly pans right; a long-haired girl in a pale-yellow sleeveless long dress faces the camera with a bright, toothy smile; her hair drifts in the wind and her eyes are bright and lively; the background is autumn red and yellow leaves with dappled sunlight filtering through the gaps, creating a warm, natural atmosphere; a fresh, natural visual style, like a touch of coolness on a summer afternoon; medium portrait shot, emphasizing natural lighting and fine skin texture.)
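One pitfall with the distributed flags above: in Wan2.1's sequence-parallel setup, the product of --ulysses_size and --ring_size is expected to match the number of launched processes (here 2 × 1 = 2 = --nproc_per_node). A tiny pre-flight check, assuming that constraint holds:

```python
def check_parallel_flags(nproc_per_node: int, ulysses_size: int, ring_size: int) -> bool:
    """True when the sequence-parallel degrees multiply to the world size."""
    return ulysses_size * ring_size == nproc_per_node

print(check_parallel_flags(2, 2, 1))  # True: matches the command above
print(check_parallel_flags(2, 2, 2))  # False: would over-partition the sequence
```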