Use DiffSynth-Studio to train an i2v model based on the wanx 1.3B t2v model

Open lith0613 opened this issue 7 months ago • 3 comments

Thank you for providing a very sleek and user-friendly diffusion framework. I'm currently trying to fine-tune the 14B i2v model, but I don't have enough VRAM. Is it possible to import the 1.3B t2v weights into this framework and then train an i2v model from them? I've noticed that `ModelManager(torch_dtype=torch.bfloat16, device="cpu")` is used to load pretrained weights into a predefined model structure, which doesn't seem convenient for defining a new model structure and then importing only part of its parameters. Here's the code I looked at:

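```python
import torch
from diffsynth import ModelManager, WanVideoPipeline

# Load the pretrained T2V checkpoints on the CPU in bfloat16.
model_manager = ModelManager(torch_dtype=torch.bfloat16, device="cpu")
model_manager.load_models([
    "Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
    "Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
    "Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
])

# Excerpt from inside a class method, hence `self`.
self.pipe = WanVideoPipeline.from_model_manager(model_manager)
```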

As far as I can tell, the framework only supports loading pretrained weights into a model whose structure is already defined. That makes it inconvenient to define a new structure first and then import a subset of parameters, for example defining an i2v structure and initializing it from t2v weights.
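To make the request concrete, here is a minimal sketch of the kind of partial loading I have in mind, written in plain PyTorch rather than with any DiffSynth-Studio API. Everything in it is hypothetical: `TinyT2V`, `TinyI2V`, and `load_partial` are toy stand-ins, and the channel counts (16 and 36) and the `img_emb` layer are placeholder assumptions, not the real Wan architecture. Tensors with matching shapes are copied, a weight whose input-channel dimension grew is zero-padded, and layers that exist only in the i2v structure keep their fresh initialization.

```python
import torch
import torch.nn as nn


class TinyT2V(nn.Module):
    """Hypothetical stand-in for a t2v backbone (not the real Wan model)."""

    def __init__(self, in_dim=16, dim=32):
        super().__init__()
        self.patch_embedding = nn.Conv3d(in_dim, dim, kernel_size=(1, 2, 2), stride=(1, 2, 2))
        self.block = nn.Linear(dim, dim)


class TinyI2V(TinyT2V):
    """Hypothetical i2v variant: wider input conv plus one new layer."""

    def __init__(self, in_dim=36, dim=32, img_dim=8):
        super().__init__(in_dim=in_dim, dim=dim)
        self.img_emb = nn.Linear(img_dim, dim)


def load_partial(target, source_state):
    """Copy what fits from source_state into target; report what happened."""
    target_state = target.state_dict()
    copied, expanded, skipped = [], [], []
    for name, src in source_state.items():
        if name not in target_state:
            skipped.append(name)
            continue
        dst = target_state[name]
        if dst.shape == src.shape:
            # Same shape: plain copy.
            target_state[name] = src.clone().to(dst.dtype)
            copied.append(name)
        elif (
            dst.dim() == src.dim() and dst.dim() >= 2
            and dst.shape[0] == src.shape[0]
            and dst.shape[2:] == src.shape[2:]
            and dst.shape[1] > src.shape[1]
        ):
            # Only the input-channel dim grew: keep the pretrained slice,
            # zero the weights for the extra channels.
            padded = torch.zeros_like(dst)
            padded[:, : src.shape[1]] = src.to(dst.dtype)
            target_state[name] = padded
            expanded.append(name)
        else:
            skipped.append(name)
    target.load_state_dict(target_state)
    # Parameters that exist only in the target keep their random init.
    new_params = [n for n in target_state if n not in source_state]
    return copied, expanded, skipped, new_params


t2v, i2v = TinyT2V(), TinyI2V()
copied, expanded, skipped, new_params = load_partial(i2v, t2v.state_dict())
print("copied:", copied)
print("expanded:", expanded)
print("new:", new_params)
```

With zero-padding, the widened input layer initially ignores the extra conditioning channels, so in this toy setup its output matches the t2v layer on the original channels until training updates the new weights. Whether that initialization is a good choice for the real model is something I have not verified.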

lith0613, Mar 10 '25 09:03