DiffSynth-Studio

Ran Wan2.1 VACE1.3B lora finetuning but have weird result

rickyyuan07 opened this issue 3 months ago · 1 comment

Hi,

I used LoRA to finetune the Wan2.1 VACE 1.3B model with the script examples/wanvideo/model_training/lora/Wan2.1-VACE-1.3B.sh. I trained on a single data pair with all default parameters from the training script. The inferred result shows a strange flickering effect. Could someone give me insight into what might be going wrong?

Input data:

Metadata:

video,prompt,vace_video,vace_video_mask
vid01.mp4,"a photorealistic, cinematic, high-fashion commercial, 360° video of a [V] toy rubber duck as the main product and other toys around it",vid01_vace_video.mp4,vid01_vace_video_mask.mp4
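Since the quoted prompt contains commas, one thing worth ruling out is a malformed metadata CSV. Here is a minimal, hypothetical sanity check (not part of DiffSynth-Studio) that parses the metadata with Python's stdlib `csv` module and verifies the columns the training script expects are present:

```python
import csv
import io

# The metadata CSV from this issue (a single training pair).
METADATA = """video,prompt,vace_video,vace_video_mask
vid01.mp4,"a photorealistic, cinematic, high-fashion commercial, 360° video of a [V] toy rubber duck as the main product and other toys around it",vid01_vace_video.mp4,vid01_vace_video_mask.mp4
"""

# Columns assumed required, based on the header row above.
REQUIRED = {"video", "prompt", "vace_video", "vace_video_mask"}

def check_metadata(text: str) -> list[dict]:
    """Parse metadata CSV and raise if a required column is missing."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        raise ValueError("metadata CSV has no data rows")
    missing = REQUIRED - set(rows[0].keys())
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return rows

rows = check_metadata(METADATA)
print(len(rows), rows[0]["video"])  # → 1 vid01.mp4
```

If this parses cleanly (one row, prompt intact despite the embedded commas), the metadata itself is unlikely to be the culprit.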

https://github.com/user-attachments/assets/995fb1a7-7905-491b-a24f-b95cb97727b1

https://github.com/user-attachments/assets/eb69efac-84a4-485c-9eba-8ae9e92451f6

https://github.com/user-attachments/assets/3fd453e6-305e-4113-841b-dd1233056698

Inferred result:

https://github.com/user-attachments/assets/e142a0b7-d341-4533-858c-8fca0d85c192

rickyyuan07 · Nov 20 '25

@Artiprocher Here we are training on a single sample and then running inference on that same sample to see whether the model can learn to inpaint the identity. However, as you can see, there is noticeable flickering, and the duck's beak has changed. What is the correct way to train the LoRA for personalized inpainting?

cs-mshah · Nov 20 '25