LoRA trained on Wan2.1-I2V-14B-480P does not work

Open nextl368 opened this issue 1 month ago • 2 comments

Version: 1.1.8

I am encountering an issue where a LoRA trained on the Wan2.1-I2V-14B-480P (Image-to-Video) base model does not produce the expected video output.

Successful Scenario (Working)

  • A LoRA was successfully trained using the Wan2.1-T2V-14B (Text-to-Video) base model.
  • The video generated using the T2V-trained LoRA exhibits the desired effect.
  • Crucially: This T2V-trained LoRA also works correctly when loaded onto the Wan2.1-I2V-14B-480P base model.

Problem Scenario (Not Working)

  • A LoRA was trained directly using the Wan2.1-I2V-14B-480P base model.
  • When this I2V-trained LoRA is applied, the generated video does not exhibit the desired effect.

Training and Data Details

  • Training Commands: I used the exact same command line given in the WanVideo examples section for both the T2V and I2V training runs.
  • Dataset: The dataset is identical for both training attempts. Each entry pairs an input image with a prompt that contains only the trigger keyword.
  • Metadata Example: I am deliberately omitting a specific descriptive prompt in the metadata.csv to focus on the style keyword:
video,prompt
training_image_1.png,"N0Y1V2R3S" 
training_image_2.png,"N0Y1V2R3S"
...
training_image_250.png,"N0Y1V2R3S"
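
A quick sanity check of the metadata file can rule out stray prompts or mixed file types before comparing the two training runs. The following is a minimal sketch using only the Python standard library; the function name `summarize_metadata` and the inline sample are illustrative assumptions, not part of DiffSynth-Studio.

```python
import csv
import io
from collections import Counter


def summarize_metadata(csv_text):
    """Summarize a metadata.csv that has 'video' and 'prompt' columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Strip whitespace so accidental padding around a value is ignored.
    rows = [(r["video"].strip(), r["prompt"].strip()) for r in reader]
    prompts = Counter(prompt for _, prompt in rows)
    # File extension of each entry (png, mp4, ...), lower-cased.
    extensions = Counter(name.rsplit(".", 1)[-1].lower() for name, _ in rows)
    return {
        "rows": len(rows),
        "prompts": dict(prompts),
        "extensions": dict(extensions),
    }


# Hypothetical two-row sample in the same layout as the metadata above.
sample = (
    "video,prompt\n"
    'training_image_1.png,"N0Y1V2R3S"\n'
    'training_image_2.png,"N0Y1V2R3S"\n'
)
print(summarize_metadata(sample))
```

For the real file, the text can be read with `open("metadata.csv", encoding="utf-8").read()` and passed to the same function; a single prompt value and a single extension are what the dataset described above should produce.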

ComfyUI Usage (Context, Not the Issue)

For testing, I used ComfyUI and made sure to load the LoRA via the WanVideo-specific nodes. (Note: the standard ComfyUI LoRA loading nodes fail because the key names in the safetensors file produced by DiffSynth-Studio differ from what they expect, but I have verified that this is not the root cause of the I2V training failure.)

Summary of the Problem

The core issue is that I2V-native LoRA training is failing to capture the desired style/effect, even though the same dataset, training command, and environment allow for successful T2V LoRA training that can then be applied to the I2V model.

nextl368, Nov 20 '25 10:11

Same here: it is not working for wan22-14B-Animate either. I get the warning below, and when I checked the gradients, all LoRA grads were None! UserWarning: None of the inputs have requires_grad=True. Gradients will be None
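
That warning text is typically emitted by PyTorch's activation checkpointing (`torch.utils.checkpoint`) when no input to a checkpointed segment requires gradients. A check like the one described can be sketched as below. It assumes a PyTorch-style model exposing `named_parameters()` and that LoRA parameter names contain the substring "lora"; both are assumptions about the training setup, and the function name is hypothetical.

```python
def report_lora_grads(model, keyword="lora"):
    """Group LoRA parameter names by gradient status.

    Intended to be called after loss.backward() and before
    optimizer.zero_grad(), so that .grad is still populated.
    """
    frozen, no_grad, ok = [], [], []
    for name, param in model.named_parameters():
        if keyword not in name.lower():
            continue  # not a LoRA parameter under the naming assumption
        if not param.requires_grad:
            frozen.append(name)   # never marked trainable
        elif param.grad is None:
            no_grad.append(name)  # trainable, but backward did not reach it
        else:
            ok.append(name)       # received a gradient
    return {"frozen": frozen, "no_grad": no_grad, "ok": ok}
```

If every LoRA parameter ends up under "frozen" or "no_grad", the optimizer has nothing to update, which would match the symptom of all LoRA grads being None.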

AshkanTaghipour, Nov 21 '25 01:11

> Same here: it is not working for wan22-14B-Animate either. I get the warning below, and when I checked the gradients, all LoRA grads were None! UserWarning: None of the inputs have requires_grad=True. Gradients will be None

I don’t think that applies in my case. I wrote a script to compare the LoRAs trained on the T2V and I2V base models. Structurally, they are identical: both files contain exactly the same set of keys. I also checked the tensor dtypes and dimensions, and they match. Then I inspected the tensors themselves for NaN or Inf values and found none. Finally, I checked the LoRA weights for unusual patterns, such as zero standard deviation or abnormally high magnitude. Everything looks normal.
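
For anyone who wants to repeat these checks, here is a minimal sketch of such a comparison. It is not the script used above. It parses the safetensors layout directly with the standard library (an 8-byte little-endian header length, a JSON header, then raw tensor bytes) and only decodes F32, F16, and BF16 tensors; all function names are hypothetical, and pure Python will be slow on large files, so NumPy or PyTorch would be the practical choice.

```python
import json
import math
import struct


def parse_safetensors(blob):
    """Return {name: (dtype, shape, values or None)} from raw file bytes."""
    (n,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + n])
    header.pop("__metadata__", None)  # optional free-form metadata entry
    data = blob[8 + n:]
    tensors = {}
    for name, info in header.items():
        begin, end = info["data_offsets"]
        raw, dtype = data[begin:end], info["dtype"]
        if dtype == "F32":
            values = struct.unpack(f"<{len(raw) // 4}f", raw)
        elif dtype == "F16":
            values = struct.unpack(f"<{len(raw) // 2}e", raw)
        elif dtype == "BF16":
            # bfloat16 is the upper 16 bits of a float32.
            words = struct.unpack(f"<{len(raw) // 2}H", raw)
            packed = struct.pack(f"<{len(words)}I", *(w << 16 for w in words))
            values = struct.unpack(f"<{len(words)}f", packed)
        else:
            values = None  # other dtypes: structure only, no value checks
        tensors[name] = (dtype, tuple(info["shape"]), values)
    return tensors


def tensor_stats(values):
    """Count non-finite entries; std and max |x| over the finite ones."""
    finite = [v for v in values if math.isfinite(v)]
    stats = {"non_finite": len(values) - len(finite), "std": 0.0, "max_abs": 0.0}
    if finite:
        mean = sum(finite) / len(finite)
        stats["std"] = math.sqrt(sum((v - mean) ** 2 for v in finite) / len(finite))
        stats["max_abs"] = max(abs(v) for v in finite)
    return stats


def compare_loras(blob_a, blob_b):
    """Compare keys, dtypes, and shapes, and flag NaN/Inf or zero-std tensors."""
    a, b = parse_safetensors(blob_a), parse_safetensors(blob_b)
    report = {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "mismatched": sorted(k for k in set(a) & set(b) if a[k][:2] != b[k][:2]),
        "suspicious": {},
    }
    for label, tensors in (("a", a), ("b", b)):
        for name, (_, _, values) in tensors.items():
            if values is None:
                continue
            stats = tensor_stats(values)
            if stats["non_finite"] or stats["std"] == 0.0:
                report["suspicious"][f"{label}:{name}"] = stats
    return report


def read_file(path):
    """Read a whole file as bytes."""
    with open(path, "rb") as f:
        return f.read()
```

Usage would look like `compare_loras(read_file("t2v_lora.safetensors"), read_file("i2v_lora.safetensors"))`, where both file names are placeholders. The zero-std check matters because the standard LoRA initialization sets the B (up) matrices to zero, so a B tensor that is still constant after training would suggest it was never updated.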

To ensure ComfyUI wasn’t causing the problem, I also ran inference using the commands provided in the LoRA Training Validation table. The issue persists: the LoRA trained on the I2V base model produces no noticeable effect.

nextl368, Nov 21 '25 13:11