FastVideo [Feature] Full Training for VSA

Motivation

Can I perform full training for VSA rather than fine-tuning? If so, how should I modify the scripts or training code? Looking forward to your reply.

Jul 14 '25 13:07 clytze0216

Did you mean pre-training?

Jul 14 '25 21:07 BrianChen1129

Yes. I don't want to fine-tune Wan-1.3B; I want to train it from scratch. How can I achieve this?

Jul 15 '25 02:07 clytze0216

It’s not supported yet, but we plan to add it in the future.

Jul 15 '25 09:07 BrianChen1129

Can you give me some ideas on how to pre-train with VSA? Thanks a lot!

Jul 17 '25 03:07 clytze0216

Pre-training diffusion models places much higher demands on data than fine-tuning, and it requires either image data or starting from an image model checkpoint. The Wan or HunyuanVideo technical reports may be of use:

https://arxiv.org/pdf/2503.20314
https://arxiv.org/pdf/2412.03603

Aug 29 '25 04:08 SolitaryThinker