CogVideo
Finetune img2video based on T2V model
Thank you very much for your work. We tried to finetune the img2video model on our own dataset, but most of the generated videos turn out nearly static. Specifically, when we use driving videos, the output is often a video in which the ego-vehicle viewpoint stays stationary instead of moving forward through the scene.
Did you encounter a similar issue during your finetuning process? Or could this be because most current models are trained on videos with a fixed camera view?