Video extension

Open HaoZhang990127 opened this issue 1 year ago • 9 comments

Feature request

Hi,

The README mentions video continuation. Does it use I2V to extend a video generated by T2V? Do you have any other suggestions for video extension?

Motivation

Video extension

Your contribution

Video extension

HaoZhang990127 avatar Sep 25 '24 03:09 HaoZhang990127

This is not handled at the model level. You can check the implementation of CogVideoXVideoToVideoPipeline in diffusers.

zRzRzRzRzRzRzR avatar Sep 25 '24 04:09 zRzRzRzRzRzRzR

Thank you for your reply. As I understand it, CogVideoXVideoToVideoPipeline is used to change the style of a video; it does not extend the video's duration. If I want a 12-second or longer video, do you have any suggestions? Or is my understanding wrong?

HaoZhang990127 avatar Sep 25 '24 05:09 HaoZhang990127

Hi @HaoZhang990127 ,

Currently, all CogVideoX models can generate videos of at most 6 seconds, at fps=8.
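As a rough illustration of what that limit means for a longer target such as 12 seconds, the arithmetic can be sketched as below. The function names are hypothetical, and the sketch simply assumes that a longer video is assembled from several clips of at most 6 seconds each; it does not describe any built-in CogVideoX feature.

```python
import math

FPS = 8           # frame rate quoted in this thread
CLIP_SECONDS = 6  # per-clip limit quoted in this thread


def frames_in_clip(seconds: int = CLIP_SECONDS, fps: int = FPS) -> int:
    """Nominal number of frames in one clip (duration times frame rate)."""
    return seconds * fps


def clips_needed(target_seconds: float, clip_seconds: int = CLIP_SECONDS) -> int:
    """Number of clips of at most `clip_seconds` needed to cover the target."""
    return math.ceil(target_seconds / clip_seconds)


print(frames_in_clip())  # nominal frames per 6-second clip at 8 fps
print(clips_needed(12))  # clips needed for a 12-second video
```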

Neethan54 avatar Sep 25 '24 07:09 Neethan54

ok, thank you for your reply.

HaoZhang990127 avatar Sep 25 '24 07:09 HaoZhang990127

@HaoZhang990127 How long does it take you to generate a 6-second video?

Neethan54 avatar Sep 25 '24 07:09 Neethan54

Yes, this actually generates the next 6-second video. Each individual video does not exceed 6 seconds.

zRzRzRzRzRzRzR avatar Sep 25 '24 08:09 zRzRzRzRzRzRzR

@HaoZhang990127 How long does it take you to generate a 6-second video?

Around 5 minutes on an H800.

HaoZhang990127 avatar Sep 25 '24 12:09 HaoZhang990127

I modified app.py using the code below as a reference. An image is taken from any frame of the video, a new video is generated from it with I2V, and the videos are then spliced together.

However, I2V tends to suddenly lose motion that was present in the footage, such as camera movement, so the result looks unnatural when spliced together. The approach seems to work well when there is little camera work or movement.

I look forward to CogVideo's development as open source.

CogStudio
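The splice-based idea described in this comment can be sketched roughly as follows. This is not the modified app.py or CogStudio code; `extend_video`, `generate_clip`, and `fake_i2v` are hypothetical names, and the sketch makes two simplifying assumptions: it always conditions on the last frame (the comment allows any frame), and it assumes the I2V model returns its conditioning image as the first frame of the new clip.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def extend_video(
    first_clip: Sequence[Frame],
    generate_clip: Callable[[Frame], Sequence[Frame]],
    extra_clips: int,
) -> List[Frame]:
    """Chain clips by conditioning each new clip on the previous last frame."""
    if not first_clip:
        raise ValueError("first_clip must contain at least one frame")
    frames = list(first_clip)
    for _ in range(extra_clips):
        new_clip = list(generate_clip(frames[-1]))
        # Assumption: frame 0 of the new clip repeats the conditioning image,
        # so it is dropped to avoid a duplicated frame at the seam.
        frames.extend(new_clip[1:])
    return frames


# Stand-in generator for illustration only: "frames" are integers, and each
# clip starts at the conditioning frame and counts upward.
def fake_i2v(image: int, num_frames: int = 49) -> List[int]:
    return [image + i for i in range(num_frames)]


video = extend_video(fake_i2v(0), fake_i2v, extra_clips=1)
print(len(video))  # 49 frames + 48 new frames = 97
```

In practice `generate_clip` would wrap a real I2V call, for example a function that passes the frame and a prompt to a diffusers `CogVideoXImageToVideoPipeline` and returns the generated frames. Because each clip only sees a single still image, motion such as camera movement is not carried across the seam, which matches the limitation noted above.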

Enchante503 avatar Sep 26 '24 11:09 Enchante503