CogVideo issues

About 3D Swin Attention

1

In your description of the dual-channel attention, you add the attention-base and attention-plus patches at the end. But in the original 3D Swin Attention, videos are divided into 3D...

lemon-prog123

super-resolution

1

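Hello authors, I would like to ask about the purpose and the concrete procedure of the super-resolution step (in the code I see it is part of the second stage), but I could not find a corresponding explanation in the paper. Thank you.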

B-Soul

Any description of the dataset used for pre-training?

1

Hi authors, congratulations on your great work! I have read through the paper and found no description of the source of the dataset used for pre-training. Can you...

zhoudaquan

Data source

Great work! I'm curious about how the 5.4M pretraining videos were collected. Were they crawled from the web, or are they a combination of multiple datasets? And are they planned to be released...

MoodyPosh

What code was used for evaluating Fréchet Video Distance (FVD)?

Hi Hong and the whole THUDM team, thanks for your hard work; CogVideo seems really interesting! In the "5.1 Machine Evaluation" section of your paper, you mentioned Inception Score (IS)...

Maxlinn
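For readers following the FVD question above: the listing does not say which implementation the authors used. As a rough reference only, FVD is usually computed as the Fréchet distance between two Gaussians fitted to features of real and generated videos. The sketch below covers only that distance step. It assumes the features have already been extracted by some pretrained video network, and the function name `frechet_distance` is hypothetical, not taken from the CogVideo code.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_samples, feature_dim).
    Feature extraction is out of scope here (assumed done elsewhere).
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)

    # Fit a Gaussian (mean, covariance) to each feature set.
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    sigma_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    sigma_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((sigma_a sigma_b)^(1/2)) is the sum of the square roots of the
    # eigenvalues of sigma_a @ sigma_b, which are real and non-negative
    # for covariance matrices (clipping guards against round-off).
    eigvals = np.linalg.eigvals(sigma_a @ sigma_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(sigma_a) + np.trace(sigma_b) - 2.0 * tr_sqrt)
```

Scores from a sketch like this are only comparable when the feature extractor, preprocessing, and sample counts match, which is why the exact evaluation code matters for reproducing reported numbers.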

Will CogVideo be available for Windows servers?

Will CogVideo be available for Windows servers? Although I would love to try this generation model, my server runs Windows......

lossatsea

Computation Requirement to train CogVideo

1

Hi, first of all, great work in developing CogVideo. Could you please share how many GPUs were used and how long it took to train the model? Thanks, Gaurav

g1910

How many frames (seconds) are there in each video sample used in the training process?

1

How many frames (seconds) are there in each video sample used in the training process? Is it the same as the output sample of the 4-second clip of 32 frames?...

BinZhu-ece

Demonstration data

Thanks for the amazing work! Could you tell me where the demonstration dataset comes from? Is any part of it publicly available? Thanks.

zhoudaquan

A segmentation fault was encountered during inference

```
(CogVideo) C:\Users\SAS\Desktop\CogVideo-main>sh scripts/inference_cogvideo_pipeline.sh
Please install apex to use fused_layer_norm, fall back to torch.nn.LayerNorm
WARNING: No training data specified
using world size: 1 and model-parallel size: 1
> initializing model...
```

boyu-chen-intern
