Yuxuan Zhang comments

Results: 551 comments of Yuxuan Zhang

Is there a paper or tech report for CogVideoX1.5?

We don’t have a technical report on CogVideoX1.5. The principles and architecture of this model are very similar to version 1.0, so you can refer to the 1.0 version.

[rank0]: OSError: t5-v1_1-xxl is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'

Our SAT documentation mentions that, for CogVideoX2B / 5B, the corresponding T5 module should be downloaded from HuggingFace. The T5 module is loaded separately. You can refer to...

[rank0]: OSError: t5-v1_1-xxl is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'

The T5 model needs to be extracted separately from the CogVideoX diffusers version and placed in its own folder, or you can use another T5 in the safetensors format. This...
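
As a rough sketch of the extraction step described above: assuming a locally downloaded CogVideoX diffusers checkpoint that follows the usual diffusers layout with `text_encoder/` and `tokenizer/` subfolders, the T5 files can be copied into one standalone folder. The helper name `extract_t5` and the example paths are hypothetical; the default folder name `t5-v1_1-xxl` is taken from the error message.

```python
import shutil
from pathlib import Path


def extract_t5(diffusers_dir: str, target_dir: str = "t5-v1_1-xxl") -> list[str]:
    """Copy T5 encoder and tokenizer files into one standalone folder.

    Assumes diffusers_dir contains text_encoder/ and tokenizer/ subfolders
    (an assumption about the checkpoint layout, not confirmed by the source).
    Returns the names of the copied files.
    """
    src = Path(diffusers_dir)
    dst = Path(target_dir)
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for sub in ("text_encoder", "tokenizer"):
        sub_dir = src / sub
        if not sub_dir.is_dir():
            raise FileNotFoundError(f"expected subfolder not found: {sub_dir}")
        # Copy only regular files; both subfolders are flattened into dst.
        for f in sorted(sub_dir.iterdir()):
            if f.is_file():
                shutil.copy2(f, dst / f.name)
                copied.append(f.name)
    return copied
```

For example, `extract_t5("CogVideoX-2b")` would create a `t5-v1_1-xxl` folder next to the script, which can then be supplied wherever a local T5 path is expected.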

the last few seconds of the video are static.

We have reproduced this issue. Some seeds currently work normally (such as 84), but seed 42 does indeed have problems. We will take a look.

docs: add Japanese README

We have made significant updates to the README. Thank you for your support. Could you please align it with the latest README? We would greatly appreciate it.

Will the parameters and prompt details used for the ScreenSpot test be open-sourced?

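You can find all the details of how we assemble the prompt in our technical documentation; this is exactly the combination we used during testing. The current inference hyperparameters are the ones we used in internal testing. For the training techniques, see the technical report.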

Why does it take 5 minutes to generate one video with the 2B model?

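Yes, this is normal. The reported figures were measured on an A100 with all VRAM optimizations disabled. With the VRAM optimizations from the cli demo enabled, it will be somewhat slower. On a T4, it takes 5-10 minutes per video.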

video with more than 6 seconds

This is not possible, but diffusers currently provides a method for video2video continuation, which is also available in a release and might be helpful to you: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/cogvideo/pipeline_cogvideox_video2video.py

Inference with glm4v-9b on Jetson AGX Orin does not stop

My guess is that the stop token was not detected. Are you using BF16?
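
To illustrate what an undetected stop token means, here is a minimal, framework-free sketch of a decoding loop. The names `generate`, `next_token`, `stop_ids` and `max_new_tokens` are hypothetical, and this is not the GLM-4V implementation. It only shows that generation ends early when a produced token id matches one of the configured stop ids, and that a token cap is the fallback when no such id is ever produced.

```python
from typing import Callable, Iterable, List


def generate(next_token: Callable[[List[int]], int],
             prompt: List[int],
             stop_ids: Iterable[int],
             max_new_tokens: int = 16) -> List[int]:
    """Toy decoding loop: append tokens until a stop id appears or the cap is hit."""
    stop = set(stop_ids)
    out = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(out)  # hypothetical model call returning one token id
        if tok in stop:
            # Stop token detected: end generation without appending it.
            break
        out.append(tok)
    return out
```

If the model never emits an id contained in `stop_ids`, the loop runs until `max_new_tokens` is reached, so an inference setup without such a cap would appear to never stop.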