CogVideo
How to start with 4090?
Great open-source work! I'd like to suggest adding some instructions to the README. Specifically, instructions are missing for users with 24GB of VRAM (and for those with more) on how to quickly set up the environment, download the models, and deploy inference. The current format may leave interested users unsure of how to start.
Thanks for the reminder; we have added quick start instructions at the beginning of the README.
Hello, could you specify which version of Python you are using? I am encountering a conflict between Python 3.9's typing module and beartype 0.18.5.
Traceback (most recent call last):
File "/home/CogVideo/sat/sample_video.py", line 18, in <module>
from diffusion_video import SATVideoDiffusionEngine
File "/home/CogVideo/sat/diffusion_video.py", line 12, in <module>
from sgm.modules import UNCONDITIONAL_CONFIG
File "/home/CogVideo/sat/sgm/__init__.py", line 1, in <module>
from .models import AutoencodingEngine
File "/home/CogVideo/sat/sgm/models/__init__.py", line 1, in <module>
from .autoencoder import AutoencodingEngine
File "/home/CogVideo/sat/sgm/models/autoencoder.py", line 29, in <module>
from ..modules.cp_enc_dec import _conv_split, _conv_gather
File "/home/CogVideo/sat/sgm/modules/cp_enc_dec.py", line 8, in <module>
from beartype import beartype
File "/home/envs/miniconda3/envs/cogvideo/lib/python3.9/site-packages/beartype/__init__.py", line 58, in <module>
from beartype._decor.decormain import (
File "/home/envs/miniconda3/envs/cogvideo/lib/python3.9/site-packages/beartype/_decor/decormain.py", line 26, in <module>
from beartype._conf.confcls import (
File "/home/envs/miniconda3/envs/cogvideo/lib/python3.9/site-packages/beartype/_conf/confcls.py", line 46, in <module>
from beartype._conf.confoverrides import (
File "/home/envs/miniconda3/envs/cogvideo/lib/python3.9/site-packages/beartype/_conf/confoverrides.py", line 15, in <module>
from beartype._data.hint.datahinttyping import (
File "/home/envs/miniconda3/envs/cogvideo/lib/python3.9/site-packages/beartype/_data/hint/datahinttyping.py", line 290, in <module>
BeartypeReturn = Union[BeartypeableT, BeartypeConfedDecorator]
File "/home/envs/miniconda3/envs/cogvideo/lib/python3.9/typing.py", line 243, in inner
return func(*args, **kwds)
File "/home/envs/miniconda3/envs/cogvideo/lib/python3.9/typing.py", line 316, in __getitem__
return self._getitem(self, parameters)
File "/home/envs/miniconda3/envs/cogvideo/lib/python3.9/typing.py", line 421, in Union
parameters = _remove_dups_flatten(parameters)
File "/home/envs/miniconda3/envs/cogvideo/lib/python3.9/typing.py", line 215, in _remove_dups_flatten
all_params = set(params)
TypeError: unhashable type: 'list'
We use Python 3.11. Sorry for not specifying the Python version; we will clarify this in the README.
Thanks. I also found the corresponding issue in beartype, and I will switch to the correct Python version. https://github.com/beartype/beartype/issues/406#issuecomment-2211954903
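A quick guard before importing the SAT modules can catch this interpreter mismatch early. This is a minimal sketch; the (3, 10) floor is an assumption based on the 3.9 failure above and the maintainers' use of 3.11:

```python
import sys

def check_python_version(minimum=(3, 10)):
    """Return True if the running interpreter meets the minimum version.

    Python 3.9's typing module fails inside beartype's Union construction
    (TypeError: unhashable type: 'list'), so it helps to verify the
    interpreter before importing diffusion_video / sgm.
    """
    return sys.version_info[:2] >= minimum
```

Calling `check_python_version()` at the top of a launch script and aborting on `False` gives a clear error message instead of the deep traceback shown above.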
The diffusers framework is now supported, and it runs on a 4090.
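For reference, a minimal sketch of running the model through diffusers on a 24 GB card might look like the following. This assumes a diffusers release that ships CogVideoXPipeline (roughly v0.30+); the model id, frame count, and output path are illustrative, and the heavy imports are kept inside the function so the file can be loaded without torch installed:

```python
def generate_video(prompt, model_id="THUDM/CogVideoX-2b", num_frames=49):
    """Sketch: text-to-video with CogVideoX via diffusers on a 24 GB GPU."""
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    pipe = CogVideoXPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # Offload idle submodules to CPU so the pipeline fits in 24 GB of VRAM
    # (e.g. an RTX 4090) at the cost of some speed.
    pipe.enable_model_cpu_offload()
    frames = pipe(prompt=prompt, num_frames=num_frames).frames[0]
    export_to_video(frames, "output.mp4", fps=8)
```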
Hello, will the results with diffusers be better than those with SAT? Also, I find that generated videos of rigid objects look good, but quality degrades when there are people or multiple objects. If I want to generate videos with consistent geometry, what kinds of prompts work better in your experience? I suspect it may be related to the training data. Thanks!
- In theory, the two are the same, but in practice SAT may be slightly better: the diffusers version was converted from the SAT version, and small numerical differences between them are inevitable.
- You can use the prompt optimization method mentioned in the quick start; replacing glm-4 with gpt-4o may produce better results.
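As a rough illustration of that prompt-optimization step, the idea is to ask a chat model (glm-4 or gpt-4o, as suggested above) to expand a terse prompt into a detailed scene description before sending it to the video model. This is a hypothetical helper, not the repository's actual script, and the system instruction wording is an assumption:

```python
def build_upsample_messages(short_prompt):
    """Build a chat request asking an LLM to expand a terse video prompt."""
    system = (
        "You are a prompt engineer for a text-to-video model. Rewrite the "
        "user's prompt into one detailed paragraph describing the subjects, "
        "their motion, the camera movement, and the lighting."
    )
    # Messages in the OpenAI-style chat format accepted by gpt-4o and glm-4.
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": short_prompt},
    ]
```

The returned list can be passed as the `messages` argument of an OpenAI-compatible chat completion call, and the model's reply then becomes the prompt fed to CogVideoX.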