CogVideo
Inference runs, but no GPU memory/VRAM is used
System Info / 系統信息
CUDA 11.8, A800, PyTorch 2.4
Information / 问题信息
- [X] The official example scripts / 官方的示例脚本
- [ ] My own modified scripts / 我自己修改的脚本和任务
Reproduction / 复现过程
python inference/cli_demo.py --prompt "xxx"
Expected behavior / 期待表现
I turned off all four memory-saving options during inference, because an A800 with 80 GB (the text encoder plus transformer take about 20 GB) should be enough, and I wanted minimum latency. But a single step takes about 50 minutes, and no VRAM is used.
When I added the following two lines, as suggested in #197, each step takes about 20 seconds:
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()
If I want to load the whole model onto the GPU, what should I do? Thanks.
Remove those two lines and then call pipe.to("cuda").
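Put together, a minimal sketch of loading the full pipeline onto the GPU, without any offloading or tiling. The model id, dtype, and prompt below are illustrative choices, not taken from this thread:

```python
import torch
from diffusers import CogVideoXPipeline

# Illustrative: model id and dtype are assumptions, not from the thread above.
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16
)

# Do NOT call pipe.enable_model_cpu_offload() or pipe.vae.enable_tiling();
# instead move every component (text encoder, transformer, VAE) onto the GPU.
pipe.to("cuda")

video = pipe(prompt="a panda playing guitar", num_inference_steps=50).frames[0]
```

Note that calling `enable_model_cpu_offload()` after `pipe.to("cuda")` would re-register CPU offload hooks and move components back off the GPU, so use one strategy or the other, not both.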
Has this been solved? I'm also on an A800 with CUDA 12.4; I tried everything above and nothing worked.
Changing scheduling_ddim_cogvideox.py so that prev_timestep = int(prev_timestep.to('cpu').item()) solves it for me.
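The workaround above can be sketched as follows. `previous_timestep` is a hypothetical stand-alone version of the timestep arithmetic inside the scheduler's `step()`; the default values of 50 inference steps and 1000 training timesteps are assumptions for illustration:

```python
def previous_timestep(timestep, num_inference_steps=50, num_train_timesteps=1000):
    """Hypothetical stand-alone sketch of the scheduler's timestep math."""
    prev_timestep = timestep - num_train_timesteps // num_inference_steps
    # The reported fix: if `timestep` arrived as a 0-d torch tensor (possibly
    # living on the GPU), convert the result to a plain Python int on the CPU
    # so later lookups in the scheduler happen without touching the GPU tensor.
    if hasattr(prev_timestep, "item"):  # covers torch.Tensor without importing torch
        prev_timestep = int(prev_timestep.to("cpu").item())
    return prev_timestep
```

With a plain int input the conversion branch is skipped, e.g. `previous_timestep(999)` returns `979`; with a tensor input the result is forced onto the CPU as an `int`.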
I got it working.