
xDiT multi-GPU inference code runs OOM with 24 frames

Open danielajisafe opened this issue 11 months ago • 2 comments

@OleehyO thanks for the multi-GPU inference code provided here and mentioned here.

But even with two 32GB GPUs, a similar setup, and the number of frames increased to 24 (height 480, width 720), it goes out of memory (OOM).

Is there a limit on the number of frames, or a linear relationship between the number of GPUs and the number of frames they can handle?

export OMP_NUM_THREADS=8
torchrun --nproc_per_node=2 tools/parallel_inference/parallel_inference_xdit.py \
    --model THUDM/CogVideoX-2b --ulysses_degree 1 --ring_degree 1 \
    --use_cfg_parallel --height 480 --width 720 --num_frames 24 \
    --prompt 'A small dog.'

thanks

danielajisafe avatar Feb 04 '25 17:02 danielajisafe

xDiT mainly uses multiple GPUs to accelerate inference speed; it will not save much memory. For more details, it is recommended to consult the xDiT developers.
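For context: --use_cfg_parallel replicates the full model on every rank (each rank computes one branch of classifier-free guidance), so it speeds up sampling but does not shard weights or activations. One variant that may be worth testing, a sketch not confirmed by the xDiT developers: sequence parallelism via --ulysses_degree, which splits the attention sequence across the two GPUs and can lower per-GPU activation memory. Reusing only flags from the command above:

export OMP_NUM_THREADS=8
torchrun --nproc_per_node=2 tools/parallel_inference/parallel_inference_xdit.py \
    --model THUDM/CogVideoX-2b --ulysses_degree 2 --ring_degree 1 \
    --height 480 --width 720 --num_frames 24 --prompt 'A small dog.'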

If you just want to run inference with a CogVideoX model, it is recommended to use our repository's cli_demo.py. A single 32GB GPU is sufficient to run any CogVideoX model.
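For reference, a minimal single-GPU sketch of that path, using the diffusers CogVideoXPipeline that cli_demo.py builds on. The memory-saving calls (CPU offload and VAE tiling) are what keep peak usage within a 32GB card; the sampling parameters here are illustrative assumptions, not values from this issue:

# Minimal single-GPU sketch via diffusers; parameter values are assumptions.
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b", torch_dtype=torch.float16
)
# Trade some speed for memory: keep submodules on CPU between forward passes
# and decode the video latents in tiles instead of all at once.
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()

video = pipe(
    prompt="A small dog.",
    num_frames=24,  # the frame count that OOMs under xDiT in this issue
    height=480,
    width=720,
    num_inference_steps=50,  # assumed value
    guidance_scale=6.0,      # assumed value
).frames[0]
export_to_video(video, "output.mp4", fps=8)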

OleehyO avatar Feb 05 '25 11:02 OleehyO

thanks @OleehyO

But this technically suggests that integrating xDiT with CogVideoX is more of a data-parallelism benefit than model parallelism, no? Also, the original poster asked for something fast and memory-friendly, which is not the case here even with two GPUs. Thanks.

danielajisafe avatar Feb 06 '25 19:02 danielajisafe