Does it support multiple GPUs?
System Info / 系統信息
I used the I2V pipeline and added `.to('cuda')` after removing the offloading calls, but it still does not use all GPUs. I am running 4x A10 with 24 GB VRAM each.
Information / 问题信息
- [ ] The official example scripts / 官方的示例脚本
- [x] My own modified scripts / 我自己修改的脚本和任务
Reproduction / 复现过程
```python
import torch
from transformers import T5EncoderModel
from diffusers import AutoencoderKLCogVideoX, CogVideoXImageToVideoPipeline, CogVideoXTransformer3DModel
from diffusers.utils import export_to_video
from torchao.quantization import quantize_

# quantize each component with torchao; quantization() returns the chosen config
text_encoder = T5EncoderModel.from_pretrained("THUDM/CogVideoX-5b-I2V", subfolder="text_encoder", torch_dtype=torch.bfloat16)
quantize_(text_encoder, quantization())
transformer = CogVideoXTransformer3DModel.from_pretrained("THUDM/CogVideoX-5b-I2V", subfolder="transformer", torch_dtype=torch.bfloat16)
quantize_(transformer, quantization())
vae = AutoencoderKLCogVideoX.from_pretrained("THUDM/CogVideoX-5b-I2V", subfolder="vae", torch_dtype=torch.bfloat16)
quantize_(vae, quantization())
```
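(`quantization()` is not defined in the snippet above; presumably it returns a torchao quantization config. A minimal sketch of such a helper, assuming int8 weight-only quantization, could look like this:)

```python
from torchao.quantization import int8_weight_only

def quantization():
    # hypothetical helper: returns the torchao config that quantize_() applies
    return int8_weight_only()
```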
Create pipeline and run inference
```python
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    text_encoder=text_encoder,
    transformer=transformer,
    vae=vae,
    torch_dtype=torch.bfloat16,
).to('cuda')
```
Manually assign components to GPUs
```python
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

print(pipe.text_encoder.device)
print(pipe.transformer.device)
print(pipe.vae.device)

# `image` is assumed to be loaded beforehand, e.g. with diffusers.utils.load_image
video = pipe(
    prompt='test', image=image, num_videos_per_prompt=1,
    num_inference_steps=50, num_frames=49, guidance_scale=6,
).frames[0]
out = 'temp.mp4'
export_to_video(video, f'{out}', fps=8)
```
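Note: `.to('cuda')` moves every pipeline component onto a single device (cuda:0), so the snippet above cannot spread the model over the four A10s by itself. For reference, the diffusers-supported way to shard a pipeline across GPUs is `device_map="balanced"` at load time (the same approach tried further down in this thread); a minimal sketch, where the `max_memory` limits are illustrative placeholders:

```python
# Hypothetical multi-GPU sharding; the memory limits are placeholders, not tested values.
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    torch_dtype=torch.bfloat16,
    device_map="balanced",
    max_memory={0: "20GiB", 1: "20GiB", 2: "20GiB", 3: "20GiB"},
)
```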
Expected behavior / 期待表现
Only one GPU gets utilized.
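(A quick way to confirm which GPUs the process is actually using is to check per-device allocated memory; a small sketch, assuming it runs in the same process after the pipeline is loaded:)

```python
import torch

# With .to('cuda'), only cuda:0 is expected to show non-zero allocation.
for i in range(torch.cuda.device_count()):
    allocated_gb = torch.cuda.memory_allocated(i) / 1e9
    print(f"cuda:{i} ({torch.cuda.get_device_name(i)}): {allocated_gb:.2f} GB allocated")
```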
Please check `inference/cli_demo.py` to see how to distribute across multiple GPUs; note that this does not support quantization.
I have tried it and got an error:
```python
import torch
from diffusers import CogVideoXDPMScheduler, CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16, device_map="balanced"
)
pipe.scheduler = CogVideoXDPMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

# `prompt` and `image` are defined elsewhere in the script
video = pipe(
    prompt=prompt, image=image, num_videos_per_prompt=1, num_inference_steps=50,
    num_frames=49, guidance_scale=6, use_dynamic_cfg=True,
    generator=torch.Generator().manual_seed(112),
).frames[0]
export_to_video(video, "output.mp4", fps=8)
```
```
Loading checkpoint shards: 100%|█| 2/2 [00:01<00:00, 1.24it
Loading pipeline components...: 100%|█| 5/5 [00:07<00:00, 1
Traceback (most recent call last):
  File "/home/ec2-user/test/t.py", line 21, in
```
It was solved when I used this; I don't know why it does not work on 4 GPUs, it only works on 2:

```python
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
```

And when I put `.to('cuda')` I got an error as well.
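(One thing to check: `CUDA_VISIBLE_DEVICES` only takes effect if it is set before CUDA is initialized in the process; a small sketch of the ordering, with the device list as an example value:)

```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"  # must be set before torch initializes CUDA

import torch
print(torch.cuda.device_count())  # should report 4 if all GPUs are visible
```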
Have you tried `CUDA_VISIBLE_DEVICES=0,1,2,3 python cli_demo.py`?
> It was solved when I used this; I don't know why it does not work on 4 GPUs, it only works on 2:
> `os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"`
Same error. @zRzRzRzRzRzRzR @BASSEM45325 Have you solved it? Thanks
> And when I put `.to('cuda')` I got an error as well.

Same error.
I got:

```
ValueError: It seems like you have activated sequential model offloading by calling enable_sequential_cpu_offload, but are now attempting to move the pipeline to GPU. This is not compatible with offloading. Please, move your pipeline .to('cpu') or consider removing the move altogether if you use sequential offloading.
```
`enable_sequential_cpu_offload` should be removed.
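For reference, the error above means sequential offloading and a manual move to GPU are mutually exclusive; a minimal sketch of the two valid configurations (pick one, never both):

```python
# Option A: no offloading; the whole pipeline lives on the GPU
pipe.to("cuda")

# Option B: sequential CPU offload; do not also call pipe.to("cuda")
# pipe.enable_sequential_cpu_offload()
```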