CogVideo

GPU usage does not stay at 100%

Open GFENGG opened this issue 1 year ago • 2 comments
System Info

CUDA 12.1, diffusers 0.30.1

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Reproduction

Run the fine-tuning script. The GPU usage is shown below. [image]

This does not seem to be caused by evaluation, because I have set the eval interval to 100000. Why does this happen, and how can it be fixed?

Expected behavior

GPU usage stays at 100%.

GFENGG avatar Aug 29 '24 06:08 GFENGG

This is normal when fine-tuning, because fakecp only optimizes the VAE decoder part and does not touch the VAE encoder part. We haven't paid much attention to this issue before; we will look into it.

zRzRzRzRzRzRzR avatar Aug 29 '24 10:08 zRzRzRzRzRzRzR

Okay, thanks. I hope it can be fixed.

GFENGG avatar Aug 29 '24 13:08 GFENGG

For now, if you preprocess the data in advance instead of following the current code implementation, GPU utilization improves significantly, because a lot of time is currently spent on data decoding. Preprocessing this part with the VAE encoder ahead of training saves a lot of time.
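The preprocessing idea above can be illustrated with a small caching sketch. This is not the repository's code: `load_or_encode`, `precompute_latents`, and the `encode_fn` parameter are hypothetical names, and `encode_fn` stands in for whatever function decodes a video and runs it through the VAE encoder. The point is only that each video is encoded once, stored on disk, and read back on later passes instead of being decoded again.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Any, Callable, Iterable


def cache_path(cache_dir: Path, video_path: str) -> Path:
    """Map a video path to a stable cache file name."""
    digest = hashlib.sha256(video_path.encode("utf-8")).hexdigest()[:16]
    return cache_dir / f"{digest}.pkl"


def load_or_encode(
    video_path: str,
    encode_fn: Callable[[str], Any],  # hypothetical: decode video + VAE-encode it
    cache_dir: Path,
) -> Any:
    """Return the cached latent for video_path, encoding it only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_path(cache_dir, video_path)
    if target.exists():
        with target.open("rb") as f:
            return pickle.load(f)

    latent = encode_fn(video_path)

    # Write to a temporary file first so an interrupted run
    # never leaves a half-written cache entry behind.
    tmp = target.with_suffix(".tmp")
    with tmp.open("wb") as f:
        pickle.dump(latent, f)
    tmp.replace(target)
    return latent


def precompute_latents(
    video_paths: Iterable[str],
    encode_fn: Callable[[str], Any],
    cache_dir: Path,
) -> None:
    """Encode every video once, before training starts."""
    for path in video_paths:
        load_or_encode(path, encode_fn, cache_dir)
```

In a real setup, `encode_fn` would presumably run the VAE encoder without gradient tracking, and the training dataset would call `load_or_encode` (or read the cache files directly) so that the training loop no longer waits on video decoding. Those details depend on the training script and are left as assumptions here.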

zRzRzRzRzRzRzR avatar Sep 22 '24 06:09 zRzRzRzRzRzRzR

However, this approach does not suit users who want to start training immediately; it is better suited to enterprise teams that need systematic fine-tuning. We will therefore not update this part of the code for the time being, as doing so would raise the cost of use for entry-level users. Thank you for your understanding.

zRzRzRzRzRzRzR avatar Sep 22 '24 06:09 zRzRzRzRzRzRzR