
LoRA finetuning checkpoint size

Open TianxingWu opened this issue 1 year ago • 3 comments

System Info / 系统信息

CogVideoX-2B SAT LoRA finetuning

Information / 问题信息

  • [ ] The official example scripts / 官方的示例脚本
  • [X] My own modified scripts / 我自己修改的脚本和任务

Reproduction / 复现过程

Finetune the 2B model with official LoRA setting

Expected behavior / 期待表现

I would expect the size of the saved checkpoint `mp_rank_00_model_states.pt` to be similar to that of the original transformer checkpoint. However, it increases significantly, from 3.5 GB to 26 GB. Inspecting the checkpoint, I found that the number of items in `module` increases from 557 to 1453: it contains not only the LoRA-related weights but also the weights of the `conditioner` (T5) and `first_stage_model` (VAE). In my opinion these should not be in the checkpoint, since T5 and the VAE are frozen during finetuning.

TianxingWu avatar Sep 06 '24 10:09 TianxingWu
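A minimal sketch for verifying which submodules dominate the file size. The checkpoint path is illustrative, and the `"module"` key is an assumption about how DeepSpeed-style SAT checkpoints store the model state dict; confirm both against your own run directory:

```python
from collections import defaultdict

import torch

# Assumed path and key layout -- verify against your checkpoint.
ckpt = torch.load("mp_rank_00_model_states.pt", map_location="cpu")
state = ckpt["module"]

# Group tensor sizes by top-level prefix (e.g. "model", "conditioner",
# "first_stage_model") to see which submodule dominates the file size.
sizes = defaultdict(int)
for name, value in state.items():
    if torch.is_tensor(value):
        sizes[name.split(".")[0]] += value.numel() * value.element_size()

for prefix, nbytes in sorted(sizes.items(), key=lambda kv: -kv[1]):
    print(f"{prefix}: {nbytes / 1e9:.2f} GB")
```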

The entire model is exported because `torch.save` was used when saving the checkpoint.

zRzRzRzRzRzRzR avatar Sep 06 '24 17:09 zRzRzRzRzRzRzR
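If you control the saving code, one workaround is to dump only the parameters that are actually being optimized rather than the full state dict. A sketch under that assumption; the helper name is ours, not part of SAT:

```python
import torch

def trainable_state_dict(model: torch.nn.Module) -> dict:
    # Keep only the parameters that are being trained (the LoRA weights);
    # frozen modules such as the T5 conditioner and the VAE are skipped.
    return {name: param.detach().cpu()
            for name, param in model.named_parameters()
            if param.requires_grad}

# Usage (hypothetical): torch.save(trainable_state_dict(model), "lora_only.pt")
```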

Thanks for the reply! It would still be nice to separate the LoRA weights from the rest of the model, though.

TianxingWu avatar Sep 12 '24 21:09 TianxingWu
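For checkpoints that were already saved with everything included, the LoRA weights could be split out after the fact. A sketch, again assuming the model state sits under `"module"`; the `"lora"` substring filter is an assumption about how the LoRA parameters are named in this codebase, so print a few keys from `state` first and adjust the predicate if needed:

```python
import torch

# Assumed path and key layout -- verify against your checkpoint.
ckpt = torch.load("mp_rank_00_model_states.pt", map_location="cpu")
state = ckpt["module"]

# Assumption: LoRA parameter names contain "lora"; adjust after inspecting keys.
lora_state = {k: v for k, v in state.items() if "lora" in k.lower()}
print(f"Kept {len(lora_state)} of {len(state)} entries")

torch.save({"module": lora_state}, "lora_weights_only.pt")
```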

Check here: https://github.com/THUDM/CogVideo/tree/main/sat#using-the-fine-tuned-model

zRzRzRzRzRzRzR avatar Sep 13 '24 03:09 zRzRzRzRzRzRzR