ChatGLM-Efficient-Tuning

How much GPU memory does fine-tuning ChatGLM2 require?

Open CCzzzzzzz opened this issue 1 year ago • 3 comments

With the same parameters, ChatGLM can be fine-tuned, but ChatGLM2 runs out of GPU memory. Is this a model optimization issue?

CUDA_VISIBLE_DEVICES=0 python src/train_sft.py \
    --do_train \
    --dataset alpaca_gpt4_zh \
    --finetuning_type lora \
    --output_dir path_to_sft_checkpoint \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --fp16 \
    --use_v2 \
    --dev_ratio 0.01

Jun 28 '23 10:06 CCzzzzzzz

The official ChatGLM2 implementation does not support gradient checkpointing yet, so you may want to wait for it.
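To show why missing gradient checkpointing matters so much here, the sketch below gives a rough count of the activations kept for the backward pass, with and without checkpointing. It is a back-of-the-envelope estimate, not a measurement: the layer count (28) and hidden size (4096) are assumed values for ChatGLM2-6B, `per_layer_factor` is a hypothetical number of hidden-sized tensors saved per layer, and the model ignores attention score matrices, weights, and optimizer state.

```python
def activation_elements(num_layers, batch, seq_len, hidden,
                        per_layer_factor, checkpointing):
    """Rough number of activation values held for the backward pass.

    per_layer_factor is a hypothetical count of hidden-sized tensors
    that one transformer layer saves when checkpointing is off.
    """
    layer_input = batch * seq_len * hidden
    if checkpointing:
        # Keep only each layer's input, plus the full activations of the
        # single layer that is being recomputed during backward.
        return num_layers * layer_input + per_layer_factor * layer_input
    # Without checkpointing every layer keeps all of its saved tensors.
    return num_layers * per_layer_factor * layer_input


def to_gib(num_elements, bytes_per_element=2):
    """Convert an element count to GiB (2 bytes per element for fp16)."""
    return num_elements * bytes_per_element / 2**30


# Assumed shapes: 28 layers, hidden size 4096, batch size 4 as in the
# command above, and an illustrative sequence length of 512.
plain = activation_elements(28, 4, 512, 4096, 16, checkpointing=False)
saved = activation_elements(28, 4, 512, 4096, 16, checkpointing=True)
print(f"no checkpointing:   {to_gib(plain):.2f} GiB")
print(f"with checkpointing: {to_gib(saved):.2f} GiB")
```

Under these assumptions the activation footprint shrinks by roughly an order of magnitude with checkpointing, at the cost of recomputing each layer once during the backward pass.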

Jun 28 '23 10:06 hiyouga

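If you are in a hurry to try it and your training data is not too long, the simplest way to save a lot of GPU memory is to cut the maximum input length, and to lower the LoRA rank as well. With fp16 and inputs truncated to 128 tokens, it runs even on 16 GB of GPU memory.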
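The two knobs mentioned here can be illustrated with a minimal sketch, assuming a LoRA adapter that adds two low-rank matrices per adapted weight and a tokenized sample stored as a plain list of ids. The function names, the 4096 x 4096 weight shape, and the count of 28 adapted matrices are hypothetical and are not taken from the repository.

```python
def lora_trainable_params(d_in, d_out, rank, num_adapted_matrices):
    """Trainable parameters added by LoRA.

    Each adapted weight of shape (d_out, d_in) gets A (rank x d_in)
    and B (d_out x rank), i.e. rank * (d_in + d_out) new parameters.
    """
    return num_adapted_matrices * rank * (d_in + d_out)


def truncate(token_ids, max_length):
    """Keep at most max_length tokens of one training sample."""
    return token_ids[:max_length]


# Hypothetical setup: one 4096 x 4096 weight adapted in each of 28 layers.
for rank in (8, 4):
    count = lora_trainable_params(4096, 4096, rank, 28)
    print(f"rank {rank}: {count:,} trainable parameters")

# Activation memory grows at least linearly with sequence length, so
# truncating a 512-token sample to 128 tokens cuts it to about a quarter.
sample = list(range(512))  # stand-in for real token ids
short = truncate(sample, 128)
print(len(short) / len(sample))
```

In this sketch the adapter stays small next to a 6B-parameter model at either rank, which is consistent with the comment's emphasis on cutting the input length first.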

Jun 29 '23 03:06 pdwfree

Thanks, that does work, but after fine-tuning the model repeats itself heavily; see #207.

Jun 29 '23 03:06 CCzzzzzzz
