WangZeJun comments

Results: 62 comments of WangZeJun

Pruning the vocabulary

See: https://github.com/yangjianxin1/LLMPruner

What conversation template should I use for multi-turn dialogue?

For multi-turn dialogue, you can refer to the training code in the FastChat project: https://github.com/lm-sys/FastChat/blob/main/fastchat/train/train.py
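To make the idea of a multi-turn template concrete, here is a minimal sketch of how a conversation might be flattened into a single training string. The role labels, separators, and function names below are assumptions chosen for illustration; they are not copied from the FastChat code, so check the linked train.py for the template it actually uses.

```python
from typing import List, Tuple

# Hypothetical role labels and separators, chosen for illustration only.
USER_TAG = "USER"
ASSISTANT_TAG = "ASSISTANT"
TURN_SEP = " "
END_OF_REPLY = "</s>"


def build_segments(turns: List[Tuple[str, str]], system: str = "") -> List[Tuple[str, bool]]:
    """Split a conversation into (text, is_target) segments.

    Segments with is_target=True are assistant replies, which are usually
    the only parts that contribute to the training loss.
    """
    segments: List[Tuple[str, bool]] = []
    if system:
        segments.append((system + TURN_SEP, False))
    for user_msg, assistant_msg in turns:
        # The user message and the assistant label form the context.
        segments.append((f"{USER_TAG}: {user_msg}{TURN_SEP}{ASSISTANT_TAG}: ", False))
        # The assistant reply, closed by an end marker, is the target.
        segments.append((assistant_msg + END_OF_REPLY, True))
    return segments


def build_prompt(turns: List[Tuple[str, str]], system: str = "") -> str:
    """Concatenate all segments into one training string."""
    return "".join(text for text, _ in build_segments(turns, system))


conversation = [("Hi", "Hello!"), ("What is LoRA?", "A low-rank adapter method.")]
prompt = build_prompt(conversation, system="A chat between a user and an assistant.")
print(prompt)
```

Keeping the segments separate from the final string makes it straightforward to mask the non-target segments when building labels after tokenization.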

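What conversation template should I use for multi-turn dialogue?

You can refer to this recently open-sourced project: https://github.com/zejunwang1/LLMTuner. It covers the input format for multi-turn dialogue data and supports full-parameter, LoRA, and QLoRA fine-tuning.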

Change the DeepSpeed configuration file to: { "train_batch_size": "auto", "train_micro_batch_size_per_gpu": "auto", "gradient_accumulation_steps": "auto", "gradient_clipping": "auto", "fp16": { "enabled": "auto", "loss_scale": 0, "initial_scale_power": 16, "loss_scale_window": 1000, "hysteresis": 2, "min_loss_scale": 1 }, "bf16": { "enabled":...
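As a rough sketch, the visible part of that configuration can be written as a Python dict and serialized to JSON. Only the fields shown in the comment are reproduced; the text is cut off at the start of the "bf16" block, so that block and anything after it are left out here rather than guessed. The helper name auto_fields is hypothetical. When the file is used through the Hugging Face Trainer's DeepSpeed integration, values set to "auto" are filled in from the training arguments.

```python
import json

# Only the fields visible in the comment are reproduced. The original is
# truncated after the start of the "bf16" block, so it is omitted here.
ds_config = {
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "fp16": {
        "enabled": "auto",
        "loss_scale": 0,  # 0 selects dynamic loss scaling in DeepSpeed
        "initial_scale_power": 16,
        "loss_scale_window": 1000,
        "hysteresis": 2,
        "min_loss_scale": 1,
    },
}


def auto_fields(config, prefix=""):
    """Return the dotted path of every value set to "auto"."""
    paths = []
    for key, value in config.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            paths.extend(auto_fields(value, prefix=path + "."))
        elif value == "auto":
            paths.append(path)
    return paths


config_text = json.dumps(ds_config, indent=2)
print(config_text)
print(auto_fields(ds_config))
```

Listing the "auto" fields is a quick way to see which settings are expected to come from the training arguments instead of the file itself.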

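Problem with QLoRA single-node multi-GPU fine-tuning of baichuan2-13b

Can you check whether the 7b baichuan model runs on a single node with multiple GPUs?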

Problem with QLoRA single-node multi-GPU fine-tuning of baichuan2-13b

I'll look into it tomorrow. On 2023-12-03 20:25:10, "zxm8601" wrote (quoting "Can you check whether the 7b baichuan model runs on a single node with multiple GPUs?"): I tried that, and it raises the same error.

Problem with QLoRA single-node multi-GPU fine-tuning of baichuan2-13b

Try adding this argument to your training command: --ddp_find_unused_parameters True
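The suggestion above amounts to appending one flag to the launch command. The sketch below assembles such a command in Python. The launcher (torchrun), the script name train.py, and the helper build_launch_command are assumptions for illustration, since the thread does not show the actual command; --ddp_find_unused_parameters itself is a Hugging Face TrainingArguments option.

```python
import shlex
from typing import List


def build_launch_command(script: str, num_gpus: int, script_args: List[str]) -> List[str]:
    """Build a multi-GPU launch command, adding the DDP flag if it is missing."""
    # torchrun and the script name are assumptions; adapt to your own launcher.
    command = ["torchrun", f"--nproc_per_node={num_gpus}", script]
    command += script_args
    if "--ddp_find_unused_parameters" not in command:
        command += ["--ddp_find_unused_parameters", "True"]
    return command


# Hypothetical script name and arguments, for illustration only.
cmd = build_launch_command("train.py", 2, ["--gradient_checkpointing", "True"])
launch_line = shlex.join(cmd)
print(launch_line)
```

If the training is started from a shell script, the same effect comes from adding the flag as one more line in the argument list.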

Problem with QLoRA single-node multi-GPU fine-tuning of baichuan2-13b

Have you set gradient_checkpointing to True?

Problem with QLoRA single-node multi-GPU fine-tuning of baichuan2-13b

Try adding this argument to your training sh file: --ddp_find_unused_parameters True

Problem with QLoRA single-node multi-GPU fine-tuning of baichuan2-13b

Does training work normally on a single GPU?