Ziqing Yang comments

Results: 212 comments of Ziqing Yang

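In the first pre-training stage, should the embeddings of the original LLaMA vocabulary be frozen?

> Does the code in [scripts/run_clm_pt_with_peft.py](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/run_clm_pt_with_peft.py) leave out first-stage training? I could not find the relevant code. If I am wrong, could you point out which lines implement the first stage?

Yes, [scripts/run_clm_pt_with_peft.py](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/run_clm_pt_with_peft.py) only performs the second stage of pre-training.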

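In the first pre-training stage, should the embeddings of the original LLaMA vocabulary be frozen?

The reason is explained [here](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/预训练脚本). We have not released the first-stage training code. To run first-stage pre-training, a few simple modifications to [scripts/run_clm_pt_with_peft.py](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/run_clm_pt_with_peft.py) are enough.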

Can the model be fine-tuned on domain data?

Yes, of course, provided you have the data.

Can the model be fine-tuned on domain data?

Only the data changes; the method is the same as ordinary pre-training.

Can the model be fine-tuned on domain data?

For pre-training, refer to run_clm.py in https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling. Fine-tuning follows [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca).

Can the model be fine-tuned on domain data?

> > For pre-training, refer to run_clm.py in https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling. Fine-tuning follows [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca).
>
> Hello, the README says training was continued with 20G of added Chinese data. I would like to know the machine configuration and training time as a reference, since I want to continue training in a vertical domain.

Pre-training the 7B model with LoRA on the 20G corpus takes about 120 hours per epoch on eight A100-48G GPUs; with more machines it will of course be faster.

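Can the model be fine-tuned on domain data?

> The fine-tuning data is easy to understand: it is usually fed in as input plus labels. For pre-training, how is the data organized? Where do the inputs and labels come from?

Pre-training uses causal language modeling (CLM): the inputs shifted left by one position are the corresponding labels.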
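To make the reply concrete, here is a minimal plain-Python sketch of how inputs and labels relate under CLM. The token ids and the helper name `make_clm_pairs` are hypothetical, made up for illustration. Note that in Hugging Face's `run_clm.py` the labels are stored as a copy of `input_ids` and the shift is applied inside the model, so an explicit split like this one only shows the idea.

```python
def make_clm_pairs(token_ids):
    """Return (inputs, labels) where labels[i] is the token that follows inputs[i]."""
    if len(token_ids) < 2:
        raise ValueError("need at least two tokens to form a prediction target")
    inputs = token_ids[:-1]   # every token except the last
    labels = token_ids[1:]    # the same sequence shifted left by one
    return inputs, labels


# Hypothetical token ids for one tokenized text chunk.
chunk = [1, 306, 763, 4094, 2]
inputs, labels = make_clm_pairs(chunk)
for x, y in zip(inputs, labels):
    print(f"after token {x}, the target is token {y}")
```

No separate annotation is needed for pre-training: the raw text supplies its own targets.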

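Can the model be fine-tuned on domain data?

> @airaria Thanks! Is there code in transformers that I can refer to?

Take a look at the forward method of LlamaForCausalLM; it contains the relevant handling code.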
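The pattern the reply points to can be sketched without any framework. The following is a plain-Python illustration of shifting and then applying cross-entropy, assuming the model receives labels identical to its inputs. It is not the code from `LlamaForCausalLM.forward`, and `shifted_clm_loss` is a hypothetical name.

```python
import math

IGNORE_INDEX = -100  # label value skipped by the loss (PyTorch's CrossEntropyLoss default)


def shifted_clm_loss(logits, labels):
    """Average cross-entropy where position i's logits are scored against labels[i + 1].

    logits: list of per-position score lists, shape [seq_len][vocab_size]
    labels: list of token ids of length seq_len (the same sequence as the inputs)
    """
    shift_logits = logits[:-1]   # the last position has no next token to predict
    shift_labels = labels[1:]    # the first token is never a prediction target
    total, count = 0.0, 0
    for scores, target in zip(shift_logits, shift_labels):
        if target == IGNORE_INDEX:
            continue
        # Stable log-sum-exp: subtract the max before exponentiating.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[target]
        count += 1
    # Returning 0.0 when every target is ignored is a choice of this sketch.
    return total / count if count else 0.0


# Uniform scores over a 4-token vocabulary give a loss of log(4).
uniform = [[0.0] * 4 for _ in range(3)]
print(shifted_clm_loss(uniform, [0, 1, 2]))
```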

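Can the model be fine-tuned on domain data?

> About vocabulary merging, my understanding is: iterate over the LLaMA model's tokenizer and append its vocabulary after the vocabulary generated by sentencepiece. Is this logic correct? Could setting the score to 0 cause any problems?

So far, we have not observed any effect from doing this.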
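The merge described in the question can be sketched in plain Python. This is an illustration only: `Piece` and `merge_vocab` are hypothetical stand-ins for SentencePiece model entries, the example tokens and scores are invented, and which vocabulary serves as the base is a choice of the sketch rather than a statement about the project's script.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    piece: str    # token text
    score: float  # score attached to the token in the tokenizer model


def merge_vocab(base, extra, new_score=0.0):
    """Append pieces from `extra` that `base` lacks, assigning each `new_score`."""
    seen = {p.piece for p in base}
    merged = list(base)
    for p in extra:
        if p.piece not in seen:
            # Appended pieces get a fixed score instead of their original one.
            merged.append(Piece(p.piece, new_score))
            seen.add(p.piece)
    return merged


base = [Piece("<s>", 0.0), Piece("▁the", -3.2)]
extra = [Piece("▁the", -2.9), Piece("中文", -5.1)]
merged = merge_vocab(base, extra)
print([(p.piece, p.score) for p in merged])
```

Pieces already present in the base keep their original score and position, so existing token ids stay valid; only the appended pieces receive the fixed score.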

Can the model be fine-tuned on domain data?

> Hello author, while continuing instruction fine-tuning on my own dataset I ran into a loss of 0. Has the author or anyone else encountered the same problem? The problem, code, and training parameters are shown below. Code:
> ```
> tokenizer = LlamaTokenizer.from_pretrained("ziqingyang/chinese-alpaca-lora-7b", padding_side="left")
> base_model = LlamaForCausalLM.from_pretrained(
>     "decapoda-research/llama-7b-hf",
>     # load_in_8bit=True,
>     load_in_8bit=False,
> ...
> ```