cywjava issues

Results: 11 issues of cywjava

No content is generated after switching to the new model and config files

### Is there an existing issue for this? - [X] I have searched the existing issues ### Current Behavior After switching to the new model and config files, no content is generated. .bias', 'layers.7.mlp.dense_h_to_4h.weight', 'layers.19.attention.query_key_value.bias', 'layers.19.post_attention_layernorm.bias', 'layers.4.post_attention_layernorm.bias', 'layers.6.attention.query_key_value.bias', 'layers.12.attention.query_key_value.bias', 'layers.5.attention.dense.weight', 'layers.17.attention.query_key_value.weight', 'layers.12.input_layernorm.weight',...

After fine-tuning, Q&A generation does answer with what I trained it on, but a lot of other text is appended afterward. How can this be fixed?

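### Is there an existing issue for this? - [X] I have searched the existing issues ### Current Behavior After fine-tuning, Q&A generation does answer with what I trained it on, but a lot of other text is appended afterward. How can this be fixed? ### Expected Behavior After fine-tuning, Q&A generation does answer with what I trained it on, but a lot of other text is appended afterward. How can this be fixed? ### Steps...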
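
Trailing text like this often means generation does not stop at an end-of-sequence marker, for example because the training samples did not end with one. As a minimal sketch of a post-processing workaround (not the project's own fix), the decoded text can be cut at the first stop marker. The function name `truncate_at_stop` and the marker strings below are hypothetical; the real marker depends on the tokenizer used during fine-tuning.

```python
from typing import Iterable


def truncate_at_stop(text: str, stop_markers: Iterable[str]) -> str:
    """Return text cut at the earliest occurrence of any stop marker."""
    cut = len(text)
    for marker in stop_markers:
        if not marker:
            continue  # an empty marker would match at index 0
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


# Hypothetical decoded output: an answer, a stop marker, then unwanted text.
raw = "The answer is 42.</s>Unrelated continuation..."
print(truncate_at_stop(raw, ["</s>", "<eop>"]))
```

This only hides the symptom at inference time; making sure each training target ends with the tokenizer's end-of-sequence token addresses the cause.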

I want to do the following and am not sure whether it is possible

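### Is your feature request related to a problem? Please describe. We currently use this model as a base model and train it on our own knowledge, i.e. fine-tuning. At the same time, I want it to forget its original content, effectively wiping its memory, so that it only answers from what I fine-tuned it on and replies "not sure" or "I don't know" to everything else. Does anyone have an approach for this? ### Solutions We currently use this model as a base model and train it on our own knowledge, i.e. fine-tuning. At the same time, I want it to forget its original content, effectively wiping its memory, so that it only answers from what I fine-tuned it on and replies "not sure" or "I don't know" to everything else. Does anyone have an approach for this? ### Additional context _No response_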

No effect after training with LoRA

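I prepared 1,500 Alpaca-format samples with the question "Who are you?" and the answer "I am XXXXXX". I also have a separate piece of new knowledge, likewise 1,500 samples. After training with LoRA, the new knowledge took effect after a few hundred steps, but the content already in the original 6B model still has not been replaced after more than ten thousand steps. What is going on here?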
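
For readers unfamiliar with the format mentioned above, here is a small sketch of what one Alpaca-style record looks like. The three field names (`instruction`, `input`, `output`) follow the Alpaca dataset convention; the helper name `make_alpaca_record` is hypothetical, and the exact schema the poster's training script expects is not shown in the issue.

```python
import json


def make_alpaca_record(instruction: str, output: str, input_text: str = "") -> dict:
    """Build one record with the three Alpaca fields."""
    return {"instruction": instruction, "input": input_text, "output": output}


# The identity question and answer quoted in the issue.
records = [make_alpaca_record("Who are you?", "I am XXXXXX")]

# ensure_ascii=False keeps non-ASCII text (e.g. Chinese) readable in the file.
serialized = json.dumps(records, ensure_ascii=False, indent=2)
print(serialized)
```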

After updating the model, I get this error.

### Is there an existing issue for this? - [X] I have searched the existing issues ### Current Behavior RuntimeError: The following operation failed in the TorchScript interpreter. Traceback of...

After finishing LoRA fine-tuning, I picked a fairly good .pt from the checkpoints. Next time I want to continue fine-tuning from that .pt. How do you usually do this?

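### Is your feature request related to a problem? Please describe. After finishing LoRA fine-tuning, I picked a fairly good .pt from the checkpoints. Next time I want to continue fine-tuning from that .pt. How do you usually do this? ### Solutions After finishing LoRA fine-tuning, I picked a fairly good .pt from the checkpoints. Next time I want to continue fine-tuning from that .pt. How do you usually do this? ### Additional context After finishing LoRA fine-tuning, I picked a fairly good .pt from the checkpoints. Next time I want to continue fine-tuning from that .pt. How do you usually do this?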

A way to avoid running out of 24 GB of GPU memory

Test with the official code: (python3.8) [baichuan@localhost baichuan-7B]$ python3 generate.py The model weights are not tied. Please use the `tie_weights` method before using the `infer_auto_device` function. 登鹳雀楼->王之涣夜雨寄北->李商隐过零丁洋->文天祥己亥杂诗(其五)->龚自珍

question

My corpus is very large; can I split it into multiple train.json files?

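My corpus is very large; can I split it into multiple train.json files? If so, are the train.json files trained one after another? I tried this: after training on the first file I continued with the second, and the model file size did not change.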
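
The size of a saved weights file is determined by the number of parameters, not by how much data was seen, so an unchanged file size does not by itself show that the second file was ignored. If training on the shards one by one is inconvenient, one option is to merge them before training. The sketch below assumes each file holds a top-level JSON list of samples; the helper name `merge_json_lists` and the `train_*.json` naming are hypothetical.

```python
import json
from pathlib import Path
from typing import List


def merge_json_lists(paths: List[Path]) -> list:
    """Concatenate the top-level lists stored in several JSON files."""
    merged = []
    for path in paths:
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        if not isinstance(data, list):
            raise ValueError(f"{path} does not contain a JSON list")
        merged.extend(data)
    return merged


if __name__ == "__main__":
    # Hypothetical shard names; adjust the glob to the real layout.
    shards = sorted(Path(".").glob("train_*.json"))
    samples = merge_json_lists(shards)
    print(f"{len(samples)} samples from {len(shards)} files")
```

The merged list can then be written back to a single train.json or passed directly to whatever dataset loader the training script uses.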

Could someone please take a look at this error?

After finishing LoRA fine-tuning, I tested the model. Calling model.eval() raises an error: File "/home/thudm/.local/lib/python3.7/site-packages/peft/tuners/lora.py", line 420, in train groups=sum(self.enable_lora), RuntimeError: Expected 4-dimensional input for 4-dimensional weight [8192, 8, 1, 1], but got 3-dimensional input of size [1, 16, 4096]...

Training had no effect; after I replaced the contents of data2, the following error was raised.

/MyTrainer.py", line 819, in _get_train_sampler return RandomSampler(self.train_dataset, generator=generator) File "/home/thudm/.local/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 108, in __init__ "value, but got num_samples={}".format(self.num_samples)) ValueError: num_samples should be a positive integer value, but got num_samples=0
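
The `ValueError` above is raised by `RandomSampler` when the training dataset has length 0, so the replaced data most likely produced no samples. A minimal pre-flight check is sketched below, assuming the training data is a JSON file holding a top-level list; the function names and that file layout are hypothetical, and a dataset can still end up empty later if preprocessing filters every record out.

```python
import json
from pathlib import Path


def count_samples(path: Path) -> int:
    """Return the number of records in a JSON file holding a top-level list."""
    with path.open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"{path} must contain a JSON list of samples")
    return len(data)


def require_non_empty(path: Path) -> int:
    """Fail early, with a clearer message, if the file has no samples."""
    n = count_samples(path)
    if n == 0:
        raise ValueError(f"{path} contains 0 samples; the sampler would reject it")
    return n
```

Calling `require_non_empty` on the data file before building the trainer surfaces the problem at the data-loading step rather than inside the sampler.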