
[Help] How much data did you use with P-Tuning to get good results?

Open xv994 opened this issue 1 year ago • 7 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

I'd like to ask those of you who have already fine-tuned: roughly how many training examples does P-Tuning need to achieve good results?

Expected Behavior

No response

Steps To Reproduce

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response

xv994 avatar Apr 20 '23 02:04 xv994

I tried 20K poetry-generation examples, and the results were already quite good.

white-wolf-tech avatar Apr 20 '23 03:04 white-wolf-tech

How do you solve the forgetting problem?

After training with the standard P-Tuning code, the forgetting problem is severe:

```python
import os
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

CHECKPOINT_PATH = "output/adgen-chatglm-6b-pt-128-2e-2/checkpoint-3000"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("chatglm6b", trust_remote_code=True)

# Load the base model with the prefix encoder enabled
# (pre_seq_len must match the value used during training)
config = AutoConfig.from_pretrained("chatglm6b", trust_remote_code=True, pre_seq_len=128)
model = AutoModel.from_pretrained("chatglm6b", config=config, trust_remote_code=True)

# Load only the trained prefix-encoder weights from the P-Tuning checkpoint
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
new_prefix_state_dict = {}
for k, v in prefix_state_dict.items():
    if k.startswith("transformer.prefix_encoder."):
        new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

# Quantize to 4 bit, keeping the prefix encoder in fp32
print("Quantized to 4 bit")
model = model.quantize(4)
model = model.half().cuda()
model.transformer.prefix_encoder.float()
model = model.eval()

# "糍粑鱼怎么做" = "How do you cook ciba fish?"
response, history = model.chat(tokenizer, "糍粑鱼怎么做", history=[])
print(response, history)
```

The output is absurd:

('采用<UNK>材料,使糍粑鱼表面变得光滑,让整条裙子看起来更加美观,同时能够彰显你的个性。整体设计简洁大方,让裙子看起来十分大气。', [('糍粑鱼怎么做', '采用<UNK>材料,使糍粑鱼表面变得光滑,让整条裙子看起来更加美观,同时能够彰显你的个性。整体设计简洁大方,让裙子看起来十分大气。')])

(Asked how to cook ciba fish, the model answers with clothing-advertisement copy in the style of the ADGEN fine-tuning data: roughly, "Made of <UNK> material, it gives the ciba fish a smooth surface and makes the whole dress look more beautiful…")

liaoweiguo avatar Apr 20 '23 05:04 liaoweiguo

> I tried 20K poetry-generation examples, and the results were already quite good.

During training, how did you set max_target_length relative to your text length?

xv994 avatar Apr 20 '23 06:04 xv994

> How do you solve the forgetting problem? After training with the standard P-Tuning code, the forgetting problem is severe. […]

Fine-tuning with P-Tuning leaves the model able to answer only the trained task, and its answers on other tasks degrade. You could try LoRA fine-tuning instead.

xv994 avatar Apr 20 '23 06:04 xv994
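For anyone wanting to try the LoRA route, here is a minimal setup sketch using the Hugging Face peft library; the hyperparameters and the target_modules name are assumptions, not settings reported in this thread:

```python
# Minimal LoRA setup sketch with the `peft` library.
# r, lora_alpha, lora_dropout, and target_modules are assumed values.
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("chatglm6b", trust_remote_code=True)
model = AutoModel.from_pretrained("chatglm6b", trust_remote_code=True)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                 # rank of the low-rank update matrices
    lora_alpha=32,                       # scaling factor for the update
    lora_dropout=0.1,
    target_modules=["query_key_value"],  # ChatGLM-6B attention projection
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()       # only the LoRA adapters are trainable
```

Because only the small rank-decomposition matrices are trained while the base weights stay frozen, LoRA often preserves more general ability than prefix tuning, though as a later reply notes it does not eliminate forgetting.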

Has anyone tried LoRA fine-tuning? Does it work iteratively?

guarx avatar Apr 20 '23 07:04 guarx

> How do you solve the forgetting problem? After training with the standard P-Tuning code, the forgetting problem is severe. […]
>
> Fine-tuning with P-Tuning leaves the model able to answer only the trained task, and its answers on other tasks degrade. You could try LoRA fine-tuning instead.

LoRA has the same problem; link

jiayi37u avatar Apr 28 '23 08:04 jiayi37u
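One mitigation sometimes used against this kind of forgetting (an assumption here, not something reported in this thread) is to blend general-domain examples into the fine-tuning set so training sees more than one distribution; a minimal sketch, with hypothetical file names and mixing ratio:

```python
# Hypothetical mitigation sketch: mix general-domain data into the task data.
# File names and the 4:1 ratio are assumptions, not values from this thread.
import json
import random

with open("AdvertiseGen/train.json", encoding="utf-8") as f:
    task_data = [json.loads(line) for line in f]      # task-specific pairs
with open("general_chat.json", encoding="utf-8") as f:
    general_data = [json.loads(line) for line in f]   # hypothetical general data

mixed = task_data + random.sample(general_data, len(task_data) // 4)
random.shuffle(mixed)

with open("mixed_train.json", "w", encoding="utf-8") as f:
    for example in mixed:
        f.write(json.dumps(example, ensure_ascii=False) + "\n")
```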

> I tried 20K poetry-generation examples, and the results were already quite good.

How did you set the parameters in train.sh? I tried 1,000 examples and changed only the learning rate to 2e-5, leaving the other parameters unchanged, but the model learned almost nothing and can only answer with its original knowledge.

SSQiana avatar Oct 20 '23 01:10 SSQiana
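For reference, the ADGEN train.sh in the repository's ptuning example has roughly this shape; the values shown mirror the checkpoint path used earlier in the thread (pre_seq_len 128, lr 2e-2), but treat the dataset paths, column names, and sequence lengths as assumptions to check against your copy of the script:

```bash
# Values mirror the checkpoint path seen above (adgen-chatglm-6b-pt-128-2e-2);
# dataset paths, column names, and lengths are assumptions to adapt to your data.
PRE_SEQ_LEN=128
LR=2e-2

CUDA_VISIBLE_DEVICES=0 python3 main.py \
    --do_train \
    --train_file AdvertiseGen/train.json \
    --prompt_column content \
    --response_column summary \
    --model_name_or_path THUDM/chatglm-6b \
    --output_dir output/adgen-chatglm-6b-pt-$PRE_SEQ_LEN-$LR \
    --max_source_length 64 \
    --max_target_length 64 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --max_steps 3000 \
    --save_steps 1000 \
    --learning_rate $LR \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4
```

With only 1,000 examples and the other defaults kept, 3,000 steps at batch size 16 means the model sees each example dozens of times, so both underfitting (lr too low for prefix tuning) and overfitting are plausible; the repository's example keeps the much higher 2e-2 learning rate for the prefix parameters.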