ChatGLM-Efficient-Tuning issues

Can the same model be fine-tuned multiple times for multiple tasks, and are the abilities gained from earlier fine-tuning retained?

3

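As the title says: for completely different tasks, training a separate model for each one is fairly expensive. Is it feasible to train cumulatively on the same model? Are the previously learned abilities retained? If so, which fine-tuning method would you recommend? Thanks~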

Karenlyw

pending

loss is already very low, so why do the model's answers still differ so much from the label answers?

12

Training: accelerate launch src/train_sft.py --do_train --dataset adgen_train --finetuning_type lora --output_dir path_to_sft_checkpoint --per_device_train_batch_size 2 --gradient_accumulation_steps 4 --lr_scheduler_type cosine --logging_steps 290 --save_steps 290 --learning_rate 1e-3 --num_train_epochs 50 --fp16 --use_v2 --model_name_or_path THUDM/chatglm2-6b --quantization_bit 4...

shipengcheng1

pending

Unsatisfactory results

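I asked the model to do information extraction. It recognizes "9日前" ("9 days ago") but not "9日后" ("9 days later"), so the extraction is always incomplete. Even if I take the sentence containing "9日后" on its own and ask the model to extract from it again, it only extracts what was done and still misses the words "9日后". What is causing this?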

SolarKnight1

pending

--finetuning_type freeze error

2

When using `--finetuning_type freeze`, loading the result of the previous training run with `--checkpoint_dir output10/checkpoint-900` to continue training produces the following error ``` File "src/train_sft.py", line 106, in main() File "src/train_sft.py", line 25, in main model, tokenizer = load_pretrained(model_args, finetuning_args, training_args.do_train, stage="sft") File "/root/autodl-tmp/ChatGLM-Efficient-Tuning/src/utils/common.py", line 236, in...

qxde01

pending

ChatGLM2 p-tuning error. RuntimeError: The size of tensor a (247) must match the size of tensor b (231) at non-singleton dimension 3

4

Script: CUDA_VISIBLE_DEVICES=0 python ../src/train_sft.py \ --do_train \ --model_name_or_path ~/models/pretrain/chatglm2-6b \ --dataset alpaca_gpt4_zh \ --dataset_dir ../data \ --finetuning_type p_tuning \ --output_dir ../output/ \ --overwrite_cache \ --per_device_train_batch_size 4 \ --gradient_accumulation_steps 4 \...

lcl1990

pending

No weight files in the rm folder after RM training finishes

2

![企业微信截图_103c1903-f58e-467c-a371-f875a2dc8304](https://github.com/hiyouga/ChatGLM-Efficient-Tuning/assets/125811027/170edcfa-7948-4ac7-a5f3-989534e4fb32) For PPO training I have to pick one of the weights inside a checkpoint directory. How do I determine which one is the best?

chenxw321

pending

Severe repetitive generation when testing glm2 after training

5

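After training glm2, testing shows fairly severe repetitive generation. Is there a good way to fix this? Known conditions: 1. The py files of the glm2 model have been updated. 2. About 1,800 samples were used, with the default epoch count and learning rate, and web_demo.py was used for testing.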

xyfZzz

pending

full tuning runs out of GPU memory

2

Training on 4 GPUs: vicuna and baichuan both train normally, but chatglm2 runs out of GPU memory, even with max-length reduced to 128 and batch size=1.

chenkejin

pending

Differences between fine-tuning chatglm and chatglm2

3

https://openi.pcl.ac.cn/kewei/ChatGLM-6B/src/branch/main/lora/main-secret-new.ipynb The notebook above works very well for fine-tuning v1, but when I use it on v2 the results are poor. ![image](https://github.com/hiyouga/ChatGLM-Efficient-Tuning/assets/84905965/787e0ac1-3b36-438e-8a1a-7bcc9e12c449) ![image](https://github.com/hiyouga/ChatGLM-Efficient-Tuning/assets/84905965/fae269e8-c6a8-43be-bc40-efb5f75449c1) ![image](https://github.com/hiyouga/ChatGLM-Efficient-Tuning/assets/84905965/341c18b1-22b6-4f23-9b38-790411662593) Many parameters have become none. How have these parameters changed, and how does the notebook at https://openi.pcl.ac.cn/kewei/ChatGLM-6B/src/branch/main/lora/main-secret-new.ipynb need to be adapted for v2?

Ethan-Chen-plus

pending

support for optional 'history' field

1

jinsongpan

wontfix
