Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
Running `bash ds_train_finetune.sh` directly fails with the following error:
```
Traceback (most recent call last):
  File "/data/zhangsm/chatglm/ChatGLM2-6B/ptuning/main.py", line 411, in <module>
    main()
  File "/data/zhangsm/chatglm/ChatGLM2-6B/ptuning/main.py", line 350, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/data/zhangsm/chatglm/ChatGLM2-6B/ptuning/trainer.py", line 1635, in train
    return inner_training_loop(
  File "/data/zhangsm/chatglm/ChatGLM2-6B/ptuning/trainer.py", line 1704, in _inner_training_loop
    deepspeed_engine, optimizer, lr_scheduler = deepspeed_init(
TypeError: deepspeed_init() got an unexpected keyword argument 'resume_from_checkpoint'
```
Even after removing the `resume_from_checkpoint` argument from that call in trainer.py, running again hits a second error:
```
Traceback (most recent call last):
  File "/data/zhangsm/chatglm/ChatGLM2-6B/ptuning/main.py", line 411, in <module>
    main()
  File "/data/zhangsm/chatglm/ChatGLM2-6B/ptuning/main.py", line 350, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/data/zhangsm/chatglm/ChatGLM2-6B/ptuning/trainer.py", line 1635, in train
    return inner_training_loop(
  File "/data/zhangsm/chatglm/ChatGLM2-6B/ptuning/trainer.py", line 1704, in _inner_training_loop
    deepspeed_engine, optimizer, lr_scheduler = deepspeed_init(
  File "/data/zhangsm/anaconda3/envs/chatglm/lib/python3.10/site-packages/transformers/deepspeed.py", line 340, in deepspeed_init
    hf_deepspeed_config = trainer.accelerator.state.deepspeed_plugin.hf_ds_config
AttributeError: 'Seq2SeqTrainer' object has no attribute 'accelerator'
```
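Both tracebacks appear to share one root cause: in transformers 4.30 the Trainer's DeepSpeed integration was rebuilt on top of Accelerate, so `deepspeed_init()` dropped its `resume_from_checkpoint` argument and now reads `trainer.accelerator`, which the older Trainer code vendored in `ptuning/trainer.py` never creates. A minimal check of which signature your install exposes (a sketch; the `transformers.deepspeed` import path is taken from the traceback above and may move in later releases):

```python
# Sketch: inspect the installed deepspeed_init() to see whether it still
# accepts resume_from_checkpoint (the pre-4.30 behavior that the bundled
# trainer.py relies on). Import path taken from the traceback above.
import inspect

import transformers
from transformers.deepspeed import deepspeed_init

print("transformers version:", transformers.__version__)
params = inspect.signature(deepspeed_init).parameters
if "resume_from_checkpoint" in params:
    print("pre-4.30 signature: the bundled trainer.py should work")
else:
    print("4.30+ signature: expect the TypeError/AttributeError shown above")
```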
Expected Behavior
No response
Steps To Reproduce
- replace "THUDM/chatglm2-6b" with the local path "/data/zhangsm/chatglm/chatglm2-6b"
- bash ds_train_finetune.sh
Environment
- Python: 3.10
- Transformers: 4.30.2
- PyTorch: 1.13
- CUDA Support: True
Anything else?
No response
I ran into the same problem; it looks like the repo's bundled trainer.py is incompatible with newer versions of transformers.
transformers 4.29.2 still has the `resume_from_checkpoint` parameter.
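If a downgrade is acceptable, pinning a pre-4.30 release avoids both errors. A minimal guard you could run before launching training (a sketch; the 4.30 cutoff is inferred from the tracebacks above, not from any official compatibility table):

```python
# Sketch: fail fast if the installed transformers is too new for the
# bundled trainer.py. The 4.30 boundary is inferred from the errors above.
from packaging import version  # shipped as a dependency of transformers

import transformers

if version.parse(transformers.__version__) >= version.parse("4.30.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} is newer than the bundled "
        "trainer.py supports; try e.g. `pip install transformers==4.29.2`."
    )
```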
Can ds_train_finetune.sh be run on a single GPU? How much VRAM does it need?
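As for the single-GPU question, one quick sanity check is to see how much memory the card actually has before launching (a sketch; it only reports capacity, since the real requirement depends on ZeRO stage, batch size, and sequence length):

```python
# Sketch: report free/total memory on GPU 0. This tells you what you have,
# not what full fine-tuning of a 6B model will actually require.
import torch

if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info(0)  # bytes; available in torch >= 1.10
    print(f"GPU 0: {free / 2**30:.1f} GiB free of {total / 2**30:.1f} GiB")
else:
    print("No CUDA device visible")
```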
Friendly link: an instruction fine-tuning reproduction of Tsinghua's chatGLM2-6B model, open-sourced 2023-07-04 18:00:
https://github.com/THUDM/ChatGLM2-6B
It supports both GPU and CPU training, and accepts both the chatGLM-6B fine-tuning data format and the alpaca instruction-tuning data format.
That said, for CPU training I used unquantized fp32 fine-tuning, which is very memory-hungry, so I recommend training on a GPU instead.
I will work on editing and uploading the related documentation soon. Full support for the Tsinghua GLM community!