ChatGLM-6B [BUG/Help] <The whole pipeline runs end to end, but inference is too slow?>

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

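My GPU is a Tesla T4 with 16 GB of VRAM. I am fine-tuning with train.sh on the dataset you provide (54 MB in total), and training turns out to be extremely slow: 3000 steps take 7 hours. I also tried eval.sh, which needs 2 hours to run inference over the whole validation set. I suspect some parameter is not set correctly, because this is far too slow.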

Expected Behavior

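I would like a recipe for faster training, and a check on whether some parameter is misconfigured. If the parameters are not the problem, could you share a GPU configuration that gives normal inference and training speed? Needing 7 hours to train on 54 MB of data is far too slow. Previously, fine-tuning LLaMA 1.3B on 20 MB of English data finished in just 15-20 minutes.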

Steps To Reproduce

My training parameter configuration is as follows:

PRE_SEQ_LEN=128
LR=2e-2

CUDA_VISIBLE_DEVICES=0 python3 main.py \
    --do_train \
    --train_file AdvertiseGen/train.json \
    --validation_file AdvertiseGen/dev.json \
    --prompt_column content \
    --response_column summary \
    --overwrite_cache \
    --model_name_or_path THUDM/chatglm-6b \
    --output_dir output/adgen-chatglm-6b-pt-$PRE_SEQ_LEN-$LR \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 64 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --predict_with_generate \
    --max_steps 10 \
    --logging_steps 10 \
    --save_steps 10 \
    --learning_rate $LR \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4 \
    --cache_dir ./cache_dir
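
A quick back-of-the-envelope check of the reported training time may help separate a misconfiguration from expected cost. The sketch below assumes the 7-hour figure refers to 3000 optimizer steps run with the batch size and gradient accumulation shown in the command (the command as pasted sets --max_steps 10, so the 3000-step run presumably used a different value). The function name `training_throughput` is illustrative and not part of the repository.

```python
def training_throughput(total_hours, optimizer_steps,
                        per_device_batch_size, grad_accum_steps):
    """Derive per-step and per-sample timings from wall-clock training time."""
    total_seconds = total_hours * 3600
    seconds_per_step = total_seconds / optimizer_steps
    # One optimizer step processes batch_size * accumulation micro-batches.
    samples_per_step = per_device_batch_size * grad_accum_steps
    return {
        "seconds_per_optimizer_step": seconds_per_step,
        "samples_per_optimizer_step": samples_per_step,
        "seconds_per_sample": seconds_per_step / samples_per_step,
        "samples_seen": optimizer_steps * samples_per_step,
    }


# Numbers taken from the report: 7 hours, 3000 steps, batch size 1, accumulation 16.
stats = training_throughput(7, 3000, 1, 16)
print(stats)
```

Under these assumptions each optimizer step covers 16 forward/backward passes, so about 8.4 s per step corresponds to roughly 0.5 s per sample, and 3000 steps touch 48,000 samples.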

The eval parameters are as follows:

PRE_SEQ_LEN=128
CHECKPOINT=adgen-chatglm-6b-pt-128-2e-2
STEP=10

CUDA_VISIBLE_DEVICES=0 python3 main.py \
    --do_predict \
    --validation_file AdvertiseGen/dev.json \
    --test_file AdvertiseGen/dev.json \
    --overwrite_cache \
    --prompt_column content \
    --response_column summary \
    --model_name_or_path THUDM/chatglm-6b \
    --ptuning_checkpoint ./output/$CHECKPOINT/checkpoint-$STEP \
    --output_dir ./output/$CHECKPOINT \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 64 \
    --per_device_eval_batch_size 1 \
    --predict_with_generate \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4 \
    --cache_dir ./cache_dir
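
The 2-hour evaluation time can be turned into a per-example latency in the same way. This is only a sketch: the thread does not state how many examples dev.json contains, so `num_examples=1000` below is a placeholder to be replaced with the real count, and the function name `eval_latency` is hypothetical.

```python
def eval_latency(total_hours, num_examples, max_target_length):
    """Estimate generation latency per example and a per-token lower bound."""
    total_seconds = total_hours * 3600
    seconds_per_example = total_seconds / num_examples
    # If every example generated the full max_target_length tokens, this is
    # the time per token; shorter outputs mean each token took longer.
    min_seconds_per_token = seconds_per_example / max_target_length
    return seconds_per_example, min_seconds_per_token


# 2 hours and --max_target_length 64 come from the report;
# 1000 examples is a placeholder, not a measured value.
per_example, per_token = eval_latency(2, num_examples=1000, max_target_length=64)
print(f"{per_example:.2f} s per example, at least {per_token:.4f} s per token")
```

Because --per_device_eval_batch_size is 1, examples are generated one at a time, so the total time scales directly with the number of examples in the file.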

Environment

- OS: Ubuntu
- Python: 3.8
- Transformers: 4.27.1 (latest version)
- PyTorch:2.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :11.8
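
The CUDA Support field asks for the output of `torch.cuda.is_available()`, while the value given is a CUDA version. A minimal sketch for collecting both, along with the other entries in the list, could look like the following; the helper name `collect_environment` is illustrative, and the script falls back gracefully when a package is not installed.

```python
import platform
from importlib import metadata


def collect_environment(packages=("transformers", "torch")):
    """Gather OS, Python, package versions, and CUDA status into a dict."""
    report = {
        "os": platform.platform(),
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    try:
        import torch
        report["cuda_available"] = torch.cuda.is_available()
        # CUDA version PyTorch was built against (None for CPU-only builds).
        report["cuda_version"] = torch.version.cuda
    except ImportError:
        report["cuda_available"] = None
        report["cuda_version"] = None
    return report


env = collect_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```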

Anything else?

No response

Apr 20 '23 15:04 tomtang110

My tests lead to a similar conclusion: training with the original script takes at least 7 hours for roughly 3000 steps.

Apr 25 '23 05:04 mzh1996

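Same question. My training set has only 58 examples, with everything else unchanged, and P-tuning already takes 4 hours. I also tried the fine-tuning API offered for GPT-3.5, and their fine-tuning takes only a dozen or so seconds... I don't quite understand why they can be that fast.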

Apr 26 '23 10:04 tykuyh

Same question. After training, inference with the fine-tuned model is especially slow.

May 05 '23 07:05 sun1092469590

GPT-3.5? Isn't that computation running on OpenAI's side? How can a local run be compared with that?

May 09 '23 17:05 jingyuan2017
