During P-Tuning v2 fine-tuning the loss suddenly becomes abnormally large, then drops to 0
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
During P-Tuning v2 fine-tuning the loss suddenly becomes abnormally large, and immediately afterwards it drops to 0 and stays there. Could some bug be causing a gradient explosion? Lowering the learning rate all the way from 2e-2 down to 1e-3 did not help.
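For context, a minimal sketch of how this spike-then-zero pattern could be caught automatically, assuming the standard Hugging Face `Trainer` used by `ptuning/main.py` (the `LossSpikeGuard` callback name and the `spike_factor` threshold are made up for illustration):

```python
import math
from transformers import TrainerCallback

class LossSpikeGuard(TrainerCallback):
    """Hypothetical callback: flag loss spikes and stop when the loss hits 0 or NaN."""

    def __init__(self, spike_factor: float = 10.0):
        self.spike_factor = spike_factor  # how large a jump counts as a "spike"
        self.last_loss = None

    def on_log(self, args, state, control, logs=None, **kwargs):
        loss = (logs or {}).get("loss")
        if loss is None:
            return
        if loss == 0.0 or math.isnan(loss):
            # A zero or NaN loss right after a spike usually means the parameters
            # (or the fp16 loss scale) are already corrupted; stop and inspect.
            control.should_training_stop = True
        elif self.last_loss and loss > self.spike_factor * self.last_loss:
            print(f"loss spiked {self.last_loss:.4f} -> {loss:.4f} at step {state.global_step}")
        self.last_loss = loss
```

If used, it would be attached with `trainer.add_callback(LossSpikeGuard())` before `trainer.train()`.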
Expected Behavior
Steps To Reproduce
```shell
python main.py ^
    --do_train ^
    --do_eval ^
    --train_file D:/AI/ChatGLM-6B/ptuning/dataset/train.json ^
    --validation_file D:/AI/ChatGLM-6B/ptuning/dataset/test.json ^
    --prompt_column prompt ^
    --response_column response ^
    --model_name_or_path THUDM/chatglm-6b ^
    --output_dir D:/AI/ChatGLM-6B/ptuning/output/adgen-chatglm-6b-pt-700-80w-5e-3_resume ^
    --max_source_length 640 ^
    --max_target_length 320 ^
    --per_device_train_batch_size 1 ^
    --per_device_eval_batch_size 1 ^
    --gradient_accumulation_steps 16 ^
    --predict_with_generate ^
    --max_steps 30000 ^
    --logging_steps 50 ^
    --save_steps 1000 ^
    --learning_rate 1e-3 ^
    --pre_seq_len 700
```
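For reference, a small hypothetical helper (not part of the repo) that could be called right after `loss.backward()` and before the optimizer step, to confirm whether the gradients of the trainable parameters actually explode when the loss spikes:

```python
import torch

def grad_norm(model: torch.nn.Module) -> float:
    """Total L2 norm of the gradients of all trainable parameters.

    Hypothetical diagnostic: if this value blows up on the step where the loss
    spikes, the problem is a genuine gradient explosion rather than a logging issue.
    """
    total = 0.0
    for p in model.parameters():
        if p.requires_grad and p.grad is not None:
            total += p.grad.detach().float().norm(2).item() ** 2
    return total ** 0.5
```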
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
Anything else?
No response