Baichuan-7B
[Question] During continued pre-training, the loss stays around 2.2 — did the authors also see this during their pre-training phase?
Required prerequisites
- [X] I have read the documentation https://github.com/baichuan-inc/baichuan-7B/blob/HEAD/README.md.
- [X] I have searched the Issue Tracker and Discussions that this hasn't already been reported. (+1 or comment there if it has.)
- [X] Consider asking first in a Discussion.
Questions
I am doing continued pre-training myself using LoRA, with roughly 10 million trainable parameters and 1.2 million data samples, trained for 3 epochs. The loss decreases very little and stays around 2.2 the whole time. Is this normal? I have no prior NLP experience.
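For context on what that plateau means: a language-model cross-entropy loss of 2.2 (in nats) corresponds to a per-token perplexity of exp(2.2) ≈ 9, i.e. the model is effectively choosing among about 9 equally likely tokens at each step. A quick conversion:

```python
import math

def perplexity(loss: float) -> float:
    """Convert a natural-log cross-entropy loss to perplexity."""
    return math.exp(loss)

# A loss plateau around 2.2 is a perplexity of roughly 9.
print(perplexity(2.2))  # ≈ 9.03
print(perplexity(2.3))  # ≈ 9.97
```

Whether that is "good" depends heavily on the tokenizer and the domain of the 1.2M-sample corpus, so the absolute value alone does not tell you whether training has stalled.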
Checklist
- [X] I have provided all relevant and necessary information above.
- [X] I have chosen a suitable title for this issue.
Have you found the cause?
I ran into the same problem — the loss oscillates around 2.3.
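One sanity check worth doing is confirming the "~10 million trainable parameters" figure, since a too-small LoRA adapter is one common reason the loss barely moves. Below is a rough estimate under assumed shapes (hidden size 4096, 32 layers, LoRA on a fused 4096→12288 attention projection with rank 16 — these numbers are illustrative assumptions, not taken from the issue):

```python
# Assumed model shapes for illustration only.
hidden = 4096   # assumed hidden size of a 7B-class model
layers = 32     # assumed number of transformer layers
rank = 16       # assumed LoRA rank

def lora_params(in_dim: int, out_dim: int, r: int) -> int:
    """Each LoRA adapter adds A (r x in_dim) and B (out_dim x r)."""
    return r * in_dim + out_dim * r

# LoRA on one fused attention projection (4096 -> 3*4096) per layer.
per_layer = lora_params(hidden, 3 * hidden, rank)
total = layers * per_layer
print(total)  # 8388608 — about 8.4M, the same order as the ~10M reported
```

If your actual trainable-parameter count is far below this order of magnitude, or the learning rate is very small, a near-flat loss over 3 epochs would not be surprising.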