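[Badcase]: With the same data, the fine-tuning loss on the Qwen2.5 72B pretrained model is 3x that on Qwen2 72B. Apart from the larger amount of training data, what else is different in 2.5?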
Model Series
Qwen2.5
What are the models used?
Qwen2.5-72B pretrained model
What is the scenario where the problem happened?
Text-to-SQL
Is this badcase known and can it be solved using available techniques?
- [X] I have followed the GitHub README.
- [X] I have checked the Qwen documentation and cannot find a solution there.
- [X] I have checked the documentation of the related framework and cannot find useful information.
- [X] I have searched the issues and there is not a similar one.
Information about environment
OS: Ubuntu 22.04
Python: Python 3.11
GPUs: 8 x NVIDIA A100
Description
As described in the title.