Qwen
[BUG] After pulling the latest 14B model, fine-tuning with the same code yields a noticeably higher loss than before. Were any changes made?
Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
Is there an existing answer for this in the FAQ?
- [X] I have searched the FAQ
Current Behavior
No response
Expected Behavior
No response
Steps To Reproduce
No response
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):
Anything else?
No response
> Were any changes made?

For 14B, the model weights themselves have not been updated. There were some code changes to unify behavior across models, mainly speed-related optimizations. These changes should not affect training results. Could you share roughly how much the loss increased?
Previously, the loss after training was around 0.001x; after several recent runs it is around 0.03x.
Could you share what server configuration you ran this on? Thanks.
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.