Qwen2.5
[Badcase]: Loss does not drop when using Liger Kernel with Qwen2.5
Has this been raised before?
- [X] I have checked the GitHub README.
- [X] I have checked the Qwen documentation and cannot find an answer there.
- [X] I have searched the issues and there is not a similar one.
- [X] I confirm that this is not a bug report, a feature request, or a badcase.
Description
Hello,
In my case:
I am trying to instruction-tune Qwen2.5-14B-Instruct with Liger Kernel.
Here is my question:
I know that Liger Kernel is supported in the dev version of Hugging Face Transformers. However, when I train the Qwen2.5 model with Liger Kernel, the loss does not decrease. Is Liger Kernel not yet supported for Qwen2.5?