gpt-fast

Does it support inference acceleration for Qwen-14B?

Open dashi6174 opened this issue 1 year ago • 4 comments

Qwen-14B: https://github.com/QwenLM/Qwen

dashi6174 avatar Dec 04 '23 08:12 dashi6174

It's similar to the llama architecture, so it should be easy to modify model.py to support it.

Chillee avatar Dec 04 '23 21:12 Chillee

I have tested it with Qwen-1.8B on an RTX 2080, and inference is about twice as fast as the original (50 tok/s originally vs. ~100 tok/s accelerated), which is fascinating. Since the Qwen series shares the same architecture, I think it should work for Qwen-14B as well.
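To make the comparison concrete, here is a minimal sketch of how the two throughput figures relate. The helper names and the token counts and timings (500 and 1000 tokens over 10 seconds) are hypothetical, chosen only to reproduce the 50 tok/s and ~100 tok/s figures quoted above; they are not measurements from this thread.

```python
def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Throughput of a generation run, in tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


def speedup(baseline_tps: float, accelerated_tps: float) -> float:
    """How many times faster the accelerated run is than the baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return accelerated_tps / baseline_tps


# Hypothetical runs that match the throughput figures quoted in the comment.
baseline = tokens_per_second(500, 10.0)      # original implementation
accelerated = tokens_per_second(1000, 10.0)  # accelerated implementation

print(f"baseline:    {baseline:.0f} tok/s")
print(f"accelerated: {accelerated:.0f} tok/s")
print(f"speedup:     {speedup(baseline, accelerated):.1f}x")
```

When timing real runs, it is usually worth excluding the first generation, since compilation and warm-up can dominate it.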

DongqiShen avatar Dec 07 '23 03:12 DongqiShen

Thanks, I will give it a try.

dashi6174 avatar Dec 08 '23 06:12 dashi6174

@dashi6174 https://github.com/DongqiShen/qwen-fast

DongqiShen avatar Dec 08 '23 15:12 DongqiShen