Qwen3 icon indicating copy to clipboard operation
Qwen3 copied to clipboard

Qwen1.5-7B-base模型推理速度比Qwen1.5-7B-chat模型速度慢很多

Open waltonfuture opened this issue 2 years ago • 3 comments

如题。我使用的huggingface格式的推理。请问是什么原因呢?

waltonfuture avatar Mar 08 '24 03:03 waltonfuture

我也是,我感觉不是7b-base慢了,Qwen1.5-7B-chat变快了

endNone avatar Mar 10 '24 04:03 endNone

求教哪里有qwen1.5-base的推理脚本

xyw7 avatar Mar 11 '24 08:03 xyw7

我是用fastchat的

endNone avatar Mar 13 '24 00:03 endNone

They are of same code and same model arch. Check the generation length of them. For chat models, it will automatically stop with the generation of im_end token, so probably its generation length is much shorter

JustinLin610 avatar Apr 21 '24 09:04 JustinLin610