Qwen3
Qwen1.5-7B-base inference is much slower than Qwen1.5-7B-chat
As the title says. I'm running inference with the HuggingFace-format weights. What could be the cause?
Same here. My impression is that 7B-base didn't get slower; rather, Qwen1.5-7B-chat got faster.
Could anyone point me to an inference script for qwen1.5-base?
I'm using FastChat.
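If you just want plain HuggingFace inference for the base model, a minimal sketch looks like the following (the prompt and `max_new_tokens` value are placeholders; this assumes the standard `transformers` generation API):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate(model_name: str = "Qwen/Qwen1.5-7B",
             prompt: str = "The capital of France is",
             max_new_tokens: int = 128) -> str:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Base models have no chat template and no <|im_end|> stop token,
    # so generation typically runs until max_new_tokens is reached.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate())
```

Note that a base model will happily continue the prompt rather than answer it, which is expected behavior without chat fine-tuning.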
They share the same code and the same model architecture. Check their generation lengths: a chat model stops automatically once it generates the im_end token, so its generation length is probably much shorter, while the base model tends to run until the max_new_tokens limit.
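One way to check this explanation is to normalize for generation length instead of comparing wall-clock time. A toy sketch with made-up numbers:

```python
def tokens_per_second(num_tokens: int, total_seconds: float) -> float:
    """Throughput normalized by generation length."""
    return num_tokens / total_seconds

# Hypothetical measurements: the chat model stops at <|im_end|>
# after 50 tokens; the base model runs to a 500-token limit.
chat_speed = tokens_per_second(50, 2.5)    # 2.5 s of wall-clock time
base_speed = tokens_per_second(500, 25.0)  # 25 s of wall-clock time
print(chat_speed, base_speed)  # identical per-token speed despite a 10x wall-clock gap
```

If the per-token throughput matches between the two models, the slowdown is just the longer generation, not the model itself.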