[Bug] InternVL3-14B-AWQ vllm部署报错 KeyError: 'layers.44.mlp.down_proj.qweight'
Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [x] 2. The bug has not been fixed in the latest version.
- [x] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
Deploying the quantized InternVL3-14B-AWQ model fails with KeyError: 'layers.44.mlp.down_proj.qweight'. The error occurs on both vLLM 0.7.3 and 0.8.4. Model weights: https://modelscope.cn/models/OpenGVLab/InternVL3-14B-AWQ/summary
Reproduction
vllm serve /xxxx/InternVL3-14B-AWQ \
  --dtype auto \
  --port 8000 \
  --limit_mm_per_prompt image=4 \
  --max_model_len 8784 \
  --gpu_memory_utilization 0.45 \
  --trust-remote-code
Environment
vllm 0.7.3/0.8.4
torch 2.5.1
Error traceback
On vLLM 0.7.3, modifying the command to `vllm serve /xxxx/InternVL3-14B-AWQ --dtype half --port 8000 --limit_mm_per_prompt image=4 --max_model_len 8784 --gpu_memory_utilization 0.45 --trust-remote-code --quantization awq` (i.e., adding `--quantization awq` and using `--dtype half`) deploys successfully. On vLLM 0.8.4, the server also starts with this command, but it cannot handle requests; they hang with no response.
I ran into the same issue. It appears to be caused by the `params_dict` obtained from `nn.Module.named_parameters()` not matching the model's actual layer structure.
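A minimal sketch of the suspected mismatch (toy module, hypothetical checkpoint keys): AWQ checkpoints store quantized tensors under names such as `...down_proj.qweight` / `qzeros` / `scales`, while `named_parameters()` on a module built with plain (un-quantized) linear layers only yields `.weight` entries. A weight loader that indexes `params_dict` by checkpoint key then raises exactly this KeyError:

```python
import torch.nn as nn

# Toy stand-in for one MLP block; the real model has many such layers.
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.down_proj = nn.Linear(8, 4, bias=False)

model = nn.ModuleDict({"mlp": MLP()})
params_dict = dict(model.named_parameters())

# Names an AWQ checkpoint would ship (hypothetical, mirroring the error message).
checkpoint_keys = [
    "mlp.down_proj.qweight",
    "mlp.down_proj.qzeros",
    "mlp.down_proj.scales",
]

# The model only exposes "mlp.down_proj.weight", so every quantized key is missing;
# indexing params_dict[name] for any of them would raise KeyError.
missing = [k for k in checkpoint_keys if k not in params_dict]
print(missing)
```

This is why forcing `--quantization awq` can help: it makes vLLM build the model with quantized linear layers whose parameter names match the checkpoint, instead of the default (un-quantized) layout.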