MiniCPM-V
[vllm] KeyError: 'llm.layers.0.mlp.down_proj.weight' when running MiniCPM-V-2_6-int4
Start Date
8/16/2024
Implementation PR
No response
Reference Issues
No response
Summary
Model weights fail to load when running MiniCPM-V-2_6-int4.
vLLM environment: Docker image vllm/vllm-openai:v0.5.4, with pip install bitsandbytes==0.43.3.
Model downloaded from: git clone https://www.modelscope.cn/OpenBMB/MiniCPM-V-2_6-int4.git
Basic Example
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

MODEL_NAME = "models/MiniCPM-V-2_6-int4"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
llm = LLM(
    model=MODEL_NAME,
    gpu_memory_utilization=1,
    trust_remote_code=True,
    max_model_len=2048,
    enforce_eager=True,
)
Drawbacks
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/tangent/AIChat/engines/OpenBMB/MiniCPM-V/code_vllm/local_try.py", line 13, in <module>
[rank0]: llm = LLM(
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py", line 158, in __init__
[rank0]: self.llm_engine = LLMEngine.from_engine_args(
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 445, in from_engine_args
[rank0]: engine = cls(
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 249, in __init__
[rank0]: self.model_executor = executor_class(
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/executor/executor_base.py", line 47, in __init__
[rank0]: self._init_executor()
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/executor/gpu_executor.py", line 36, in _init_executor
[rank0]: self.driver_worker.load_model()
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 139, in load_model
[rank0]: self.model_runner.load_model()
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 722, in load_model
[rank0]: self.model = get_model(model_config=self.model_config,
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/__init__.py", line 21, in get_model
[rank0]: return loader.load_model(model_config=model_config,
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 327, in load_model
[rank0]: model.load_weights(
[rank0]: File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/minicpmv.py", line 685, in load_weights
[rank0]: param = params_dict[name]
[rank0]: KeyError: 'llm.layers.0.mlp.down_proj.weight'
Unresolved questions
No response
Hi, you are currently running a bitsandbytes (bnb) quantized model with vLLM; bnb-quantized models are not supported for vLLM inference.
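A quick way to confirm a checkpoint is bnb-quantized is to look for a quantization_config section in its config.json. The sketch below uses an assumed inline example that mirrors what transformers typically writes for a 4-bit bitsandbytes checkpoint; it is not the actual contents of the MiniCPM-V-2_6-int4 config file:

```python
import json

# Assumed example config fragment; real bitsandbytes checkpoints saved via
# transformers include a "quantization_config" entry like this one.
config_text = """
{
  "model_type": "minicpmv",
  "quantization_config": {
    "quant_method": "bitsandbytes",
    "load_in_4bit": true,
    "bnb_4bit_quant_type": "nf4"
  }
}
"""

config = json.loads(config_text)
quant = config.get("quantization_config")

# A checkpoint with quant_method == "bitsandbytes" was saved pre-quantized,
# which is what this vLLM version's weight loader cannot handle.
if quant is not None and quant.get("quant_method") == "bitsandbytes":
    print("bitsandbytes-quantized checkpoint detected")
else:
    print("no bnb quantization detected")
```

If the check matches, using the unquantized checkpoint (or a quantization scheme your vLLM version supports) avoids the KeyError during load_weights.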