MiniCPM-V
[BUG] web_demos/minicpm-o_2.6/model_server.py fails to load MiniCPM-o-2.6-Int4
Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
Is there an existing answer for this in FAQ?
- [X] I have searched the FAQ
Current Behavior
(MiniCPMO) D:\LLM\MiniCPM-o>python web_demos/minicpm-o_2.6/model_server.py
Traceback (most recent call last):
  File "D:\LLM\MiniCPM-o\web_demos\minicpm-o_2.6\model_server.py", line 601, in <module>
    stream_manager = StreamManager()
  File "D:\LLM\MiniCPM-o\web_demos\minicpm-o_2.6\model_server.py", line 96, in __init__
    self.minicpmo_model = AutoModel.from_pretrained(self.minicpmo_model_path, trust_remote_code=True, torch_dtype=self.target_dtype, attn_implementation='sdpa', low_cpu_mem_usage=True)
  File "E:\Programming\pycodes\miniconda3\envs\MiniCPMO\lib\site-packages\transformers\models\auto\auto_factory.py", line 559, in from_pretrained
    return model_class.from_pretrained(
  File "E:\Programming\pycodes\miniconda3\envs\MiniCPMO\lib\site-packages\transformers\modeling_utils.py", line 3738, in from_pretrained
    if metadata.get("format") == "pt":
AttributeError: 'NoneType' object has no attribute 'get'
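The AttributeError arises because `from_pretrained` reads the safetensors header metadata and gets `None` back, so the `metadata.get("format")` check in modeling_utils.py fails. A minimal diagnostic sketch, assuming the checkpoint shards are available locally (the file name below is a placeholder, not a real path from this issue):

# If this prints None, the shard carries no {"format": "pt"} metadata,
# which is exactly what trips the check in modeling_utils.py above.
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:  # placeholder path
    print(f.metadata())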
Expected Behavior
The model loads normally.
Steps To Reproduce
No response
Environment
- OS: Windows 11 24H2
- Python: 3.10.16
- Transformers: 4.44.2
- PyTorch: 2.2.0+cu121
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):
(MiniCPMO) D:\LLM\MiniCPM-o>python -c "import torch; print(torch.version.cuda)"
12.1
(CUDA 12.4 is actually installed; torch.version.cuda reports the CUDA version PyTorch was built against, not the system toolkit.)
Anything else?
No response
https://huggingface.co/openbmb/MiniCPM-o-2_6-int4/discussions/1
Looking forward to a detailed README today :)
Hi, you can follow the steps in this README to install AutoGPTQ and perform int4 quantized inference.
Which file should be modified?
The official solution works well: https://huggingface.co/openbmb/MiniCPM-o-2_6-int4
First, install AutoGPTQ from the minicpmo branch:
git clone https://github.com/OpenBMB/AutoGPTQ.git
cd AutoGPTQ
git checkout minicpmo
# install AutoGPTQ
pip install -vvv --no-build-isolation -e .
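A quick way to confirm that the editable install is the one Python actually imports (a sanity-check sketch, not part of the original instructions):

# Verify auto_gptq resolves to the local editable checkout
import auto_gptq
print(auto_gptq.__version__, auto_gptq.__file__)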
Then you only need to modify the model-loading code:
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Load the GPTQ checkpoint through AutoGPTQ instead of transformers'
# AutoModel; the exllama kernels are disabled per the int4 model card.
model = AutoGPTQForCausalLM.from_quantized(
    'openbmb/MiniCPM-o-2_6-int4',
    torch_dtype=torch.bfloat16,
    device="cuda:0",
    trust_remote_code=True,
    disable_exllama=True,
    disable_exllamav2=True
)
tokenizer = AutoTokenizer.from_pretrained(
    'openbmb/MiniCPM-o-2_6-int4',
    trust_remote_code=True
)
model.init_tts()
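For the web demo this issue is about, the same substitution goes into web_demos/minicpm-o_2.6/model_server.py: replace the AutoModel.from_pretrained call in StreamManager.__init__ (line 96 in the traceback). A hedged sketch, reusing the attribute names visible in the traceback rather than the current repository source:

# Hypothetical patch to StreamManager.__init__; mirrors the loading
# snippet above. self.minicpmo_model_path and self.target_dtype are
# taken from the traceback and may differ in newer revisions.
self.minicpmo_model = AutoGPTQForCausalLM.from_quantized(
    self.minicpmo_model_path,
    torch_dtype=self.target_dtype,
    device='cuda:0',
    trust_remote_code=True,
    disable_exllama=True,
    disable_exllamav2=True
)

init_tts() must still be called on the loaded model afterwards so the TTS module is initialized, as in the snippet above.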