MiniCPM-V
[BUG] web_demos/minicpm-o_2.6/model_server.py fails to load MiniCPM-o-2.6-Int4
Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
Is there an existing answer for this in FAQ?
- [X] I have searched the FAQ
Current Behavior
(MiniCPMO) D:\LLM\MiniCPM-o>python web_demos/minicpm-o_2.6/model_server.py
Traceback (most recent call last):
  File "D:\LLM\MiniCPM-o\web_demos\minicpm-o_2.6\model_server.py", line 601, in <module>
    stream_manager = StreamManager()
  File "D:\LLM\MiniCPM-o\web_demos\minicpm-o_2.6\model_server.py", line 96, in __init__
    self.minicpmo_model = AutoModel.from_pretrained(self.minicpmo_model_path, trust_remote_code=True, torch_dtype=self.target_dtype, attn_implementation='sdpa', low_cpu_mem_usage=True)
  File "E:\Programming\pycodes\miniconda3\envs\MiniCPMO\lib\site-packages\transformers\models\auto\auto_factory.py", line 559, in from_pretrained
    return model_class.from_pretrained(
  File "E:\Programming\pycodes\miniconda3\envs\MiniCPMO\lib\site-packages\transformers\modeling_utils.py", line 3738, in from_pretrained
    if metadata.get("format") == "pt":
AttributeError: 'NoneType' object has no attribute 'get'
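The AttributeError arises because `from_pretrained` reads the safetensors header metadata and gets `None` back, so the `metadata.get("format")` check in modeling_utils.py fails. A minimal diagnostic sketch, assuming the checkpoint shards are available locally (the file name below is a placeholder, not a real path from this issue):

# If this prints None, the shard carries no {"format": "pt"} metadata,
# which is exactly what trips the check in modeling_utils.py above.
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:  # placeholder path
    print(f.metadata())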
Expected Behavior
The model loads normally.
Steps To Reproduce
No response
Environment
- OS: Windows 11 24H2
- Python: 3.10.16
- Transformers: 4.44.2
- PyTorch: 2.2.0+cu121
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):
(MiniCPMO) D:\LLM\MiniCPM-o>python -c "import torch; print(torch.version.cuda)"
12.1
(CUDA 12.4 is actually installed; torch.version.cuda reports the CUDA version PyTorch was built against, not the system toolkit.)
Anything else?
No response
https://huggingface.co/openbmb/MiniCPM-o-2_6-int4/discussions/1
Looking forward to a detailed README today :)
Hi, you can follow the steps in this README to install AutoGPTQ and perform int4 quantized inference.
Which file should be modified?
The official solution works well: https://huggingface.co/openbmb/MiniCPM-o-2_6-int4
First, install AutoGPTQ from the minicpmo branch:
git clone https://github.com/OpenBMB/AutoGPTQ.git
cd AutoGPTQ
git checkout minicpmo
# install AutoGPTQ
pip install -vvv --no-build-isolation -e .
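A quick way to confirm that the editable install is the one Python actually imports (a sanity-check sketch, not part of the original instructions):

# Verify auto_gptq resolves to the local editable checkout
import auto_gptq
print(auto_gptq.__version__, auto_gptq.__file__)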
Then you only need to modify the model-loading code:
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Load the GPTQ checkpoint through AutoGPTQ instead of transformers'
# AutoModel; the exllama kernels are disabled per the int4 model card.
model = AutoGPTQForCausalLM.from_quantized(
    'openbmb/MiniCPM-o-2_6-int4',
    torch_dtype=torch.bfloat16,
    device="cuda:0",
    trust_remote_code=True,
    disable_exllama=True,
    disable_exllamav2=True
)
tokenizer = AutoTokenizer.from_pretrained(
    'openbmb/MiniCPM-o-2_6-int4',
    trust_remote_code=True
)
model.init_tts()
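For the web demo this issue is about, the same substitution goes into web_demos/minicpm-o_2.6/model_server.py: replace the AutoModel.from_pretrained call in StreamManager.__init__ (line 96 in the traceback). A hedged sketch, reusing the attribute names visible in the traceback rather than the current repository source:

# Hypothetical patch to StreamManager.__init__; mirrors the loading
# snippet above. self.minicpmo_model_path and self.target_dtype are
# taken from the traceback and may differ in newer revisions.
self.minicpmo_model = AutoGPTQForCausalLM.from_quantized(
    self.minicpmo_model_path,
    torch_dtype=self.target_dtype,
    device='cuda:0',
    trust_remote_code=True,
    disable_exllama=True,
    disable_exllamav2=True
)

init_tts() must still be called on the loaded model afterwards so the TTS module is initialized, as in the snippet above.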