[Bug]: ValueError exception when running the inference example
Is there an existing issue?
- [X] I have searched, and there is no existing issue.
Describe the bug
Exception when running the example:
python inference.py --model_path <vllmcpm_repo_path> --prompt_path prompts/prompt_demo.txt
ValueError: The checkpoint you are trying to load has model type cpm_dragonfly but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
To Reproduce
Follow the quick setup guide to set up on Ubuntu 20.04 with Python 3.10, then run the example:
cd inference/vllm/examples/infer_cpm
python inference.py --model_path <vllmcpm_repo_path> --prompt_path prompts/prompt_demo.txt
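For reference, the failure can also be reproduced without vLLM by loading the config directly. This is only a sketch; <vllmcpm_repo_path> is a placeholder for the converted checkpoint directory, and the call mirrors the AutoConfig.from_pretrained call that vLLM makes internally:

from transformers import AutoConfig

# On recent transformers versions this raises:
#   ValueError: The checkpoint ... has model type `cpm_dragonfly` but Transformers
#   does not recognize this architecture.
config = AutoConfig.from_pretrained("<vllmcpm_repo_path>")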
Expected behavior
The example runs without errors.
Screenshots
No response
Environment
- OS: Ubuntu 20.04
- PyTorch: torch 2.2.0
- CUDA: 12.0
- Device: RTX 3080
Additional context
The patch below works around the problem:
diff --git a/inference/vllm/vllm/transformers_utils/config.py b/inference/vllm/vllm/transformers_utils/config.py
index 15ca432..625952c 100644
--- a/inference/vllm/vllm/transformers_utils/config.py
+++ b/inference/vllm/vllm/transformers_utils/config.py
@@ -38,7 +38,7 @@ def get_config(model: str,
raise RuntimeError(err_msg) from e
else:
raise e
- except KeyError as e:
+ except ValueError as e:
if os.path.exists(model):
config = {}
with open(f"{model}/config.json", 'r') as fin:
Hi,
did you convert the HF model files to the vLLM-based format following the tutorial in the README?
Yes
I get the error, too.
(bei_MiniCPM) [search@search-chatGLM-02 infer_cpm]$ python inference.py --model_path /data/search/bei/MiniCPM/vllmcpm_MiniCPM-2B-dpo-fp16 --prompt_path prompts/prompt_demo.txt
Traceback (most recent call last):
File "/home/search/miniconda3/envs/bei_MiniCPM/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1117, in from_pretrained
config_class = CONFIG_MAPPING[config_dict["model_type"]]
File "/home/search/miniconda3/envs/bei_MiniCPM/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 813, in __getitem__
raise KeyError(key)
KeyError: 'cpm_dragonfly'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/search/bei/MiniCPM/inference/vllm/examples/infer_cpm/inference.py", line 43, in <module>
llm = LLM(model=args.model_path, tensor_parallel_size=1, dtype='bfloat16')
File "/home/search/miniconda3/envs/bei_MiniCPM/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 93, in __init__
self.llm_engine = LLMEngine.from_engine_args(engine_args)
File "/home/search/miniconda3/envs/bei_MiniCPM/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 225, in from_engine_args
engine_configs = engine_args.create_engine_configs()
File "/home/search/miniconda3/envs/bei_MiniCPM/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 187, in create_engine_configs
model_config = ModelConfig(self.model, self.tokenizer,
File "/home/search/miniconda3/envs/bei_MiniCPM/lib/python3.10/site-packages/vllm/config.py", line 91, in __init__
self.hf_config = get_config(self.model, trust_remote_code, revision)
File "/home/search/miniconda3/envs/bei_MiniCPM/lib/python3.10/site-packages/vllm/transformers_utils/config.py", line 40, in get_config
raise e
File "/home/search/miniconda3/envs/bei_MiniCPM/lib/python3.10/site-packages/vllm/transformers_utils/config.py", line 28, in get_config
config = AutoConfig.from_pretrained(
File "/home/search/miniconda3/envs/bei_MiniCPM/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1119, in from_pretrained
raise ValueError(
ValueError: The checkpoint you are trying to load has model type `cpm_dragonfly` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
See https://github.com/OpenBMB/MiniCPM/issues/9; the transformers version seems to be the problem.
Sorry for the inconvenience, we will look into this bug very soon.
Sorry for the inconvenience, could you try transformers==4.34.0? It seems the error handling has changed in the current version of transformers. We will fix inference/vllm/vllm/transformers_utils/config.py soon.
It seems this problem is caused by an incorrect version of transformers. Please use transformers==4.34.0 and refer to #9.
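For anyone hitting this, a small sketch of a runtime check before constructing the LLM; the 4.34.0 pin is the version suggested in the comments above, and everything else here is illustrative:

import transformers

# Warn early if the installed transformers differs from the version this
# example is reported to work with (4.34.0, per the comments above).
EXPECTED_VERSION = "4.34.0"
if transformers.__version__ != EXPECTED_VERSION:
    print(
        f"Warning: transformers {transformers.__version__} detected; "
        f"the MiniCPM vLLM example expects transformers=={EXPECTED_VERSION}. "
        f"Install it with: pip install transformers=={EXPECTED_VERSION}"
    )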
