Version
vllm==0.10.0
Launch command
CUDA_VISIBLE_DEVICES=4,5,6,7 vllm serve /data/models/InternVL3_5-241B-A28B \
--tensor-parallel-size 4 \
--trust-remote-code
Error message
INFO 09-10 10:02:12 [__init__.py:235] Automatically detected platform cuda.
INFO 09-10 10:02:14 [api_server.py:1755] vLLM API server version 0.10.0
INFO 09-10 10:02:14 [cli_args.py:261] non-default args: {'model_tag': '/data/models/InternVL3_5-241B-A28B', 'model': '/data/models/InternVL3_5-241B-A28B', 'trust_remote_code': True, 'tensor_parallel_size': 4}
INFO 09-10 10:02:20 [config.py:1604] Using max model len 40960
INFO 09-10 10:02:20 [config.py:2434] Chunked prefill is enabled with max_num_batched_tokens=8192.
Traceback (most recent call last):
File "/home/youchangxin/miniconda3/envs/vllm-test/bin/vllm", line 7, in <module>
sys.exit(main())
^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/entrypoints/cli/main.py", line 54, in main
args.dispatch_function(args)
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/entrypoints/cli/serve.py", line 52, in cmd
uvloop.run(run_server(args))
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/uvloop/__init__.py", line 109, in run
return __asyncio.run(
^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/asyncio/runners.py", line 195, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/uvloop/__init__.py", line 61, in wrapper
return await main
^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 1791, in run_server
await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 1811, in run_server_worker
async with build_async_engine_client(args, client_config) as engine_client:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/contextlib.py", line 210, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 158, in build_async_engine_client
async with build_async_engine_client_from_engine_args(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/contextlib.py", line 210, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 194, in build_async_engine_client_from_engine_args
async_llm = AsyncLLM.from_vllm_config(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 163, in from_vllm_config
return cls(
^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 100, in __init__
self.tokenizer = init_tokenizer_from_configs(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/transformers_utils/tokenizer_group.py", line 111, in init_tokenizer_from_configs
return TokenizerGroup(
^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/transformers_utils/tokenizer_group.py", line 24, in __init__
self.tokenizer = get_tokenizer(self.tokenizer_id, **tokenizer_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/transformers_utils/tokenizer.py", line 259, in get_tokenizer
raise e
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/vllm/transformers_utils/tokenizer.py", line 238, in get_tokenizer
tokenizer = AutoTokenizer.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/youchangxin/miniconda3/envs/vllm-test/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py", line 1145, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.InternVL3_5-241B-A28B.configuration_internvl_chat.InternVLChatConfig'> to build an AutoTokenizer.
Model type should be one of Aimv2Config, AlbertConfig, AlignConfig, ArceeConfig, AriaConfig, AyaVisionConfig, BarkConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BitNetConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, Blip2Config, BloomConfig, BridgeTowerConfig, BrosConfig, CamembertConfig, CanineConfig, ChameleonConfig, ChineseCLIPConfig, ClapConfig, CLIPConfig, CLIPSegConfig, ClvpConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, ColPaliConfig, ColQwen2Config, ConvBertConfig, CpmAntConfig, CTRLConfig, Data2VecAudioConfig, Data2VecTextConfig, DbrxConfig, DebertaConfig, DebertaV2Config, DeepseekV2Config, DeepseekV3Config, DeepseekVLConfig, DeepseekVLHybridConfig, DiaConfig, DiffLlamaConfig, DistilBertConfig, DPRConfig, ElectraConfig, Emu3Config, ErnieConfig, Ernie4_5Config, Ernie4_5_MoeConfig, ErnieMConfig, EsmConfig, Exaone4Config, FalconConfig, FalconMambaConfig, FastSpeech2ConformerConfig, FlaubertConfig, FNetConfig, FSMTConfig, FunnelConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, Gemma3nConfig, Gemma3nTextConfig, GitConfig, GlmConfig, Glm4Config, Glm4MoeConfig, Glm4vConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GptOssConfig, GPTJConfig, GPTSanJapaneseConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeHybridConfig, GraniteMoeSharedConfig, GroundingDinoConfig, GroupViTConfig, HeliumConfig, HubertConfig, IBertConfig, IdeficsConfig, Idefics2Config, Idefics3Config, InstructBlipConfig, InstructBlipVideoConfig, InternVLConfig, JambaConfig, JanusConfig, JetMoeConfig, JukeboxConfig, Kosmos2Config, LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config, LEDConfig, LiltConfig, LlamaConfig, Llama4Config, Llama4TextConfig, LlavaConfig, LlavaNextConfig, LlavaNextVideoConfig, LlavaOnevisionConfig, LongformerConfig, LongT5Config, LukeConfig, LxmertConfig, M2M100Config, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, 
MegaConfig, MegatronBertConfig, MgpstrConfig, MiniMaxConfig, MistralConfig, MixtralConfig, MllamaConfig, MMGroundingDinoConfig, MobileBertConfig, ModernBertConfig, MoonshineConfig, MoshiConfig, MPNetConfig, MptConfig, MraConfig, MT5Config, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, NezhaConfig, NllbMoeConfig, NystromformerConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OmDetTurboConfig, OneFormerConfig, OpenAIGPTConfig, OPTConfig, Owlv2Config, OwlViTConfig, PaliGemmaConfig, PegasusConfig, PegasusXConfig, PerceiverConfig, PersimmonConfig, PhiConfig, Phi3Config, PhimoeConfig, Pix2StructConfig, PixtralVisionConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2_5OmniConfig, Qwen2_5_VLConfig, Qwen2AudioConfig, Qwen2MoeConfig, Qwen2VLConfig, Qwen3Config, Qwen3MoeConfig, RagConfig, RealmConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RetriBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, SeamlessM4TConfig, SeamlessM4Tv2Config, ShieldGemma2Config, SiglipConfig, Siglip2Config, SmolLM3Config, Speech2TextConfig, Speech2Text2Config, SpeechT5Config, SplinterConfig, SqueezeBertConfig, StableLmConfig, Starcoder2Config, SwitchTransformersConfig, T5Config, T5GemmaConfig, TapasConfig, TransfoXLConfig, TvpConfig, UdopConfig, UMT5Config, VideoLlavaConfig, ViltConfig, VipLlavaConfig, VisualBertConfig, VitsConfig, VoxtralConfig, Wav2Vec2Config, Wav2Vec2BertConfig, Wav2Vec2ConformerConfig, WhisperConfig, XCLIPConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, xLSTMConfig, XmodConfig, YosoConfig, ZambaConfig, Zamba2Config.
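
One thing that may be worth checking for this error: as far as I understand, `AutoTokenizer.from_pretrained` only falls back to looking up the model's config class (here `InternVLChatConfig`) when `tokenizer_config.json` in the model directory does not name a tokenizer, either through `tokenizer_class` or an `AutoTokenizer` entry in `auto_map`. The sketch below is a hypothetical helper (`inspect_tokenizer_config` is not part of vLLM or transformers) that only reports what that file contains. Whether a missing or incomplete file is the actual cause here is an assumption; the log above does not confirm it.

```python
import json
from pathlib import Path


def inspect_tokenizer_config(model_dir):
    """Report whether tokenizer_config.json exists and names a tokenizer.

    Hypothetical diagnostic helper; it reads a local file and does not
    import vLLM or transformers.
    """
    cfg_path = Path(model_dir) / "tokenizer_config.json"
    report = {
        "config_exists": cfg_path.is_file(),
        "tokenizer_class": None,
        "has_auto_map": False,
    }
    if report["config_exists"]:
        with cfg_path.open(encoding="utf-8") as f:
            cfg = json.load(f)
        # Name of the tokenizer class, if the file declares one.
        report["tokenizer_class"] = cfg.get("tokenizer_class")
        # Remote-code tokenizers are usually declared under auto_map.
        auto_map = cfg.get("auto_map") or {}
        report["has_auto_map"] = "AutoTokenizer" in auto_map
    return report


# Example call (path taken from the launch command above):
# print(inspect_tokenizer_config("/data/models/InternVL3_5-241B-A28B"))
```

If `config_exists` is `False`, or both `tokenizer_class` and `has_auto_map` are empty, comparing the local directory against the file list of the upstream model repository would be a reasonable next step.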