[Bug] There is no module or parameter named 'gating' in InternVLChatModel
### Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
### Describe the bug

Serving InternVL3.5-Flash checkpoints fails during model loading: the checkpoint contains `gating` tensors that vLLM's `InternVLChatModel` does not define, so `load_weights` raises a `ValueError` and the worker process dies.
```
(VllmWorker TP1 pid=1825663) INFO 09-30 16:54:31 [cuda.py:328] Using Flash Attention backend on V1 engine.
Loading safetensors checkpoint shards:   0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards:  25% Completed | 1/4 [00:00<00:02, 1.20it/s]
Loading safetensors checkpoint shards:  50% Completed | 2/4 [00:01<00:01, 1.09it/s]
Loading safetensors checkpoint shards:  75% Completed | 3/4 [00:02<00:00, 1.03it/s]
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559] WorkerProc failed to start.
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559] Traceback (most recent call last):
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/v1/executor/multiproc_executor.py", line 533, in worker_main
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]     worker = WorkerProc(*args, **kwargs)
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/v1/executor/multiproc_executor.py", line 402, in __init__
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]     self.worker.load_model()
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/v1/worker/gpu_worker.py", line 212, in load_model
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]     self.model_runner.load_model(eep_scale_up=eep_scale_up)
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1986, in load_model
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]     self.model = model_loader.load_model(
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]                  ^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/model_executor/model_loader/base_loader.py", line 49, in load_model
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]     self.load_weights(model, model_config)
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/model_executor/model_loader/default_loader.py", line 259, in load_weights
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]     loaded_weights = model.load_weights(
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]                      ^^^^^^^^^^^^^^^^^^^
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/model_executor/models/internvl.py", line 1387, in load_weights
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]     return loader.load_weights(weights)
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/model_executor/models/utils.py", line 291, in load_weights
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]     autoloaded_weights = set(self._load_module("", self.module, weights))
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/model_executor/models/utils.py", line 277, in _load_module
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559]     raise ValueError(msg)
(VllmWorker TP0 pid=1825662) ERROR 09-30 16:54:34 [multiproc_executor.py:559] ValueError: There is no module or parameter named 'gating' in InternVLChatModel
Loading safetensors checkpoint shards:  75% Completed | 3/4 [00:03<00:01, 1.15s/it]
(VllmWorker TP0 pid=1825662)
(VllmWorker TP0 pid=1825662) INFO 09-30 16:54:34 [multiproc_executor.py:520] Parent process exited, terminating worker
(VllmWorker TP1 pid=1825663) INFO 09-30 16:54:34 [multiproc_executor.py:520] Parent process exited, terminating worker
[rank0]:[W930 16:54:35.298473628 ProcessGroupNCCL.cpp:1479] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(EngineCore_0 pid=1825129) ERROR 09-30 16:54:36 [core.py:700] EngineCore failed to start.
(EngineCore_0 pid=1825129) ERROR 09-30 16:54:36 [core.py:700] Traceback (most recent call last):
(EngineCore_0 pid=1825129) ERROR 09-30 16:54:36 [core.py:700]   File "/data/miniconda/envs/iva2/lib/python3.11/site-packages/vllm/v1/engine/core.py", line 691, in run_engine_core
```
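For context, the failure happens because the Flash checkpoint ships tensors under a `gating` prefix that vLLM's `InternVLChatModel` never registers, so the auto weight loader cannot map them. A minimal sketch of that kind of prefix check (the helper and the tensor names below are illustrative, not taken from vLLM or the actual checkpoint):

```python
def find_unmatched_keys(checkpoint_keys, model_param_names):
    """Return checkpoint keys whose top-level prefix has no counterpart
    among the model's parameter names."""
    model_prefixes = {name.split(".")[0] for name in model_param_names}
    return sorted(k for k in checkpoint_keys
                  if k.split(".")[0] not in model_prefixes)

# Example: a checkpoint carrying extra 'gating' tensors (as the Flash
# variants appear to) against a model that only defines these modules.
ckpt = ["language_model.lm_head.weight",
        "gating.weight",
        "vision_model.embeddings.weight"]
model = ["language_model.lm_head.weight",
         "vision_model.embeddings.weight"]
print(find_unmatched_keys(ckpt, model))  # ['gating.weight']
```

Running a check like this against the checkpoint's `model.safetensors.index.json` keys should confirm which tensors the loader rejects.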
### Reproduction
```shell
python -m vllm.entrypoints.openai.api_server \
    --model ${MODEL_NAME} \
    --served-model-name ${SERVER_MODEL_NAME} \
    --host 0.0.0.0 \
    --port ${PORT} \
    -tp ${PARALLEL_SIZE} \
    --max-model-len 8000 \
    --enable-chunked-prefill \
    --gpu-memory-utilization 0.85 \
    --max-num-seqs 20 \
    --trust-remote-code
```
### Environment

- vLLM >= 0.10.1
- Models: **InternVL3_5-8B-Flash** / **InternVL3_5-14B-Flash**
### Error traceback

See the log under "Describe the bug" above.