
A qwen3 model downloaded from ModelScope fails to launch when configured with a local path; other models from the video work

Open · decaMinCow opened this issue 1 month ago · 1 comment

```
2025-11-12 21:15:43,818 xinference.core.worker 142 INFO    [request cdeb5224-c04f-11f0-8869-0242ac120008] Enter launch_builtin_model, args: <xinference.core.worker.WorkerActor object at 0x7fb4f7738ae0>, kwargs: model_uid=qwen3-0,model_name=qwen3,model_size_in_billions=8,model_format=ggufv2,quantization=Q4_K_M,model_engine=vLLM,model_type=LLM,n_gpu=auto,request_limits=None,peft_model_config=None,gpu_idx=None,download_hub=None,model_path=/data/modelscope/hub/unsloth/Qwen3-8B-GGUF/Qwen3-8B-Q4_K_M.gguf,xavier_config=None,enable_thinking=False,max_model_len=16384,gpu_memory_utilization=0.95
INFO 11-12 21:15:47 [init.py:239] Automatically detected platform cuda.
2025-11-12 21:15:51,025 xinference.core.model 3312 INFO    Start requests handler.
2025-11-12 21:15:59,032 xinference.model.utils 3312 WARNING Attempt 1 failed. Remaining attempts: 2
2025-11-12 21:16:07,035 xinference.model.utils 3312 WARNING Attempt 2 failed. Remaining attempts: 1
2025-11-12 21:16:15,039 xinference.model.utils 3312 WARNING Attempt 3 failed. Remaining attempts: 0
2025-11-12 21:16:15,041 xinference.core.worker 142 ERROR   Failed to load model qwen3-0
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the specified revision on the local disk. Please check your internet connection and try again.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/xinference/core/worker.py", line 1114, in launch_builtin_model
    await model_ref.load()
  File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/context.py", line 262, in send
    return self._process_result_message(result)
  File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/context.py", line 111, in _process_result_message
    raise message.as_instanceof_cause()
  File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/pool.py", line 689, in send
    result = await self._run_coro(message.message_id, coro)
  File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/pool.py", line 389, in _run_coro
    return await coro
  File "/usr/local/lib/python3.10/dist-packages/xoscar/api.py", line 418, in on_receive
    return await super().on_receive(message)  # type: ignore
  File "xoscar/core.pyx", line 564, in on_receive
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
  File "xoscar/core.pyx", line 527, in xoscar.core._BaseActor.on_receive
  File "xoscar/core.pyx", line 532, in xoscar.core._BaseActor.on_receive
  File "/opt/inference/xinference/core/model.py", line 476, in load
    await asyncio.to_thread(self._model.load)
  File "/usr/lib/python3.10/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/inference/xinference/model/llm/vllm/core.py", line 375, in load
    self._preprocess_load_gguf()
  File "/opt/inference/xinference/model/llm/vllm/core.py", line 604, in _preprocess_load_gguf
    path = cache_model_tokenizer_and_config(self.model_family, non_quant_spec)
  File "/opt/inference/xinference/model/llm/llm_family.py", line 415, in cache_model_tokenizer_and_config
    download_dir = retry_download(
  File "/opt/inference/xinference/model/utils.py", line 143, in retry_download
    raise RuntimeError(
RuntimeError: [address=0.0.0.0:40829, pid=3312] Failed to download model 'qwen3' (size: 8, format: pytorch) after multiple retries
2025-11-12 21:16:15,095 xinference.core.worker 142 ERROR   [request cdeb5224-c04f-11f0-8869-0242ac120008] Leave launch_builtin_model, error: [address=0.0.0.0:40829, pid=3312] Failed to download model 'qwen3' (size: 8, format: pytorch) after multiple retries, elapsed time: 31 s
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the specified revision on the local disk. Please check your internet connection and try again.
```
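The traceback points at the root cause: even though `model_path` is set, vLLM's GGUF path (`_preprocess_load_gguf`) still calls `cache_model_tokenizer_and_config`, which downloads the tokenizer and config of the *non-quantized* qwen3 repo from Hugging Face, and that download fails. A possible workaround (a sketch, assuming the machine can reach ModelScope) is to point Xinference's auxiliary downloads at ModelScope via the `XINFERENCE_MODEL_SRC` environment variable before starting the service:

```shell
# Sketch of a workaround, not a confirmed fix: make Xinference fetch
# auxiliary files (tokenizer/config) from ModelScope instead of Hugging Face.
export XINFERENCE_MODEL_SRC=modelscope
# Then start the local service as usual (host/port here are examples).
xinference-local --host 0.0.0.0 --port 9997
```

If the host has no network access at all, this will not help, because the vLLM GGUF path requires those extra files regardless of where the `.gguf` file itself lives.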

decaMinCow · Nov 13 '25 05:11

Are you running a GGUF model with vLLM? I'd suggest running it with llama.cpp instead.
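Following that suggestion, the failed launch could be retried with the llama.cpp engine instead of vLLM. A minimal sketch (flags mirror the parameters from the log above; exact flags may vary by Xinference version):

```shell
# Sketch: launch the same local GGUF file with the llama.cpp engine,
# which loads the .gguf directly and avoids the extra tokenizer/config
# download that the vLLM GGUF path performs.
xinference launch \
  --model-name qwen3 \
  --model-engine llama.cpp \
  --model-format ggufv2 \
  --size-in-billions 8 \
  --quantization Q4_K_M \
  --model-path /data/modelscope/hub/unsloth/Qwen3-8B-GGUF/Qwen3-8B-Q4_K_M.gguf
```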

qinxuye · Nov 14 '25 07:11