inference 无法正常启动模型

System Info / 系統信息

前端页面上找不到模型，应该是没有正常启动，postman也无法请求到模型

Running Xinference with Docker? / 是否使用 Docker 运行 Xinfernece？

[ ] docker / docker
[x] pip install / 通过 pip install 安装
[ ] installation from source / 从源码安装

Version info / 版本信息

os:centos7 version: vllm 0.8.4 xinference 1.5.0.post1

切换到1.4版本是正常启动的，不过这样的花GOT-OCR2 transformer版本问题就出现了

The command used to start Xinference / 用以启动 xinference 的命令

XINFERENCE_ENABLE_VIRTUAL_ENV=1 nohup xinference-local --host --port &

Apr 21 '25 09:04 GXKIM

是 GOT-OCR2 还是 vl 我看你有俩截图？

另外，贴一下服务端日志。

Apr 21 '25 09:04 qinxuye

都，俩模型都启动了

Xuye Qin @.***>于2025年4月21日周一17:25写道：

是 GOT-OCR2 还是 vl 我看你有俩截图？

另外，贴一下服务端日志。

— Reply to this email directly, view it on GitHub https://github.com/xorbitsai/inference/issues/3312#issuecomment-2818024689, or unsubscribe https://github.com/notifications/unsubscribe-auth/A56NGINCKTLIEIX5JU7SPLT22S2QBAVCNFSM6AAAAAB3Q5SUX6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDQMJYGAZDINRYHE . You are receiving this because you authored the thread.Message ID: @.***> qinxuye left a comment (xorbitsai/inference#3312) https://github.com/xorbitsai/inference/issues/3312#issuecomment-2818024689

是 GOT-OCR2 还是 vl 我看你有俩截图？

另外，贴一下服务端日志。

— Reply to this email directly, view it on GitHub https://github.com/xorbitsai/inference/issues/3312#issuecomment-2818024689, or unsubscribe https://github.com/notifications/unsubscribe-auth/A56NGINCKTLIEIX5JU7SPLT22S2QBAVCNFSM6AAAAAB3Q5SUX6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDQMJYGAZDINRYHE . You are receiving this because you authored the thread.Message ID: @.***>

Apr 21 '25 09:04 GXKIM

都看不到启动了吗？

Apr 21 '25 09:04 qinxuye

那是你之前有一个 9997 把。看下，把全部都停了重新试下。

Apr 21 '25 09:04 qinxuye

那是你之前有一个 9997 把。看下，把全部都停了重新试下。

###command XINFERENCE_ENABLE_VIRTUAL_ENV=1 nohup xinference-local --host 0.0.0.0 --port 8007 &

xinference launch --model-name GOT-OCR2_0 --model-type image --model_path /models/GOT-OCR2_0 --gpu-idx 1 -e http://xxxx:8007

###log The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/xinference/core/utils.py", line 93, in wrapped ret = await func(*args, **kwargs) File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/xinference/core/worker.py", line 1135, in launch_builtin_model await model_ref.load() File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/xoscar/backends/context.py", line 262, in send return self._process_result_message(result) File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/xoscar/backends/context.py", line 111, in _process_result_message raise message.as_instanceof_cause() File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/xoscar/backends/pool.py", line 689, in send result = await self._run_coro(message.message_id, coro) File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/xoscar/backends/pool.py", line 389, in _run_coro return await coro File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/xoscar/api.py", line 384, in on_receive return await super().on_receive(message) # type: ignore File "xoscar/core.pyx", line 564, in on_receive raise ex File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive async with self._lock: File "xoscar/core.pyx", line 527, in xoscar.core._BaseActor.on_receive with debug_async_timeout('actor_lock_timeout', File "xoscar/core.pyx", line 532, in xoscar.core._BaseActor.on_receive result = await result File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/xinference/core/model.py", line 471, in load await asyncio.to_thread(self._model.load) File "/data/miniconda3/envs/vllm/lib/python3.10/asyncio/threads.py", line 25, in to_thread return await loop.run_in_executor(None, func_call) File "/data/miniconda3/envs/vllm/lib/python3.10/concurrent/futures/thread.py", line 58, in run result = self.fn(*self.args, **self.kwargs) File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/xinference/model/image/ocr/got_ocr2.py", line 56, in load model = AutoModel.from_pretrained( File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 531, in from_pretrained config, kwargs = AutoConfig.from_pretrained( File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1123, in from_pretrained config_class = get_class_from_dynamic_module( File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 570, in get_class_from_dynamic_module return get_class_in_module(class_name, final_module, force_reload=force_download) File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 267, in get_class_in_module module_spec.loader.exec_module(module) File "", line 883, in exec_module File "", line 241, in _call_with_frames_removed File "/data/ai/user/.cache/huggingface/modules/transformers_modules/GOT-OCR2_0/modeling_GOT.py", line 1, in from transformers import Qwen2Config, Qwen2Model, Qwen2ForCausalLM, StoppingCriteria, TextStreamer File "", line 1075, in _handle_fromlist File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1956, in getattr value = getattr(module, name) File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1955, in getattr module = self._get_module(self._class_to_module[name]) File "/data/miniconda3/envs/vllm/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1969, in _get_module raise RuntimeError( RuntimeError: [address=0.0.0.0:17538, pid=50968] Failed to import transformers.models.qwen2.modeling_qwen2 because of the following error (look up to see its traceback): cannot import name 'log' from 'torch.distributed.elastic.agent.server.api' (/data/miniconda3/envs/vllm/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py) 2025-04-22 08:54:20,841 uvicorn.access 7525 INFO - "GET /v1/models/instances?model_name=GOT-OCR2_0&model_uid=GOT-OCR2 HTTP/1.1" 200