
BUG: Model not found in the model list, uid: {model_uid}

Open zhengr opened this issue 1 year ago • 9 comments

Describe the bug.

raise ValueError(f"Model not found in the model list, uid: {model_uid}")
ValueError: [address=127.0.0.1:18140, pid=771] Model not found in the model list, uid: 3dc22fb0-740c-11ee-be13-452b73d8be98

To Reproduce

The version of xinference you use: 0.8.5

Expected behavior

image

zhengr avatar Feb 15 '24 05:02 zhengr

2024-02-15 12:30:55,450 xinference.api.restful_api 772 ERROR [address=127.0.0.1:46891, pid=784] Model not found in the model list, uid: 3dc22fb0-740c-11ee-be13-452b73d8be98
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/xinference/api/restful_api.py", line 542, in describe_model
    data = await (await self._get_supervisor_ref()).describe_model(model_uid)
  File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/opt/conda/lib/python3.10/site-packages/xoscar/api.py", line 384, in on_receive
    return await super().on_receive(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in on_receive
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
    result = await result
  File "/opt/conda/lib/python3.10/site-packages/xinference/core/utils.py", line 45, in wrapped
    ret = await func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/xinference/core/supervisor.py", line 880, in describe_model
    raise ValueError(f"Model not found in the model list, uid: {model_uid}")
ValueError: [address=127.0.0.1:46891, pid=784] Model not found in the model list, uid: 3dc22fb0-740c-11ee-be13-452b73d8be98

zhengr avatar Feb 15 '24 12:02 zhengr
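When a log contains many tracebacks like the one above, it can help to tally which uids are actually being requested. The following is a minimal sketch, not part of xinference: the regular expression is an assumption based only on the ERROR lines shown in this thread, and `count_missing_uids` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the ERROR lines in this thread (an assumption, not an
# official xinference log format). Anchoring on "ERROR " skips the trailing
# "ValueError: ..." line, so each failed request is counted once.
NOT_FOUND = re.compile(r"ERROR .*?Model not found in the model list, uid: (\S+)")


def count_missing_uids(log_text: str) -> Counter:
    """Count how often each uid appears in 'Model not found' ERROR records."""
    return Counter(NOT_FOUND.findall(log_text))


sample = (
    "2024-02-15 12:30:55,450 xinference.api.restful_api 772 ERROR "
    "[address=127.0.0.1:46891, pid=784] Model not found in the model list, "
    "uid: 3dc22fb0-740c-11ee-be13-452b73d8be98\n"
    "ValueError: [address=127.0.0.1:46891, pid=784] Model not found in the "
    "model list, uid: 3dc22fb0-740c-11ee-be13-452b73d8be98"
)
counts = count_missing_uids(sample)
print(counts)
```

A uid that shows up repeatedly here but was never launched in the current session may point to a client that is still polling for a model from an earlier run.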

If I launch the model with xinference-local, why do I hit this kind of distributed-deployment code path?

@log_async(logger=logger)
async def describe_model(self, model_uid: str) -> Dict[str, Any]:
    replica_info = self._model_uid_to_replica_info.get(model_uid, None)
    if replica_info is None:
        raise ValueError(f"Model not found in the model list, uid: {model_uid}")
    # Use rep id 0 to instead of next(replica_info.scheduler) to avoid
    # consuming the generator.
    replica_model_uid = build_replica_model_uid(model_uid, replica_info.replica, 0)
    worker_ref = self._replica_model_uid_to_worker.get(replica_model_uid, None)
    if worker_ref is None:
        raise ValueError(
            f"Model not found in the model list, uid: {replica_model_uid}"
        )
    info = await worker_ref.describe_model(model_uid=replica_model_uid)
    info["replica"] = replica_info.replica
    return info

zhengr avatar Feb 15 '24 12:02 zhengr

from xinference.client import RESTfulClient
client = RESTfulClient("http://127.0.0.1:9997")
print(client.list_models())

Output and error log:

{'zephyr-7b-beta': {'model_type': 'LLM', 'address': '127.0.0.1:39873', 'accelerators': ['0', '1'], 'model_name': 'zephyr-7b-beta', 'model_lang': ['en'], 'model_ability': ['chat'], 'model_description': 'Zephyr-7B-β is the second model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1', 'model_format': 'pytorch', 'model_size_in_billions': 7, 'model_family': 'zephyr-7b-beta', 'quantization': '4-bit', 'model_hub': 'huggingface', 'revision': '3bac358730f8806e5c3dc7c7e19eb36e045bf720', 'context_length': 8192}}

2024-02-16 14:59:33,086 xinference.core.supervisor 6525 DEBUG Enter list_models, args: (<xinference.core.supervisor.SupervisorActor object at 0x7d25f736cb30>,), kwargs: {}
2024-02-16 14:59:33,087 xinference.core.worker 6525 DEBUG Enter list_models, args: (<xinference.core.worker.WorkerActor object at 0x7d24c5baa610>,), kwargs: {}
2024-02-16 14:59:33,087 xinference.core.worker 6525 DEBUG Leave list_models, elapsed time: 0 s
2024-02-16 14:59:33,087 xinference.core.supervisor 6525 DEBUG Leave list_models, elapsed time: 0 s
2024-02-16 14:59:47,890 xinference.core.supervisor 6525 DEBUG Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7d25f736cb30>, 'e1613eb0-9f2f-11ee-afd3-573c38e3e261'), kwargs: {}
2024-02-16 14:59:47,892 xinference.api.restful_api 6509 ERROR [address=127.0.0.1:54988, pid=6525] Model not found in the model list, uid: e1613eb0-9f2f-11ee-afd3-573c38e3e261
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/xinference/api/restful_api.py", line 542, in describe_model
    data = await (await self._get_supervisor_ref()).describe_model(model_uid)
  File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/opt/conda/lib/python3.10/site-packages/xoscar/api.py", line 384, in on_receive
    return await super().on_receive(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in on_receive
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
    result = await result
  File "/opt/conda/lib/python3.10/site-packages/xinference/core/utils.py", line 45, in wrapped
    ret = await func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/xinference/core/supervisor.py", line 880, in describe_model
    raise ValueError(f"Model not found in the model list, uid: {model_uid}")
ValueError: [address=127.0.0.1:54988, pid=6525] Model not found in the model list, uid: e1613eb0-9f2f-11ee-afd3-573c38e3e261

The same DEBUG/ERROR pair, with an identical traceback, repeats for both uids:

2024-02-16 14:59:48,127 xinference.core.supervisor 6525 DEBUG Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7d25f736cb30>, '3dc22fb0-740c-11ee-be13-452b73d8be98'), kwargs: {}
2024-02-16 14:59:48,130 xinference.api.restful_api 6509 ERROR [address=127.0.0.1:54988, pid=6525] Model not found in the model list, uid: 3dc22fb0-740c-11ee-be13-452b73d8be98
2024-02-16 15:00:14,962 xinference.core.supervisor 6525 DEBUG Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7d25f736cb30>, 'e1613eb0-9f2f-11ee-afd3-573c38e3e261'), kwargs: {}
2024-02-16 15:00:14,964 xinference.api.restful_api 6509 ERROR [address=127.0.0.1:54988, pid=6525] Model not found in the model list, uid: e1613eb0-9f2f-11ee-afd3-573c38e3e261
2024-02-16 15:00:15,224 xinference.core.supervisor 6525 DEBUG Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7d25f736cb30>, '3dc22fb0-740c-11ee-be13-452b73d8be98'), kwargs: {}
2024-02-16 15:00:15,226 xinference.api.restful_api 6509 ERROR [address=127.0.0.1:54988, pid=6525] Model not found in the model list, uid: 3dc22fb0-740c-11ee-be13-452b73d8be98
2024-02-16 15:00:55,887 xinference.core.supervisor 6525 DEBUG Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7d25f736cb30>, 'e1613eb0-9f2f-11ee-afd3-573c38e3e261'), kwargs: {}
2024-02-16 15:00:55,890 xinference.api.restful_api 6509 ERROR [address=127.0.0.1:54988, pid=6525] Model not found in the model list, uid: e1613eb0-9f2f-11ee-afd3-573c38e3e261
2024-02-16 15:00:56,130 xinference.core.supervisor 6525 DEBUG Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7d25f736cb30>, '3dc22fb0-740c-11ee-be13-452b73d8be98'), kwargs: {}
2024-02-16 15:00:56,132 xinference.api.restful_api 6509 ERROR [address=127.0.0.1:54988, pid=6525] Model not found in the model list, uid: 3dc22fb0-740c-11ee-be13-452b73d8be98

zhengr avatar Feb 16 '24 15:02 zhengr
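In the log above, list_models() returns only zephyr-7b-beta, while describe_model is called for two UUID-style uids that are not in that list. One possible reading is that some client is still requesting uids from an earlier session. The check can be sketched as below; `find_stale_uids` is a hypothetical helper, the running-model mapping is a trimmed copy of the output above, and the third requested uid is added purely for illustration.

```python
from typing import Dict, Iterable, List


def find_stale_uids(running: Dict[str, dict], requested: Iterable[str]) -> List[str]:
    """Return the requested uids that are absent from the running-model mapping."""
    return [uid for uid in requested if uid not in running]


# Trimmed copy of the list_models() output shown in the log above.
running_models = {
    "zephyr-7b-beta": {"model_type": "LLM", "model_name": "zephyr-7b-beta"},
}

# The two uids describe_model was called with in the log, plus the one
# running model (added here only to show a uid that passes the check).
requested_uids = [
    "e1613eb0-9f2f-11ee-afd3-573c38e3e261",
    "3dc22fb0-740c-11ee-be13-452b73d8be98",
    "zephyr-7b-beta",
]

stale = find_stale_uids(running_models, requested_uids)
print(stale)
```

Against a live server, the `running` argument could be the dictionary returned by `client.list_models()` from the snippet earlier in this comment.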

I encountered the same issue, and it causes the model to return no results. Waiting for a solution, thank you. The log output is below.

File "/usr/local/lib/python3.10/dist-packages/xinference/core/worker.py", line 654, in describe_model
    raise ValueError(f"Model not found in the model list, uid: {model_uid}")

kerin0364 avatar Mar 18 '24 02:03 kerin0364

Same issue, with the rerank model.

ghgggg avatar Apr 17 '24 09:04 ghgggg

Same issue, with the rerank model.

How did you solve this?

sammichenVV avatar Apr 28 '24 10:04 sammichenVV

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Aug 07 '24 19:08 github-actions[bot]

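Same bug here. I have already launched the embedding model, but when I call it through the OpenAI-compatible interface I occasionally get this error: "openai.BadRequestError: Error code: 400 - {'detail': '[address=0.0.0.0:45483, pid=5944] Model not found, uid: bce-embedding-base_v2-1-0'}". The embedding model ID I launched is bce-embedding-base_v2, which makes this confusing. I hope someone can suggest a solution.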

A23LZQ avatar Aug 09 '24 10:08 A23LZQ

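My xinference version is 0.12.3, deployed in distributed mode, and I see the same problem. After registering the qwen2-72b-awq model through the web UI and then launching it, I get "Model not found, uid: qwen2-72b-awq-1-0", but my model uid should be qwen2-72b-awq. The backend log shows no errors either. Any solution would be greatly appreciated.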

EnzoLiang avatar Aug 12 '24 01:08 EnzoLiang
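The "-1-0" suffix in the last two reports looks like a replica uid rather than the uid the user chose: the describe_model snippet quoted earlier in this thread calls build_replica_model_uid(model_uid, replica_info.replica, 0) before asking the worker. The sketch below assumes the format is "<model_uid>-<replica>-<rep_id>", which is inferred only from the uids in these error messages and not checked against the xinference source. Both function names are hypothetical.

```python
from typing import Tuple


def build_replica_uid(model_uid: str, replica: int, rep_id: int) -> str:
    # Assumed format "<model_uid>-<replica>-<rep_id>", inferred from uids such
    # as "qwen2-72b-awq-1-0" in this thread; not confirmed against xinference.
    return f"{model_uid}-{replica}-{rep_id}"


def split_replica_uid(replica_uid: str) -> Tuple[str, int, int]:
    # Split from the right so hyphens inside the model uid are preserved.
    model_uid, replica, rep_id = replica_uid.rsplit("-", 2)
    return model_uid, int(replica), int(rep_id)


print(build_replica_uid("qwen2-72b-awq", 1, 0))
print(split_replica_uid("bce-embedding-base_v2-1-0"))
```

If that assumption holds, an error naming "qwen2-72b-awq-1-0" would mean the supervisor knew the model uid but the lookup for its first replica failed, which is a different failure from the UUID-style uids reported at the top of the thread.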

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Aug 19 '24 19:08 github-actions[bot]

This issue was closed because it has been inactive for 5 days since being marked as stale.

github-actions[bot] avatar Aug 25 '24 19:08 github-actions[bot]