
BUG: load model failed

Open leechao1981 opened this issue 1 year ago • 13 comments

Describe the bug

Loading a model fails: the worker process crashes with `xoscar.errors.ServerClosed` while launching built-in GGML models (full tracebacks below).

To Reproduce

To help us to reproduce this bug, please provide information below:

  1. Your Python version: 3.8
  2. The version of Xinference you use: the newest release
  3. OS / hardware: macOS on an Apple M1

2023-11-21 13:33:01,542 xinference.model.llm.llm_family 43126 INFO Caching from Hugging Face: Xorbits/chatglm2-6B-GGML
2023-11-21 13:33:04,408 xinference.core.worker 43126 ERROR Failed to load model 6fc5d2a0-882f-11ee-828a-197d0dfb2669-1-0
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message) # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: Remote server unixsocket:///5652611072 closed
2023-11-21 13:33:04,413 xinference.api.restful_api 42623 ERROR [address=127.0.0.1:54717, pid=43126] Remote server unixsocket:///5652611072 closed
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/api/restful_api.py", line 489, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/api.py", line 306, in on_receive
    return await super().on_receive(message) # type: ignore
  File "xoscar/core.pyx", line 558, in on_receive
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 326, in launch_builtin_model
    await _launch_one_model(rep_model_uid)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 301, in _launch_one_model
    await worker_ref.launch_builtin_model(
  File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
    async with lock:
  File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/utils.py", line 27, in wrapped
    ret = await func(*args, **kwargs)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message) # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: [address=127.0.0.1:54717, pid=43126] Remote server unixsocket:///5652611072 closed
2023-11-21 13:33:04,417 uvicorn.access 42623 INFO 127.0.0.1:57062 - "POST /v1/models HTTP/1.1" 500
2023-11-21 13:33:28,973 xinference.model.llm.llm_family 43126 INFO Caching from Hugging Face: TheBloke/baichuan-llama-7B-GGML
2023-11-21 13:33:32,033 xinference.core.worker 43126 ERROR Failed to load model 8022f880-882f-11ee-828a-197d0dfb2669-1-0
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message) # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: Remote server unixsocket:///11305222144 closed
2023-11-21 13:33:32,037 xinference.api.restful_api 42623 ERROR [address=127.0.0.1:54717, pid=43126] Remote server unixsocket:///11305222144 closed
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/api/restful_api.py", line 489, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/api.py", line 306, in on_receive
    return await super().on_receive(message) # type: ignore
  File "xoscar/core.pyx", line 558, in on_receive
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 326, in launch_builtin_model
    await _launch_one_model(rep_model_uid)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 301, in _launch_one_model
    await worker_ref.launch_builtin_model(
  File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
    async with lock:
  File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/utils.py", line 27, in wrapped
    ret = await func(*args, **kwargs)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message) # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: [address=127.0.0.1:54717, pid=43126] Remote server unixsocket:///11305222144 closed

leechao1981 avatar Nov 21 '23 05:11 leechao1981

Could you start the service with `xinference --log-level=debug` and run it again? Please paste all the logs here so we can get more information.

aresnow1 avatar Nov 21 '23 05:11 aresnow1
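As a side note for anyone embedding Xinference in a Python script instead of using the CLI flag: the `xinference.*` loggers in the output below are ordinary standard-library `logging` loggers, so verbosity can be raised generically (this is a plain `logging` sketch, not an Xinference-specific API):

```python
import logging

# Attach a stderr handler and raise the whole "xinference" logger
# hierarchy to DEBUG; child loggers such as "xinference.core.worker"
# inherit the effective level from their nearest configured ancestor.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("xinference").setLevel(logging.DEBUG)
```

With this in place, DEBUG records from the worker and supervisor loggers are emitted just as they are with `--log-level=debug`.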

The console output:

2023-11-21 13:52:57,022 xinference.core.supervisor 50309 INFO Xinference supervisor 127.0.0.1:23432 started
2023-11-21 13:52:57,023 xinference.core.supervisor 50309 DEBUG Enter add_worker, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, '127.0.0.1:23432'), kwargs: {}
2023-11-21 13:52:57,023 xinference.core.supervisor 50309 DEBUG Worker 127.0.0.1:23432 has been added successfully
2023-11-21 13:52:57,023 xinference.core.supervisor 50309 DEBUG Leave add_worker, elapsed time: 0 s
2023-11-21 13:52:57,023 xinference.core.worker 50309 INFO Xinference worker 127.0.0.1:23432 started
2023-11-21 13:52:57,024 xinference.core.supervisor 50309 DEBUG Worker 127.0.0.1:23432 resources: {'cpu': ResourceStatus(available=0.0, total=8, memory_available=928112640, memory_total=17179869184)}
2023-11-21 13:52:58,066 xinference.core.supervisor 50309 DEBUG Enter get_status, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>,), kwargs: {}
2023-11-21 13:52:58,066 xinference.core.supervisor 50309 DEBUG Leave get_status, elapsed time: 0 s
2023-11-21 13:52:58,896 xinference.api.restful_api 50292 INFO Starting Xinference at endpoint: http://127.0.0.1:9997
2023-11-21 13:52:58,945 uvicorn.error 50292 INFO Uvicorn running on http://127.0.0.1:9997 (Press CTRL+C to quit)
2023-11-21 13:53:14,982 xinference.core.supervisor 50309 DEBUG Enter list_model_registrations, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM'), kwargs: {}
2023-11-21 13:53:14,983 xinference.core.supervisor 50309 DEBUG Leave list_model_registrations, elapsed time: 0 s
2023-11-21 13:53:15,395 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'baichuan'), kwargs: {}
2023-11-21 13:53:15,395 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,397 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'baichuan-2'), kwargs: {}
2023-11-21 13:53:15,398 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,406 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'baichuan-2-chat'), kwargs: {}
2023-11-21 13:53:15,406 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,408 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'baichuan-chat'), kwargs: {}
2023-11-21 13:53:15,408 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,410 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm'), kwargs: {}
2023-11-21 13:53:15,411 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,412 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm2'), kwargs: {}
2023-11-21 13:53:15,412 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,414 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm2-32k'), kwargs: {}
2023-11-21 13:53:15,414 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,416 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm3'), kwargs: {}
2023-11-21 13:53:15,417 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,420 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm3-32k'), kwargs: {}
2023-11-21 13:53:15,420 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,422 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'code-llama'), kwargs: {}
2023-11-21 13:53:15,423 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,425 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'code-llama-instruct'), kwargs: {}
2023-11-21 13:53:15,425 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,425 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'code-llama-python'), kwargs: {}
2023-11-21 13:53:15,425 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,427 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'falcon'), kwargs: {}
2023-11-21 13:53:15,427 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,429 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'falcon-instruct'), kwargs: {}
2023-11-21 13:53:15,430 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,432 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'glaive-coder'), kwargs: {}
2023-11-21 13:53:15,432 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,433 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'gpt-2'), kwargs: {}
2023-11-21 13:53:15,434 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,435 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'internlm-20b'), kwargs: {}
2023-11-21 13:53:15,435 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,437 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'internlm-7b'), kwargs: {}
2023-11-21 13:53:15,437 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,438 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'internlm-chat-20b'), kwargs: {}
2023-11-21 13:53:15,438 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,440 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'internlm-chat-7b'), kwargs: {}
2023-11-21 13:53:15,440 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,442 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'llama-2'), kwargs: {}
2023-11-21 13:53:15,442 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,444 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'llama-2-chat'), kwargs: {}
2023-11-21 13:53:15,444 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,445 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'mistral-instruct-v0.1'), kwargs: {}
2023-11-21 13:53:15,445 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,446 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'mistral-v0.1'), kwargs: {}
2023-11-21 13:53:15,447 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,447 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'OpenBuddy'), kwargs: {}
2023-11-21 13:53:15,447 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,447 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'opt'), kwargs: {}
2023-11-21 13:53:15,448 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,450 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'orca'), kwargs: {}
2023-11-21 13:53:15,450 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,451 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'qwen-chat'), kwargs: {}
2023-11-21 13:53:15,451 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,453 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'starchat-beta'), kwargs: {}
2023-11-21 13:53:15,453 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,454 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'starcoder'), kwargs: {}
2023-11-21 13:53:15,454 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,456 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'starcoderplus'), kwargs: {}
2023-11-21 13:53:15,456 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,456 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'tiny-llama'), kwargs: {}
2023-11-21 13:53:15,457 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,459 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'vicuna-v1.3'), kwargs: {}
2023-11-21 13:53:15,459 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,460 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'vicuna-v1.5'), kwargs: {}
2023-11-21 13:53:15,460 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,462 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'vicuna-v1.5-16k'), kwargs: {}
2023-11-21 13:53:15,463 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,463 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'wizardcoder-python-v1.0'), kwargs: {}
2023-11-21 13:53:15,464 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,465 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'wizardlm-v1.0'), kwargs: {}
2023-11-21 13:53:15,465 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,466 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'wizardmath-v1.0'), kwargs: {}
2023-11-21 13:53:15,466 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,467 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'zephyr-7b-alpha'), kwargs: {}
2023-11-21 13:53:15,467 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,469 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'zephyr-7b-beta'), kwargs: {}
2023-11-21 13:53:15,470 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:28,723 xinference.core.supervisor 50309 DEBUG Enter launch_builtin_model, model_uid: 4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf, model_name: chatglm2, model_size: 6, model_format: ggmlv3, quantization: q4_0, replica: 1
2023-11-21 13:53:28,724 xinference.core.worker 50309 DEBUG Enter get_model_count, args: (<xinference.core.worker.WorkerActor object at 0x7fc85927bc20>,), kwargs: {}
2023-11-21 13:53:28,724 xinference.core.worker 50309 DEBUG Leave get_model_count, elapsed time: 0 s
2023-11-21 13:53:28,724 xinference.core.worker 50309 DEBUG Enter launch_builtin_model, args: (<xinference.core.worker.WorkerActor object at 0x7fc85927bc20>,), kwargs: {'model_uid': '4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf-1-0', 'model_name': 'chatglm2', 'model_size_in_billions': 6, 'model_format': 'ggmlv3', 'quantization': 'q4_0', 'model_type': 'LLM', 'n_gpu': 'auto'}
2023-11-21 13:53:28,725 xinference.core.supervisor 50309 DEBUG Enter is_local_deployment, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>,), kwargs: {}
2023-11-21 13:53:28,725 xinference.core.supervisor 50309 DEBUG Leave is_local_deployment, elapsed time: 0 s
2023-11-21 13:53:28,736 xinference.model.llm.llm_family 50309 INFO Caching from Hugging Face: Xorbits/chatglm2-6B-GGML
2023-11-21 13:53:28,736 xinference.model.llm.core 50309 DEBUG Launching 4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf-1-0 with ChatglmCppChatModel
2023-11-21 13:53:31,318 xinference.core.worker 50309 ERROR Failed to load model 4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf-1-0
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message) # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: Remote server unixsocket:///6594101248 closed
2023-11-21 13:53:31,322 xinference.core.supervisor 50309 DEBUG Enter terminate_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, '4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf'), kwargs: {'suppress_exception': True}
2023-11-21 13:53:31,322 xinference.core.supervisor 50309 DEBUG Leave terminate_model, elapsed time: 0 s
2023-11-21 13:53:31,324 xinference.api.restful_api 50292 ERROR [address=127.0.0.1:23432, pid=50309] Remote server unixsocket:///6594101248 closed
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/api/restful_api.py", line 489, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/api.py", line 306, in on_receive
    return await super().on_receive(message) # type: ignore
  File "xoscar/core.pyx", line 558, in on_receive
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 326, in launch_builtin_model
    await _launch_one_model(rep_model_uid)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 301, in _launch_one_model
    await worker_ref.launch_builtin_model(
  File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
    async with lock:
  File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/utils.py", line 27, in wrapped
    ret = await func(*args, **kwargs)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message) # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: [address=127.0.0.1:23432, pid=50309] Remote server unixsocket:///6594101248 closed

Log file:

2023-11-21 13:52:55,043 asyncio 50292 DEBUG Using selector: KqueueSelector
2023-11-21 13:52:57,021 xoscar.metrics.api 50309 DEBUG Finished initialize the metrics of backend: console.
2023-11-21 13:52:57,022 xinference.core.supervisor 50309 INFO Xinference supervisor 127.0.0.1:23432 started
2023-11-21 13:52:57,023 xinference.core.supervisor 50309 DEBUG Enter add_worker, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, '127.0.0.1:23432'), kwargs: {}
2023-11-21 13:52:57,023 xinference.core.supervisor 50309 DEBUG Worker 127.0.0.1:23432 has been added successfully
2023-11-21 13:52:57,023 xinference.core.supervisor 50309 DEBUG Leave add_worker, elapsed time: 0 s
2023-11-21 13:52:57,023 xinference.core.worker 50309 INFO Xinference worker 127.0.0.1:23432 started
2023-11-21 13:52:57,024 xinference.core.supervisor 50309 DEBUG Worker 127.0.0.1:23432 resources: {'cpu': ResourceStatus(available=0.0, total=8, memory_available=928112640, memory_total=17179869184)}
2023-11-21 13:52:58,066 xinference.core.supervisor 50309 DEBUG Enter get_status, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>,), kwargs: {}
2023-11-21 13:52:58,066 xinference.core.supervisor 50309 DEBUG Leave get_status, elapsed time: 0 s
2023-11-21 13:52:58,209 matplotlib 50292 DEBUG matplotlib data path: /Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/matplotlib/mpl-data
2023-11-21 13:52:58,214 matplotlib 50292 DEBUG CONFIGDIR=/Users/lichao/.matplotlib
2023-11-21 13:52:58,214 matplotlib 50292 DEBUG interactive is False
2023-11-21 13:52:58,215 matplotlib 50292 DEBUG platform is darwin
2023-11-21 13:52:58,228 urllib3.connectionpool 50292 DEBUG Starting new HTTPS connection (1): api.gradio.app:443
2023-11-21 13:52:58,467 matplotlib 50292 DEBUG CACHEDIR=/Users/lichao/.matplotlib
2023-11-21 13:52:58,469 matplotlib.font_manager 50292 DEBUG Using fontManager instance from /Users/lichao/.matplotlib/fontlist-v330.json
2023-11-21 13:52:58,623 httpx 50292 DEBUG load_ssl_context verify=True cert=None trust_env=True http2=False
2023-11-21 13:52:58,624 httpx 50292 DEBUG load_verify_locations cafile='/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/certifi/cacert.pem'
2023-11-21 13:52:58,661 PIL.Image 50292 DEBUG Importing BlpImagePlugin
2023-11-21 13:52:58,662 PIL.Image 50292 DEBUG Importing BmpImagePlugin
2023-11-21 13:52:58,663 PIL.Image 50292 DEBUG Importing BufrStubImagePlugin
2023-11-21 13:52:58,663 PIL.Image 50292 DEBUG Importing CurImagePlugin
2023-11-21 13:52:58,663 PIL.Image 50292 DEBUG Importing DcxImagePlugin
2023-11-21 13:52:58,663 PIL.Image 50292 DEBUG Importing DdsImagePlugin
2023-11-21 13:52:58,664 PIL.Image 50292 DEBUG Importing EpsImagePlugin
2023-11-21 13:52:58,664 PIL.Image 50292 DEBUG Importing FitsImagePlugin
2023-11-21 13:52:58,665 PIL.Image 50292 DEBUG Importing FliImagePlugin
2023-11-21 13:52:58,665 PIL.Image 50292 DEBUG Importing FpxImagePlugin
2023-11-21 13:52:58,665 PIL.Image 50292 DEBUG Image: failed to import FpxImagePlugin: No module named 'olefile'
2023-11-21 13:52:58,665 PIL.Image 50292 DEBUG Importing FtexImagePlugin
2023-11-21 13:52:58,665 PIL.Image 50292 DEBUG Importing GbrImagePlugin
2023-11-21 13:52:58,666 PIL.Image 50292 DEBUG Importing GifImagePlugin
2023-11-21 13:52:58,666 PIL.Image 50292 DEBUG Importing GribStubImagePlugin
2023-11-21 13:52:58,667 PIL.Image 50292 DEBUG Importing Hdf5StubImagePlugin
2023-11-21 13:52:58,667 PIL.Image 50292 DEBUG Importing IcnsImagePlugin
2023-11-21 13:52:58,668 PIL.Image 50292 DEBUG Importing IcoImagePlugin
2023-11-21 13:52:58,668 PIL.Image 50292 DEBUG Importing ImImagePlugin
2023-11-21 13:52:58,669 PIL.Image 50292 DEBUG Importing ImtImagePlugin
2023-11-21 13:52:58,669 PIL.Image 50292 DEBUG Importing IptcImagePlugin
2023-11-21 13:52:58,669 PIL.Image 50292 DEBUG Importing JpegImagePlugin
2023-11-21 13:52:58,670 PIL.Image 50292 DEBUG Importing Jpeg2KImagePlugin
2023-11-21 13:52:58,670 PIL.Image 50292 DEBUG Importing McIdasImagePlugin
2023-11-21 13:52:58,670 PIL.Image 50292 DEBUG Importing MicImagePlugin
2023-11-21 13:52:58,671 PIL.Image 50292 DEBUG Image: failed to import MicImagePlugin: No module named 'olefile'
2023-11-21 13:52:58,671 PIL.Image 50292 DEBUG Importing MpegImagePlugin
2023-11-21 13:52:58,671 PIL.Image 50292 DEBUG Importing MpoImagePlugin
2023-11-21 13:52:58,673 PIL.Image 50292 DEBUG Importing MspImagePlugin
2023-11-21 13:52:58,673 PIL.Image 50292 DEBUG Importing PalmImagePlugin
2023-11-21 13:52:58,674 PIL.Image 50292 DEBUG Importing PcdImagePlugin
2023-11-21 13:52:58,674 PIL.Image 50292 DEBUG Importing PcxImagePlugin
2023-11-21 13:52:58,674 PIL.Image 50292 DEBUG Importing PdfImagePlugin
2023-11-21 13:52:58,678 PIL.Image 50292 DEBUG Importing PixarImagePlugin
2023-11-21 13:52:58,678 PIL.Image 50292 DEBUG Importing PngImagePlugin
2023-11-21 13:52:58,678 PIL.Image 50292 DEBUG Importing PpmImagePlugin
2023-11-21 13:52:58,679 PIL.Image 50292 DEBUG Importing PsdImagePlugin
2023-11-21 13:52:58,679 PIL.Image 50292 DEBUG Importing QoiImagePlugin
2023-11-21 13:52:58,679 PIL.Image 50292 DEBUG Importing SgiImagePlugin
2023-11-21 13:52:58,679 PIL.Image 50292 DEBUG Importing SpiderImagePlugin
2023-11-21 13:52:58,680 PIL.Image 50292 DEBUG Importing SunImagePlugin
2023-11-21 13:52:58,680 PIL.Image 50292 DEBUG Importing TgaImagePlugin
2023-11-21 13:52:58,680 PIL.Image 50292 DEBUG Importing TiffImagePlugin
2023-11-21 13:52:58,680 PIL.Image 50292 DEBUG Importing WebPImagePlugin
2023-11-21 13:52:58,689 PIL.Image 50292 DEBUG Importing WmfImagePlugin
2023-11-21 13:52:58,689 PIL.Image 50292 DEBUG Importing XbmImagePlugin
2023-11-21 13:52:58,690 PIL.Image 50292 DEBUG Importing XpmImagePlugin
2023-11-21 13:52:58,690 PIL.Image 50292 DEBUG Importing XVThumbImagePlugin
2023-11-21 13:52:58,896 xinference.api.restful_api 50292 INFO Starting Xinference at endpoint: http://127.0.0.1:9997
2023-11-21 13:52:58,943 uvicorn.error 50292 INFO Started server process [50292]
2023-11-21 13:52:58,943 uvicorn.error 50292 INFO Waiting for application startup.
2023-11-21 13:52:58,944 uvicorn.error 50292 INFO Application startup complete.
2023-11-21 13:52:58,945 uvicorn.error 50292 INFO Uvicorn running on http://127.0.0.1:9997 (Press CTRL+C to quit)
2023-11-21 13:52:59,066 urllib3.connectionpool 50292 DEBUG https://api.gradio.app:443 "GET /gradio-messaging/en HTTP/1.1" 200 3
2023-11-21 13:53:14,795 uvicorn.access 50292 INFO 127.0.0.1:58401 - "GET / HTTP/1.1" 304
2023-11-21 13:53:14,982 xinference.core.supervisor 50309 DEBUG Enter list_model_registrations, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM'), kwargs: {}
2023-11-21 13:53:14,983 xinference.core.supervisor 50309 DEBUG Leave list_model_registrations, elapsed time: 0 s
2023-11-21 13:53:14,991 uvicorn.access 50292 INFO 127.0.0.1:58401 - "GET /v1/model_registrations/LLM HTTP/1.1" 200
2023-11-21 13:53:15,395 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'baichuan'), kwargs: {}
2023-11-21 13:53:15,395 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,397 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'baichuan-2'), kwargs: {}
2023-11-21 13:53:15,398 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,400 uvicorn.access 50292 INFO 127.0.0.1:58401 - "GET /v1/model_registrations/LLM/baichuan HTTP/1.1" 200
2023-11-21 13:53:15,406 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'baichuan-2-chat'), kwargs: {}
2023-11-21 13:53:15,406 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s
2023-11-21 13:53:15,408
uvicorn.access 50292 INFO 127.0.0.1:58402 - "GET /v1/model_registrations/LLM/baichuan-2 HTTP/1.1" 200 2023-11-21 13:53:15,408 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'baichuan-chat'), kwargs: {} 2023-11-21 13:53:15,408 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,410 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm'), kwargs: {} 2023-11-21 13:53:15,411 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,412 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm2'), kwargs: {} 2023-11-21 13:53:15,412 uvicorn.access 50292 INFO 127.0.0.1:58405 - "GET /v1/model_registrations/LLM/baichuan-2-chat HTTP/1.1" 200 2023-11-21 13:53:15,412 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,414 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm2-32k'), kwargs: {} 2023-11-21 13:53:15,414 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,414 uvicorn.access 50292 INFO 127.0.0.1:58406 - "GET /v1/model_registrations/LLM/baichuan-chat HTTP/1.1" 200 2023-11-21 13:53:15,416 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm3'), kwargs: {} 2023-11-21 13:53:15,417 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,417 uvicorn.access 50292 INFO 
127.0.0.1:58407 - "GET /v1/model_registrations/LLM/chatglm HTTP/1.1" 200 2023-11-21 13:53:15,420 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'chatglm3-32k'), kwargs: {} 2023-11-21 13:53:15,420 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,421 uvicorn.access 50292 INFO 127.0.0.1:58408 - "GET /v1/model_registrations/LLM/chatglm2 HTTP/1.1" 200 2023-11-21 13:53:15,422 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'code-llama'), kwargs: {} 2023-11-21 13:53:15,423 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,423 uvicorn.access 50292 INFO 127.0.0.1:58401 - "GET /v1/model_registrations/LLM/chatglm2-32k HTTP/1.1" 200 2023-11-21 13:53:15,425 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'code-llama-instruct'), kwargs: {} 2023-11-21 13:53:15,425 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,425 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'code-llama-python'), kwargs: {} 2023-11-21 13:53:15,425 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,426 uvicorn.access 50292 INFO 127.0.0.1:58402 - "GET /v1/model_registrations/LLM/chatglm3 HTTP/1.1" 200 2023-11-21 13:53:15,427 uvicorn.access 50292 INFO 127.0.0.1:58405 - "GET /v1/model_registrations/LLM/chatglm3-32k HTTP/1.1" 200 2023-11-21 13:53:15,427 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor 
object at 0x7fc85927ba40>, 'LLM', 'falcon'), kwargs: {} 2023-11-21 13:53:15,427 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,428 uvicorn.access 50292 INFO 127.0.0.1:58406 - "GET /v1/model_registrations/LLM/code-llama HTTP/1.1" 200 2023-11-21 13:53:15,429 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'falcon-instruct'), kwargs: {} 2023-11-21 13:53:15,430 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,430 uvicorn.access 50292 INFO 127.0.0.1:58407 - "GET /v1/model_registrations/LLM/code-llama-instruct HTTP/1.1" 200 2023-11-21 13:53:15,432 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'glaive-coder'), kwargs: {} 2023-11-21 13:53:15,432 uvicorn.access 50292 INFO 127.0.0.1:58408 - "GET /v1/model_registrations/LLM/code-llama-python HTTP/1.1" 200 2023-11-21 13:53:15,432 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,433 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'gpt-2'), kwargs: {} 2023-11-21 13:53:15,434 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,434 uvicorn.access 50292 INFO 127.0.0.1:58401 - "GET /v1/model_registrations/LLM/falcon HTTP/1.1" 200 2023-11-21 13:53:15,435 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'internlm-20b'), kwargs: {} 2023-11-21 13:53:15,435 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,436 uvicorn.access 50292 INFO 
127.0.0.1:58402 - "GET /v1/model_registrations/LLM/falcon-instruct HTTP/1.1" 200 2023-11-21 13:53:15,437 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'internlm-7b'), kwargs: {} 2023-11-21 13:53:15,437 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,437 uvicorn.access 50292 INFO 127.0.0.1:58405 - "GET /v1/model_registrations/LLM/glaive-coder HTTP/1.1" 200 2023-11-21 13:53:15,438 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'internlm-chat-20b'), kwargs: {} 2023-11-21 13:53:15,438 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,439 uvicorn.access 50292 INFO 127.0.0.1:58406 - "GET /v1/model_registrations/LLM/gpt-2 HTTP/1.1" 200 2023-11-21 13:53:15,440 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'internlm-chat-7b'), kwargs: {} 2023-11-21 13:53:15,440 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,441 uvicorn.access 50292 INFO 127.0.0.1:58407 - "GET /v1/model_registrations/LLM/internlm-20b HTTP/1.1" 200 2023-11-21 13:53:15,442 uvicorn.access 50292 INFO 127.0.0.1:58408 - "GET /v1/model_registrations/LLM/internlm-7b HTTP/1.1" 200 2023-11-21 13:53:15,442 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'llama-2'), kwargs: {} 2023-11-21 13:53:15,442 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,442 uvicorn.access 50292 INFO 127.0.0.1:58401 - "GET /v1/model_registrations/LLM/internlm-chat-20b HTTP/1.1" 200 2023-11-21 
13:53:15,443 uvicorn.access 50292 INFO 127.0.0.1:58402 - "GET /v1/model_registrations/LLM/internlm-chat-7b HTTP/1.1" 200 2023-11-21 13:53:15,444 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'llama-2-chat'), kwargs: {} 2023-11-21 13:53:15,444 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,445 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'mistral-instruct-v0.1'), kwargs: {} 2023-11-21 13:53:15,445 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,446 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'mistral-v0.1'), kwargs: {} 2023-11-21 13:53:15,447 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,447 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'OpenBuddy'), kwargs: {} 2023-11-21 13:53:15,447 uvicorn.access 50292 INFO 127.0.0.1:58405 - "GET /v1/model_registrations/LLM/llama-2 HTTP/1.1" 200 2023-11-21 13:53:15,447 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,447 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'opt'), kwargs: {} 2023-11-21 13:53:15,448 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,448 uvicorn.access 50292 INFO 127.0.0.1:58406 - "GET /v1/model_registrations/LLM/llama-2-chat HTTP/1.1" 200 2023-11-21 13:53:15,450 
xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'orca'), kwargs: {} 2023-11-21 13:53:15,450 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,450 uvicorn.access 50292 INFO 127.0.0.1:58407 - "GET /v1/model_registrations/LLM/mistral-instruct-v0.1 HTTP/1.1" 200 2023-11-21 13:53:15,451 uvicorn.access 50292 INFO 127.0.0.1:58408 - "GET /v1/model_registrations/LLM/mistral-v0.1 HTTP/1.1" 200 2023-11-21 13:53:15,451 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'qwen-chat'), kwargs: {} 2023-11-21 13:53:15,451 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,452 uvicorn.access 50292 INFO 127.0.0.1:58401 - "GET /v1/model_registrations/LLM/OpenBuddy HTTP/1.1" 200 2023-11-21 13:53:15,453 uvicorn.access 50292 INFO 127.0.0.1:58402 - "GET /v1/model_registrations/LLM/opt HTTP/1.1" 200 2023-11-21 13:53:15,453 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'starchat-beta'), kwargs: {} 2023-11-21 13:53:15,453 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,454 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'starcoder'), kwargs: {} 2023-11-21 13:53:15,454 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,454 uvicorn.access 50292 INFO 127.0.0.1:58405 - "GET /v1/model_registrations/LLM/orca HTTP/1.1" 200 2023-11-21 13:53:15,456 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: 
(<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'starcoderplus'), kwargs: {} 2023-11-21 13:53:15,456 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,456 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'tiny-llama'), kwargs: {} 2023-11-21 13:53:15,457 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,457 uvicorn.access 50292 INFO 127.0.0.1:58406 - "GET /v1/model_registrations/LLM/qwen-chat HTTP/1.1" 200 2023-11-21 13:53:15,459 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'vicuna-v1.3'), kwargs: {} 2023-11-21 13:53:15,459 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,459 uvicorn.access 50292 INFO 127.0.0.1:58407 - "GET /v1/model_registrations/LLM/starchat-beta HTTP/1.1" 200 2023-11-21 13:53:15,460 uvicorn.access 50292 INFO 127.0.0.1:58408 - "GET /v1/model_registrations/LLM/starcoder HTTP/1.1" 200 2023-11-21 13:53:15,460 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'vicuna-v1.5'), kwargs: {} 2023-11-21 13:53:15,460 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,461 uvicorn.access 50292 INFO 127.0.0.1:58401 - "GET /v1/model_registrations/LLM/starcoderplus HTTP/1.1" 200 2023-11-21 13:53:15,462 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'vicuna-v1.5-16k'), kwargs: {} 2023-11-21 13:53:15,462 uvicorn.access 50292 INFO 127.0.0.1:58402 - "GET /v1/model_registrations/LLM/tiny-llama 
HTTP/1.1" 200 2023-11-21 13:53:15,463 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,463 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'wizardcoder-python-v1.0'), kwargs: {} 2023-11-21 13:53:15,464 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,464 uvicorn.access 50292 INFO 127.0.0.1:58405 - "GET /v1/model_registrations/LLM/vicuna-v1.3 HTTP/1.1" 200 2023-11-21 13:53:15,465 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'wizardlm-v1.0'), kwargs: {} 2023-11-21 13:53:15,465 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,466 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'wizardmath-v1.0'), kwargs: {} 2023-11-21 13:53:15,466 uvicorn.access 50292 INFO 127.0.0.1:58406 - "GET /v1/model_registrations/LLM/vicuna-v1.5 HTTP/1.1" 200 2023-11-21 13:53:15,466 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,467 uvicorn.access 50292 INFO 127.0.0.1:58407 - "GET /v1/model_registrations/LLM/vicuna-v1.5-16k HTTP/1.1" 200 2023-11-21 13:53:15,467 xinference.core.supervisor 50309 DEBUG Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'zephyr-7b-alpha'), kwargs: {} 2023-11-21 13:53:15,467 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,468 uvicorn.access 50292 INFO 127.0.0.1:58408 - "GET /v1/model_registrations/LLM/wizardcoder-python-v1.0 HTTP/1.1" 200 2023-11-21 13:53:15,469 xinference.core.supervisor 50309 DEBUG 
Enter get_model_registration, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, 'LLM', 'zephyr-7b-beta'), kwargs: {} 2023-11-21 13:53:15,470 xinference.core.supervisor 50309 DEBUG Leave get_model_registration, elapsed time: 0 s 2023-11-21 13:53:15,470 uvicorn.access 50292 INFO 127.0.0.1:58401 - "GET /v1/model_registrations/LLM/wizardlm-v1.0 HTTP/1.1" 200 2023-11-21 13:53:15,471 uvicorn.access 50292 INFO 127.0.0.1:58402 - "GET /v1/model_registrations/LLM/wizardmath-v1.0 HTTP/1.1" 200 2023-11-21 13:53:15,471 uvicorn.access 50292 INFO 127.0.0.1:58405 - "GET /v1/model_registrations/LLM/zephyr-7b-alpha HTTP/1.1" 200 2023-11-21 13:53:15,472 uvicorn.access 50292 INFO 127.0.0.1:58406 - "GET /v1/model_registrations/LLM/zephyr-7b-beta HTTP/1.1" 200 2023-11-21 13:53:15,474 uvicorn.access 50292 INFO 127.0.0.1:58409 - "GET /manifest.json HTTP/1.1" 404 2023-11-21 13:53:28,723 xinference.core.supervisor 50309 DEBUG Enter launch_builtin_model, model_uid: 4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf, model_name: chatglm2, model_size: 6, model_format: ggmlv3, quantization: q4_0, replica: 1 2023-11-21 13:53:28,724 xinference.core.worker 50309 DEBUG Enter get_model_count, args: (<xinference.core.worker.WorkerActor object at 0x7fc85927bc20>,), kwargs: {} 2023-11-21 13:53:28,724 xinference.core.worker 50309 DEBUG Leave get_model_count, elapsed time: 0 s 2023-11-21 13:53:28,724 xinference.core.worker 50309 DEBUG Enter launch_builtin_model, args: (<xinference.core.worker.WorkerActor object at 0x7fc85927bc20>,), kwargs: {'model_uid': '4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf-1-0', 'model_name': 'chatglm2', 'model_size_in_billions': 6, 'model_format': 'ggmlv3', 'quantization': 'q4_0', 'model_type': 'LLM', 'n_gpu': 'auto'} 2023-11-21 13:53:28,725 xinference.core.supervisor 50309 DEBUG Enter is_local_deployment, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>,), kwargs: {} 2023-11-21 13:53:28,725 xinference.core.supervisor 50309 DEBUG Leave 
is_local_deployment, elapsed time: 0 s 2023-11-21 13:53:28,736 xinference.model.llm.llm_family 50309 INFO Caching from Hugging Face: Xorbits/chatglm2-6B-GGML 2023-11-21 13:53:28,736 xinference.model.llm.core 50309 DEBUG Launching 4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf-1-0 with ChatglmCppChatModel 2023-11-21 13:53:31,172 asyncio 50479 DEBUG Using selector: KqueueSelector 2023-11-21 13:53:31,271 xoscar.backends.pool 50479 DEBUG External address of process index 6594101248 updated to 127.0.0.1:58416 2023-11-21 13:53:31,271 xoscar.metrics.api 50479 DEBUG Finished initialize the metrics of backend: console. 2023-11-21 13:53:31,318 xinference.core.worker 50309 ERROR Failed to load model 4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf-1-0 Traceback (most recent call last): File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model await model_ref.load() File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send result = await self._wait(future, actor_ref.address, send_message) # type: ignore File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait return await future File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait await asyncio.shield(future) File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen raise ServerClosed( xoscar.errors.ServerClosed: Remote server unixsocket:///6594101248 closed 2023-11-21 13:53:31,322 xinference.core.supervisor 50309 DEBUG Enter terminate_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fc85927ba40>, '4b3e4ae0-8832-11ee-aa53-67b2a7dbd4bf'), kwargs: {'suppress_exception': True} 2023-11-21 13:53:31,322 xinference.core.supervisor 50309 DEBUG Leave terminate_model, elapsed time: 0 s 2023-11-21 
13:53:31,324 xinference.api.restful_api 50292 ERROR [address=127.0.0.1:23432, pid=50309] Remote server unixsocket:///6594101248 closed Traceback (most recent call last): File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/api/restful_api.py", line 489, in launch_model model_uid = await (await self._get_supervisor_ref()).launch_builtin_model( File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 227, in send return self._process_result_message(result) File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 102, in _process_result_message raise message.as_instanceof_cause() File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 657, in send result = await self._run_coro(message.message_id, coro) File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 368, in _run_coro return await coro File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/api.py", line 306, in on_receive return await super().on_receive(message) # type: ignore File "xoscar/core.pyx", line 558, in on_receive raise ex File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive async with self._lock: File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive with debug_async_timeout('actor_lock_timeout', File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive result = await result File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 326, in launch_builtin_model await _launch_one_model(rep_model_uid) File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 301, in _launch_one_model await worker_ref.launch_builtin_model( File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper async with 
lock: File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper result = await result File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/utils.py", line 27, in wrapped ret = await func(*args, **kwargs) File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model await model_ref.load() File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send result = await self._wait(future, actor_ref.address, send_message) # type: ignore File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait return await future File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait await asyncio.shield(future) File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen raise ServerClosed( xoscar.errors.ServerClosed: [address=127.0.0.1:23432, pid=50309] Remote server unixsocket:///6594101248 closed 2023-11-21 13:53:31,328 uvicorn.access 50292 INFO 127.0.0.1:58412 - "POST /v1/models HTTP/1.1" 500

leechao1981 avatar Nov 21 '23 05:11 leechao1981

Run `pip show xinference` and `pip show chatglm_cpp` to check those packages' versions.
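Versions can also be read programmatically with `importlib.metadata`, which is in the standard library from Python 3.8 onward. A small sketch (`installed_version` is a hypothetical helper, not part of xinference):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(dist_name: str) -> str:
    """Return the installed version of a distribution, or 'not installed'."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return "not installed"

# On the reporter's machine these report 0.6.1 and 0.2.10 respectively.
print(installed_version("xinference"))
print(installed_version("chatglm-cpp"))
```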

aresnow1 avatar Nov 21 '23 06:11 aresnow1

```
Name: xinference
Version: 0.6.1
Summary: Model Serving Made Easy
Home-page: https://github.com/xorbitsai/inference
Author: Qin Xuye
Author-email: [email protected]
License: Apache License 2.0
Location: /Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages
Requires: click, fastapi, fsspec, gradio, huggingface-hub, modelscope, pydantic, requests, s3fs, tabulate, tqdm, typing-extensions, uvicorn, xorbits, xoscar
```

```
Name: chatglm-cpp
Version: 0.2.10
Summary: C++ implementation of ChatGLM-6B & ChatGLM2-6B
Home-page:
Author:
Author-email: Jiahao Li [email protected]
License: MIT License
Location: /Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages
```

leechao1981 avatar Nov 21 '23 06:11 leechao1981

How did you install chatglm_cpp? It looks like it is serving the model from CPU memory, which leads to an OOM. If you want Metal acceleration, uninstall it first and reinstall with `CMAKE_ARGS="-DGGML_METAL=ON" pip install -U chatglm-cpp` (see https://github.com/li-plus/chatglm.cpp#python-binding) to enable Metal.
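As a concrete sketch of the steps above (the `-y` and `--no-cache-dir` flags are optional additions, not from the original reply; `--no-cache-dir` just avoids reusing a previously built CPU-only wheel):

```shell
# Remove the existing CPU-only build.
pip uninstall -y chatglm-cpp

# Rebuild from source with the ggml Metal backend enabled (Apple Silicon).
CMAKE_ARGS="-DGGML_METAL=ON" pip install -U --no-cache-dir chatglm-cpp
```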

aresnow1 avatar Nov 21 '23 06:11 aresnow1

how to uninstall chatglm_cpp?

leechao1981 avatar Nov 21 '23 06:11 leechao1981

`pip uninstall chatglm_cpp`

aresnow1 avatar Nov 21 '23 06:11 aresnow1

I reinstalled chatglm_cpp, but it is still the same:

```
2023-11-21 17:11:27,384 xinference.model.llm.llm_family 8847 INFO Caching from Hugging Face: Xorbits/chatglm2-6B-GGML
2023-11-21 17:11:27,385 xinference.model.llm.core 8847 DEBUG Launching f3773800-884d-11ee-a2e5-6145fbabc579-1-0 with ChatglmCppChatModel
2023-11-21 17:11:30,979 xinference.core.worker 8847 ERROR Failed to load model f3773800-884d-11ee-a2e5-6145fbabc579-1-0
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: Remote server unixsocket:///1159593984 closed
2023-11-21 17:11:30,982 xinference.core.supervisor 8847 DEBUG Enter terminate_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f8e40e130e0>, 'f3773800-884d-11ee-a2e5-6145fbabc579'), kwargs: {'suppress_exception': True}
2023-11-21 17:11:30,982 xinference.core.supervisor 8847 DEBUG Leave terminate_model, elapsed time: 0 s
2023-11-21 17:11:30,984 xinference.api.restful_api 8815 ERROR [address=127.0.0.1:16428, pid=8847] Remote server unixsocket:///1159593984 closed
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/api/restful_api.py", line 489, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/api.py", line 306, in on_receive
    return await super().on_receive(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in on_receive
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 326, in launch_builtin_model
    await _launch_one_model(rep_model_uid)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 301, in _launch_one_model
    await worker_ref.launch_builtin_model(
  File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
    async with lock:
  File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/utils.py", line 27, in wrapped
    ret = await func(*args, **kwargs)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: [address=127.0.0.1:16428, pid=8847] Remote server unixsocket:///1159593984 closed
```
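`ServerClosed` only tells us that the subprocess loading the model died; the actual cause is not in the log. One cheap thing to rule out first is a truncated or missing download. A small helper could verify the cached GGML file before launching (this is a hypothetical check, not part of xinference, and the cache path in the comment is an assumption based on the log above):

```python
import os


def check_model_file(path: str) -> str:
    """Expand ~ and verify a cached model file exists and is not obviously truncated."""
    full = os.path.expanduser(path)
    if not os.path.isfile(full):
        raise FileNotFoundError(full)
    size_mb = os.path.getsize(full) / 1e6
    if size_mb < 1:
        raise ValueError(f"{full} looks truncated ({size_mb:.1f} MB)")
    return f"{full}: {size_mb:.0f} MB"


# Example call (adjust the path to wherever xinference cached the model):
# print(check_model_file("~/.xinference/cache/chatglm2-ggmlv3-6b/chatglm2-ggml-q4_0.bin"))
```

If the file is missing or only a few megabytes, deleting the cache directory and re-downloading is worth trying before debugging further.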

leechao1981 avatar Nov 21 '23 09:11 leechao1981

Have you tried chatglm3?

aresnow1 avatar Nov 21 '23 12:11 aresnow1

Hmm, same thing with chatglm3:

```
2023-11-22 09:53:50,097 xinference.core.worker 17816 ERROR Failed to load model 5de77f30-88d8-11ee-b7f9-81d878488341-1-0
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: Remote server unixsocket:///2335178752 closed
2023-11-22 09:53:50,105 xinference.api.restful_api 17806 ERROR [address=127.0.0.1:30251, pid=17816] Remote server unixsocket:///2335178752 closed
Traceback (most recent call last):
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/api/restful_api.py", line 489, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/api.py", line 306, in on_receive
    return await super().on_receive(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in on_receive
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 326, in launch_builtin_model
    await _launch_one_model(rep_model_uid)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/supervisor.py", line 301, in _launch_one_model
    await worker_ref.launch_builtin_model(
  File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
    async with lock:
  File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
    result = await result
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/utils.py", line 27, in wrapped
    ret = await func(*args, **kwargs)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xinference/core/worker.py", line 251, in launch_builtin_model
    await model_ref.load()
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 226, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/context.py", line 106, in _wait
    await asyncio.shield(future)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: [address=127.0.0.1:30251, pid=17816] Remote server unixsocket:///2335178752 closed
```

leechao1981 avatar Nov 22 '23 01:11 leechao1981

```python
import chatglm_cpp

pipeline = chatglm_cpp.Pipeline("~/.xinference/cache/chatglm3-ggmlv3-6b/chatglm3-ggml-q4_0.bin")
pipeline.chat(["你好"])
```

Does running this directly in Python raise an error?

aresnow1 avatar Nov 22 '23 07:11 aresnow1

```
>>> import chatglm_cpp
>>> pipeline = chatglm_cpp.Pipeline("~/.xinference/cache/chatglm3-ggmlv3-6b/chatglm3-ggml-q4_0.bin")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/chatglm_cpp/__init__.py", line 24, in __init__
    convert(f, model_path, dtype=dtype)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/chatglm_cpp/convert.py", line 469, in convert
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 718, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 550, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/transformers/utils/hub.py", line 430, in cached_file
    resolved_file = hf_hub_download(
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
    validate_repo_id(arg_value)
  File "/Users/lichao/opt/anaconda3/envs/AICHAT/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '~/.xinference/cache/chatglm3-ggmlv3-6b/chatglm3-ggml-q4_0.bin'. Use repo_type argument if needed.
>>> pipeline.chat(["你好"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'pipeline' is not defined
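This traceback actually narrows things down: `chatglm_cpp` receives the literal string `~/...`, finds no file with that name (Python does not expand `~` in paths), and falls back to treating the argument as a Hugging Face repo id, which then fails `validate_repo_id`. Expanding the tilde first should at least get past the `HFValidationError` (a sketch, assuming the GGML file really exists at that location):

```python
import os.path

# open() and the C++ loader see "~" as a literal character; expand it explicitly.
model_file = os.path.expanduser(
    "~/.xinference/cache/chatglm3-ggmlv3-6b/chatglm3-ggml-q4_0.bin"
)
# The expanded path no longer begins with "~", so chatglm_cpp.Pipeline(model_file)
# would be handed a real absolute path instead of a string it mistakes for a repo id.
print(model_file)
```

The second `NameError` is just fallout: because the `Pipeline(...)` call raised, `pipeline` was never assigned.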

leechao1981 avatar Nov 22 '23 10:11 leechao1981

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Aug 08 '24 19:08 github-actions[bot]

This issue was closed because it has been inactive for 5 days since being marked as stale.

github-actions[bot] avatar Aug 13 '24 19:08 github-actions[bot]