
Running the bge model as described in the documentation does not work

Open shell-nlp opened this issue 1 year ago • 2 comments

Code

import ray
from byzerllm.utils.client import ByzerLLM, LLMRequest, InferBackend

# Connect to the already running Ray cluster.
ray.init(address="auto", namespace="default", ignore_reinit_error=True)
llm = ByzerLLM()

# Two workers at 0.4 GPU each, using the Transformers backend.
llm.setup_gpus_per_worker(0.4).setup_num_workers(2).setup_infer_backend(InferBackend.Transformers)
llm.deploy(
    model_path="/home/dev/model/assets/embeddings/BAAI/bge-base-zh-v1.5/",
    pretrained_model_type="custom/bge",
    udf_name="emb",
    infer_params={}
)

Failed to look up actor with name 'emb'. This could because 1. You are trying to look up a named actor you didn't create. 2. The named actor died. 3. You did not use a namespace matching the namespace of the actor.

Traceback (most recent call last):
  File "/home/dev/liuyu/project/llm-os/tests/ray_learn/模型.py", line 7, in <module>
    llm.deploy(
  File "/home/dev/anaconda3/envs/vllm/lib/python3.10/site-packages/byzerllm/utils/client/__init__.py", line 681, in deploy
    UDFBuilder.build(self.ray_context,init_model,getattr(predict_module,predict_func))
  File "/home/dev/anaconda3/envs/vllm/lib/python3.10/site-packages/pyjava/udf/__init__.py", line 211, in build
    ray.get(temp_udf_master.create_workers.remote(conf))
  File "/home/dev/anaconda3/envs/vllm/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/home/dev/anaconda3/envs/vllm/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/home/dev/anaconda3/envs/vllm/lib/python3.10/site-packages/ray/_private/worker.py", line 2626, in get
    raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
    class_name: UDFMaster
    actor_id: b440901c458825c8c30d666f0a000000
    pid: 3395398
    name: emb
    namespace: default
    ip: 192.168.102.19
The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
The actor never ran - it was cancelled before it started running.
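To help track down the dead worker's own logs, here is a minimal sketch that pulls the `key: value` fields (pid, actor name, namespace, and so on) out of the `RayActorError` text. The helper name `parse_actor_error` and the field list are hypothetical and not part of Ray or byzerllm. The log location in the last comment is an assumption based on Ray's default session directory and may differ on this cluster.

```python
import re

# Fields that appear as `key: value` pairs in the RayActorError text.
# This list is an assumption taken from the error message in this issue.
FIELDS = ("class_name", "actor_id", "pid", "name", "namespace", "ip")


def parse_actor_error(message: str) -> dict:
    """Extract the `key: value` fields from a RayActorError message (hypothetical helper)."""
    info = {}
    for field in FIELDS:
        # \b keeps `name:` from matching inside `class_name:`.
        match = re.search(rf"\b{field}:\s*(\S+)", message)
        if match:
            info[field] = match.group(1)
    return info


sample = (
    "class_name: UDFMaster actor_id: b440901c458825c8c30d666f0a000000 "
    "pid: 3395398 name: emb namespace: default ip: 192.168.102.19 "
    "The actor is dead because its worker process has died."
)
info = parse_actor_error(sample)

# Assumed default Ray log directory; worker log file names usually contain the pid.
log_glob = f"/tmp/ray/session_latest/logs/*{info['pid']}*"
print(info["name"], info["namespace"], log_glob)
```

With the pid in hand, the matching files on the node at the reported `ip` are the place to look for the actual crash reason (OOM kill, segfault, or an import error in the worker).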

shell-nlp, Jan 03 '24 08:01