BUG: deploying the bge rerank model with xinference frequently produces service errors
Describe the bug
I deployed the bge-reranker-v2-m3 model with xinference 0.10.2.post1, but the following error occurs frequently, and the service often becomes unresponsive and needs a restart to recover:
xinference.api.restful_api 22480 ERROR [address=0.0.0.0:44871, pid=22579] 'float' object is not subscriptable
Traceback (most recent call last):
File "/app/anaconda3/lib/python3.11/site-packages/xinference/api/restful_api.py", line 1067, in rerank
scores = await model.rerank(
^^^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
return self._process_result_message(result)
...
File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 79, in wrapped_func
ret = await fn(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 418, in rerank
return await self._call_wrapper(
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 103, in _async_wrapper
return await fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 333, in _call_wrapper
ret = await asyncio.to_thread(fn, *args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/asyncio/threads.py", line 25, in to_thread
return await loop.run_in_executor(None, func_call)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/model/rerank/core.py", line 168, in rerank
docs = [
File "/app/anaconda3/lib/python3.11/site-packages/xinference/model/rerank/core.py", line 171, in
Please paste the full log. Also, try 0.10.3 and see whether the problem persists.
xinference.api.restful_api 22480 ERROR [address=0.0.0.0:44871, pid=22579] 'float' object is not subscriptable
Traceback (most recent call last):
File "/app/anaconda3/lib/python3.11/site-packages/xinference/api/restful_api.py", line 1067, in rerank
scores = await model.rerank(
^^^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
return self._process_result_message(result)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
raise message.as_instanceof_cause()
File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 659, in send
result = await self._run_coro(message.message_id, coro)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
return await coro
File "/app/anaconda3/lib/python3.11/site-packages/xoscar/api.py", line 384, in on_receive
return await super().on_receive(message) # type: ignore
^^^^^^^^^^^^^^^^^
File "xoscar/core.pyx", line 558, in on_receive
raise ex
File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
async with self._lock:
^^^^^^^^^^^^^^^^^
File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
with debug_async_timeout('actor_lock_timeout',
^^^^^^^^^^^^^^^^^
File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
result = await result
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/utils.py", line 45, in wrapped
ret = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 79, in wrapped_func
ret = await fn(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 418, in rerank
return await self._call_wrapper(
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 103, in _async_wrapper
return await fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 333, in _call_wrapper
ret = await asyncio.to_thread(fn, *args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/asyncio/threads.py", line 25, in to_thread
return await loop.run_in_executor(None, func_call)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^
File "/app/anaconda3/lib/python3.11/site-packages/xinference/model/rerank/core.py", line 168, in rerank
docs = [
File "/app/anaconda3/lib/python3.11/site-packages/xinference/model/rerank/core.py", line 171, in
This error comes from FlagReranker: its method signature declares a return type of List[float], but it actually returns a plain float: https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/flag_reranker.py#L194
Recent xinference versions have partially switched back to sentence-transformers for this, so you can try upgrading.
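The return-type mismatch described above can be illustrated with a small sketch. `normalize_scores` is a hypothetical helper, not part of xinference or FlagEmbedding; it shows why a bare float coming back where a List[float] is expected triggers "'float' object is not subscriptable", and one defensive way to guard against it:

```python
def normalize_scores(scores):
    """Wrap a bare float in a list so callers can always index.

    FlagReranker.compute_score is annotated as returning List[float],
    but for a single (query, document) pair it returns a plain float.
    Downstream code that does scores[i] on that float raises
    TypeError: 'float' object is not subscriptable.
    """
    if isinstance(scores, float):
        return [scores]
    return scores

# Single-pair result: a bare float, now safely indexable after wrapping.
single = normalize_scores(0.87)
# Multi-pair result: already a list, passed through unchanged.
multiple = normalize_scores([0.12, 0.87, 0.45])
```

A check like this (or pinning the library to a version with a consistent return type) avoids the crash without changing the multi-document path.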
No problem under 0.10.3.
@yaoyasong @codingl2k1 My apologies, but how was this fixed? I just ran into this with version 0.12.
I see that at https://github.com/xorbitsai/inference/blob/main/xinference/model/rerank/core.py#L211 you convert the arg to an int. There must be a reason for that, and the same conversion is probably needed when using arg as an array index.
So the line below might have to be
relevance_score=float(similarity_scores[int(arg)]),
(and also in the else block)
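The indexing pattern being discussed can be sketched as follows. This is a hypothetical reconstruction of the sort-then-lookup step, not the actual core.py code; it shows why the index used for the lookup must be an integer, and why `int(arg)` is a harmless guard when the ordering values might arrive as floats:

```python
import numpy as np

# Hypothetical rerank scores for three documents.
similarity_scores = np.array([0.12, 0.87, 0.45])

# np.argsort yields integer indices, sorted ascending; reverse for
# highest-score-first. If the ordering values instead came from a
# source that produces floats, indexing with them would raise
# IndexError/TypeError, so int(arg) makes the lookup robust.
order = np.argsort(similarity_scores)[::-1]
ranked = [float(similarity_scores[int(arg)]) for arg in order]
```

Casting with `int(arg)` is a no-op when `arg` is already an integer index, which is likely why it is safe to apply uniformly.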
@aresnow1 Do you remember why you convert the arg to an int? https://github.com/xorbitsai/inference/blob/main/xinference/model/rerank/core.py#L211
Which model are you using? Could you provide the traceback?
@codingl2k1 The bge gemma v2 one, but I do not always run into the issue.
@aresnow1 @codingl2k1 Is it ok/safe to cast the arg to an int?
It is a different cause; I opened https://github.com/xorbitsai/inference/issues/1775