inference BUG: bge rerank model deployed with xinference frequently returns service errors

Describe the bug

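I deployed the bge-reranker-v2-m3 model with xinference 0.10.2.post1, but the following error occurs frequently, and the service often stops responding and only recovers after a restart: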

xinference.api.restful_api 22480 ERROR [address=0.0.0.0:44871, pid=22579] 'float' object is not subscriptable
Traceback (most recent call last):
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/api/restful_api.py", line 1067, in rerank
    scores = await model.rerank(
    ^^^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  ...
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 79, in wrapped_func
    ret = await fn(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 418, in rerank
    return await self._call_wrapper(
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 103, in _async_wrapper
    return await fn(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 333, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/model/rerank/core.py", line 168, in rerank
    docs = [
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/model/rerank/core.py", line 171, in
    relevance_score=float(similarity_scores[arg]),
    ^^^^^^^^^^^^^^^^^

Apr 30 '24 05:04 yaoyasong

Please post the full log, and also try 0.10.3 to see whether the problem still occurs.

Apr 30 '24 07:04 qinxuye

xinference.api.restful_api 22480 ERROR [address=0.0.0.0:44871, pid=22579] 'float' object is not subscriptable
Traceback (most recent call last):
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/api/restful_api.py", line 1067, in rerank
    scores = await model.rerank(
    ^^^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 659, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
    return await coro
  File "/app/anaconda3/lib/python3.11/site-packages/xoscar/api.py", line 384, in on_receive
    return await super().on_receive(message) # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 558, in on_receive
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/utils.py", line 45, in wrapped
    ret = await func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 79, in wrapped_func
    ret = await fn(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 418, in rerank
    return await self._call_wrapper(
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 103, in _async_wrapper
    return await fn(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/core/model.py", line 333, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/model/rerank/core.py", line 168, in rerank
    docs = [
  File "/app/anaconda3/lib/python3.11/site-packages/xinference/model/rerank/core.py", line 171, in
    relevance_score=float(similarity_scores[arg]),
    ^^^^^^^^^^^^^^^^^

Apr 30 '24 08:04 yaoyasong

This error is a FlagReranker problem: its method signature says it returns a List[float], but it actually returns a bare float: https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/flag_reranker.py#L194

Newer versions of xinference have partially switched back to sentence transformers for this, so you can try upgrading.
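
To illustrate the mismatch described above, here is a minimal sketch of a defensive wrapper. It is not the actual xinference or FlagEmbedding code: `fake_compute_score` is a hypothetical stand-in that assumes the bare float shows up when only one pair is scored, and `as_score_list` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple, Union


def fake_compute_score(pairs: Sequence[Tuple[str, str]]) -> Union[float, List[float]]:
    """Hypothetical stand-in for a reranker scoring call.

    Assumption: like the behaviour reported in this thread, it returns a
    bare float (not a one-element list) when given a single pair.
    """
    scores = [0.5 for _ in pairs]
    return scores[0] if len(scores) == 1 else scores


def as_score_list(scores: Union[float, Sequence[float]]) -> List[float]:
    """Normalize a scoring result so callers can always index into it."""
    if isinstance(scores, (int, float)):
        return [float(scores)]
    return [float(s) for s in scores]


raw = fake_compute_score([("query", "document")])
# Indexing `raw` directly would raise: 'float' object is not subscriptable
normalized = as_score_list(raw)
first_score = normalized[0]
```

Wrapping the result at the call site keeps the indexing code unchanged, whichever shape the scoring call returns.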

Apr 30 '24 19:04 codingl2k1

No problem with version 0.10.3.

May 07 '24 04:05 yaoyasong

@yaoyasong @codingl2k1 my apologies, but how was this fixed? i just ran into this with version 0.12

i see that on https://github.com/xorbitsai/inference/blob/main/xinference/model/rerank/core.py#L211 you convert the arg to an int. there must be a reason for that, and the same conversion is probably needed to use arg as an array index

so line below might have to be

relevance_score=float(similarity_scores[int(arg)]),

(and also in the else block)
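
As a quick check of the proposal above, the following sketch (plain Python, not the xinference source; `top_score` is a hypothetical helper that mirrors the proposed line) suggests that casting the index alone would not change the reported error, because "'float' object is not subscriptable" is raised by the object being indexed rather than by the index.

```python
def top_score(similarity_scores, arg):
    # Mirrors the proposed line, with the int() cast applied to the index.
    return float(similarity_scores[int(arg)])


# Case 1: the scores are a list, so indexing works.
scores_list = [0.2, 0.9, 0.4]
# Indices sorted by descending score (a stand-in for an argsort step).
order = sorted(range(len(scores_list)), key=scores_list.__getitem__, reverse=True)
best = top_score(scores_list, order[0])

# Case 2: the scores are a bare float, so the cast on the index does not help.
try:
    top_score(0.9, 0)
    error = None
except TypeError as exc:
    error = str(exc)
```

In the second case the TypeError is raised even though the index is a plain int, which points at the shape of the scores rather than the index type.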

Jun 11 '24 14:06 stdweird

@aresnow1 Do you remember why you convert the arg to int? https://github.com/xorbitsai/inference/blob/main/xinference/model/rerank/core.py#L211

Jun 11 '24 20:06 codingl2k1

Which model are you using? Could you provide the traceback?

Jun 11 '24 20:06 codingl2k1

@codingl2k1 the bge gemma v2 one. but i do not always run into the issue

Jun 12 '24 06:06 stdweird

@aresnow1 @codingl2k1 is it ok/safe to cast the arg to an int?

Jul 03 '24 13:07 stdweird

it is a different cause, i opened https://github.com/xorbitsai/inference/issues/1775

Jul 03 '24 15:07 stdweird
