Loading multiple bge-reranker-v2-m3 instances makes reranking slower

Open kksasa opened this issue 1 year ago • 3 comments

To support multiple users, I loaded several rerankers, but reranking got slower. Is this normal?

  1. One reranker only. 300 texts, 3 requests handled one after another, about 1 s each.

     uvicorn api_rerank:app --host 0.0.0.0 --workers 1 --port 8004
     INFO: Started server process [2821886]
     INFO: Waiting for application startup.
     INFO: Application startup complete.
     INFO: Uvicorn running on http://0.0.0.0:8004 (Press CTRL+C to quit)
     total rank time = 1.2703969478607178
     INFO: 127.0.0.1:46458 - "POST /rerank HTTP/1.1" 200 OK
     total rank time = 0.9725611209869385
     INFO: 127.0.0.1:46470 - "POST /rerank HTTP/1.1" 200 OK
     total rank time = 0.9950506687164307
     INFO: 127.0.0.1:46484 - "POST /rerank HTTP/1.1" 200 OK

  2. Three rerankers, the same 300 texts, 3 requests sent concurrently. Each request takes roughly 2.5 s or more (2.7 to 2.8 s in the log below).

     uvicorn api_rerank:app --host 0.0.0.0 --workers 3 --port 8004
     INFO: Uvicorn running on http://0.0.0.0:8004 (Press CTRL+C to quit)
     INFO: Started parent process [2820187]
     INFO: Started server process [2820191]
     INFO: Waiting for application startup.
     INFO: Application startup complete.
     INFO: Started server process [2820189]
     INFO: Waiting for application startup.
     INFO: Application startup complete.
     INFO: Started server process [2820190]
     INFO: Waiting for application startup.
     INFO: Application startup complete.
     total rank time = 2.6671855449676514
     INFO: 127.0.0.1:52324 - "POST /rerank HTTP/1.1" 200 OK
     total rank time = 2.824272394180298
     INFO: 127.0.0.1:52334 - "POST /rerank HTTP/1.1" 200 OK
     total rank time = 2.785203456878662
     INFO: 127.0.0.1:52336 - "POST /rerank HTTP/1.1" 200 OK

  3. Three rerankers, the same 300 texts, but only 1 request sent. The time is about the same as with a single reranker.

     uvicorn api_rerank:app --host 0.0.0.0 --workers 3 --port 8004
     INFO: Uvicorn running on http://0.0.0.0:8004 (Press CTRL+C to quit)
     INFO: Started parent process [2822382]
     INFO: Started server process [2822385]
     INFO: Waiting for application startup.
     INFO: Application startup complete.
     INFO: Started server process [2822384]
     INFO: Waiting for application startup.
     INFO: Application startup complete.
     INFO: Started server process [2822386]
     INFO: Waiting for application startup.
     INFO: Application startup complete.
     total rank time = 1.1723065376281738
     INFO: 127.0.0.1:33876 - "POST /rerank HTTP/1.1" 200 OK
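
The per-request times in cases 1 and 2 can be compared as wall-clock totals. The sketch below only does arithmetic on the logged numbers. It assumes the three concurrent requests in case 2 started at the same moment, which the logs do not state.

```python
# Per-request "total rank time" values copied from the logs (seconds).
sequential = [1.2703969478607178, 0.9725611209869385, 0.9950506687164307]
concurrent = [2.6671855449676514, 2.824272394180298, 2.785203456878662]

# Requests handled one after another: wall-clock time is the sum.
sequential_wall = sum(sequential)

# Assumption: the concurrent requests started together,
# so wall-clock time is that of the slowest request.
concurrent_wall = max(concurrent)

# Throughput in requests per second for each setup.
sequential_rps = len(sequential) / sequential_wall
concurrent_rps = len(concurrent) / concurrent_wall

print(f"sequential: {sequential_wall:.2f} s total, {sequential_rps:.2f} req/s")
print(f"concurrent: {concurrent_wall:.2f} s total, {concurrent_rps:.2f} req/s")
```

Under that assumption the totals are about 3.24 s versus 2.82 s, so three workers add very little throughput even though each request takes almost three times as long. That pattern would be consistent with the workers competing for one shared compute device, though the hardware setup is not given here.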

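To rule out FastAPI as the cause, I put all the code into a single script and saw the same behavior: loading multiple rerankers in threads makes everything slower overall.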

kksasa avatar Aug 26 '24 02:08 kksasa
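
A standalone test like the one described above (no FastAPI, several rerankers driven from threads) could be structured as follows. This is a minimal sketch, not the poster's script: `timed_call`, `run_sequential`, `run_concurrent`, and `fake_score` are hypothetical names, and `fake_score` only sleeps so that the harness runs without a model.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(score_fn, pairs):
    """Run score_fn(pairs) once and return the elapsed seconds."""
    start = time.perf_counter()
    score_fn(pairs)
    return time.perf_counter() - start


def run_sequential(score_fns, pairs):
    """Call each scorer one after another; return (per-call times, wall time)."""
    start = time.perf_counter()
    times = [timed_call(fn, pairs) for fn in score_fns]
    return times, time.perf_counter() - start


def run_concurrent(score_fns, pairs):
    """Call all scorers at once, one thread each; return (per-call times, wall time)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(score_fns)) as pool:
        times = list(pool.map(lambda fn: timed_call(fn, pairs), score_fns))
    return times, time.perf_counter() - start


def fake_score(pairs):
    """Hypothetical stand-in for a reranker: sleeps instead of running a model."""
    time.sleep(0.001 * len(pairs))
    return [0.0] * len(pairs)


# 300 query/passage pairs and three scorers, mirroring the numbers in the issue.
pairs = [("query", f"passage {i}") for i in range(300)]
scorers = [fake_score] * 3

seq_times, seq_wall = run_sequential(scorers, pairs)
con_times, con_wall = run_concurrent(scorers, pairs)
print("sequential per-call:", [round(t, 3) for t in seq_times], "wall:", round(seq_wall, 3))
print("concurrent per-call:", [round(t, 3) for t in con_times], "wall:", round(con_wall, 3))
```

To test real models, each entry in `scorers` would be replaced by a scoring callable of a separately loaded reranker, for example the `compute_score` method of a `FlagReranker` instance. How `api_rerank.py` actually calls the model is not shown in this thread. Comparing per-call times with wall time separates "each call got slower" from "total throughput got worse".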

  1. INFO: Started server process [2821886]
     INFO: Waiting for application startup.
     INFO: Application startup complete.
     INFO: Uvicorn running on http://0.0.0.0:8004 (Press CTRL+C to quit)
     total rank time = 1.2703969478607178
     INFO: 127.0.0.1:46458 - "POST /rerank HTTP/1.1" 200 OK
     total rank time = 0.9725611209869385
     INFO: 127.0.0.1:46470 - "POST /rerank HTTP/1.1" 200 OK
     total rank time = 0.9950506687164307

Did you find out what the cause was?

EvanSong77 avatar Sep 24 '24 01:09 EvanSong77

No.

kksasa avatar Sep 25 '24 02:09 kksasa

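Does anyone know the cause? I found that my serial and parallel runs take the same total time. In the parallel run, if I time only the inference call, it is indeed very short, but there is a long stall between two consecutive calls, so the total time ends up the same.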

xietian0 avatar May 26 '25 03:05 xietian0
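
One possible explanation for a short measured inference time followed by a long stall, not confirmed anywhere in this thread, is that the timed call only queues work on a device and returns before the work finishes. The sketch below illustrates that effect with the standard library only. `FakeAsyncDevice` and `time_launch` are hypothetical stand-ins, not part of FlagEmbedding or PyTorch.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


class FakeAsyncDevice:
    """Hypothetical device that queues work and returns to the caller immediately."""

    def __init__(self):
        # A single worker thread models one device executing queued work in order.
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = []

    def launch(self, seconds):
        """Queue `seconds` of work without waiting for it to finish."""
        self._pending.append(self._pool.submit(time.sleep, seconds))

    def synchronize(self):
        """Block until all queued work is done."""
        wait(self._pending)
        self._pending.clear()


def time_launch(device, seconds, synchronize):
    """Time one launch, optionally waiting for the queued work to complete."""
    start = time.perf_counter()
    device.launch(seconds)
    if synchronize:
        device.synchronize()
    return time.perf_counter() - start


device = FakeAsyncDevice()

# Without synchronization the timer stops as soon as the work is queued.
unsynced = time_launch(device, 0.2, synchronize=False)

# The real cost is paid here, outside the timed region: the "stall between calls".
device.synchronize()

# With synchronization the timer covers the full duration of the work.
synced = time_launch(device, 0.2, synchronize=True)

print(f"unsynchronized: {unsynced:.4f} s, synchronized: {synced:.4f} s")
```

With PyTorch on CUDA, kernel launches are asynchronous, and calling `torch.cuda.synchronize()` before reading the timer plays the role of `synchronize()` here. Whether this applies to the setup described in the comment above is an assumption.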