
Please take this issue seriously! Real-world use will certainly involve concurrent multi-user Q&A, so please add this feature!!!

Open cristianohello opened this issue 1 year ago • 1 comment

cristianohello avatar Jun 02 '23 09:06 cristianohello

That depends on how many GPUs you have.

jby20180901 avatar Jun 06 '23 09:06 jby20180901

+1

Helenailse1 avatar Jun 28 '23 06:06 Helenailse1

That depends on how many GPUs you have.

So no matter how much VRAM a single GPU has, can it only serve one model call at a time? Is there no way to run model inference in parallel with multiple processes?

tophgg avatar Jun 29 '23 03:06 tophgg

I have 4 GPUs and still can't get parallelism; requests are processed serially.

Gy1900 avatar Jul 03 '23 01:07 Gy1900

FastChat already supports concurrent model serving, and our framework is built on top of FastChat, so concurrent model calls are supported. Please try the latest code.

zRzRzRzRzRzRzR avatar Sep 28 '23 06:09 zRzRzRzRzRzRzR

I'm now running into the same problem. After multi-user load testing, I found the bottleneck is at the application layer. Setup: 20 virtual users, GPU: A100 80G, CPU: 64 cores / 160G RAM.

Random knowledge-base questions sent to the FastAPI endpoint http://127.0.0.1:7861/chat/knowledge_base_chat

20 virtual users

  scenarios: (100.00%) 1 scenario, 20 max VUs, 2m30s max duration (incl. graceful stop):
           * breaking: 20 looping VUs for 2m0s (gracefulStop: 30s)

WARN[0060] Request Failed                                error="request timeout"  (repeated 9x)
WARN[0073] Request Failed                                error="request timeout"
WARN[0075] Request Failed                                error="request timeout"
WARN[0075] Request Failed                                error="request timeout"
WARN[0084] Request Failed                                error="request timeout"
WARN[0086] Request Failed                                error="request timeout"
WARN[0092] Request Failed                                error="request timeout"
WARN[0093] Request Failed                                error="request timeout"
WARN[0113] Request Failed                                error="request timeout"
WARN[0114] Request Failed                                error="request timeout"
WARN[0115] Request Failed                                error="request timeout"
WARN[0120] Request Failed                                error="request timeout"  (repeated 9x)
WARN[0133] Request Failed                                error="request timeout"
WARN[0135] Request Failed                                error="request timeout"
WARN[0135] Request Failed                                error="request timeout"
WARN[0144] Request Failed                                error="request timeout"
WARN[0146] Request Failed                                error="request timeout"

     ✗ response code was 200
      ↳  26% — ✓ 12 / ✗ 33

     checks.........................: 26.66% ✓ 12       ✗ 33
     data_received..................: 65 kB  434 B/s
     data_sent......................: 30 kB  200 B/s
     http_req_blocked...............: avg=572.98µs min=2.14µs  med=631.43µs max=1.13ms   p(90)=1.01ms   p(95)=1.06ms
     http_req_connecting............: avg=519.36µs min=0s      med=558.45µs max=1.07ms   p(90)=952.76µs p(95)=986.85µs
   ✗ http_req_duration..............: avg=53.5s    min=13.24s  med=59.99s   max=1m0s     p(90)=1m0s     p(95)=1m0s
       { expected_response:true }...: avg=35.65s   min=13.24s  med=32.92s   max=54.81s   p(90)=54.63s   p(95)=54.74s
   ✗ http_req_failed................: 73.33% ✓ 33       ✗ 12
     http_req_receiving.............: avg=53s      min=12.91s  med=59.41s   max=59.98s   p(90)=59.98s   p(95)=59.98s
     http_req_sending...............: avg=62.64µs  min=19.34µs med=44.73µs  max=196.91µs p(90)=130.25µs p(95)=162.7µs
     http_req_tls_handshaking.......: avg=0s       min=0s      med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...............: avg=500.18ms min=16.52ms med=327.3ms  max=1.66s    p(90)=1.34s    p(95)=1.51s
     http_reqs......................: 45     0.299999/s
     iteration_duration.............: avg=53.5s    min=13.24s  med=1m0s     max=1m0s     p(90)=1m0s     p(95)=1m0s
     iterations.....................: 45     0.299999/s
     vus............................: 6      min=6      max=20
     vus_max........................: 20     min=20     max=20


running (2m30.0s), 00/20 VUs, 45 complete and 6 interrupted iterations
breaking ✓ [======================================] 20 VUs  2m0s
ERRO[0151] thresholds on metrics 'http_req_duration, http_req_failed' have been crossed

Server-side errors with 20 users:

2023-11-09 19:40:44 | ERROR | asyncio | Task was destroyed but it is pending!
task: <Task pending name='Task-55435' coro=<Queue.get() running at /home/cloud/miniconda3/envs/llm-poc/lib/python3.10/asyncio/queues.py:159> wait_for=<Future pending cb=[Task.task_wakeup()]>>

2023-11-09 19:42:30 | ERROR | stderr | ERROR:    Exception in ASGI application
2023-11-09 19:42:30 | ERROR | stderr | Traceback (most recent call last):
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 34, in read
2023-11-09 19:42:30 | ERROR | stderr |     return await self._stream.receive(max_bytes=max_bytes)
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 1203, in receive
2023-11-09 19:42:30 | ERROR | stderr |     await self._protocol.read_event.wait()
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/asyncio/locks.py", line 213, in wait
2023-11-09 19:42:30 | ERROR | stderr |     await fut
2023-11-09 19:42:30 | ERROR | stderr | asyncio.exceptions.CancelledError
2023-11-09 19:42:30 | ERROR | stderr |
2023-11-09 19:42:30 | ERROR | stderr | During handling of the above exception, another exception occurred:
2023-11-09 19:42:30 | ERROR | stderr |
2023-11-09 19:42:30 | ERROR | stderr | Traceback (most recent call last):
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
2023-11-09 19:42:30 | ERROR | stderr |     yield
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 32, in read
2023-11-09 19:42:30 | ERROR | stderr |     with anyio.fail_after(timeout):
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/anyio/_core/_tasks.py", line 119, in __exit__
2023-11-09 19:42:30 | ERROR | stderr |     raise TimeoutError
2023-11-09 19:42:30 | ERROR | stderr | TimeoutError
2023-11-09 19:42:30 | ERROR | stderr |
2023-11-09 19:42:30 | ERROR | stderr | The above exception was the direct cause of the following exception:
2023-11-09 19:42:30 | ERROR | stderr |
2023-11-09 19:42:30 | ERROR | stderr | Traceback (most recent call last):
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
2023-11-09 19:42:30 | ERROR | stderr |     yield
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
2023-11-09 19:42:30 | ERROR | stderr |     resp = await self._pool.handle_async_request(req)
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 262, in handle_async_request
2023-11-09 19:42:30 | ERROR | stderr |     raise exc
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 245, in handle_async_request
2023-11-09 19:42:30 | ERROR | stderr |     response = await connection.handle_async_request(request)
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_async/connection.py", line 96, in handle_async_request
2023-11-09 19:42:30 | ERROR | stderr |     return await self._connection.handle_async_request(request)
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_async/http11.py", line 121, in handle_async_request
2023-11-09 19:42:30 | ERROR | stderr |     raise exc
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_async/http11.py", line 99, in handle_async_request
2023-11-09 19:42:30 | ERROR | stderr |     ) = await self._receive_response_headers(**kwargs)
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_async/http11.py", line 164, in _receive_response_headers
2023-11-09 19:42:30 | ERROR | stderr |     event = await self._receive_event(timeout=timeout)
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_async/http11.py", line 200, in _receive_event
2023-11-09 19:42:30 | ERROR | stderr |     data = await self._network_stream.read(
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 31, in read
2023-11-09 19:42:30 | ERROR | stderr |     with map_exceptions(exc_map):
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/contextlib.py", line 153, in __exit__
2023-11-09 19:42:30 | ERROR | stderr |     self.gen.throw(typ, value, traceback)
2023-11-09 19:42:30 | ERROR | stderr |   File "/home/cloud/miniconda3/envs/llm-poc/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
2023-11-09 19:42:30 | ERROR | stderr |     raise to_exc(exc) from exc
2023-11-09 19:42:30 | ERROR | stderr | httpcore.ReadTimeout


lemonit-eric-mao avatar Nov 10 '23 02:11 lemonit-eric-mao

File: ~\miniconda3\Lib\site-packages\langchain\callbacks\streaming_aiter.py

import asyncio
from typing import Any, AsyncIterator, Dict, List, Literal, Union, cast

from langchain.callbacks.base import AsyncCallbackHandler
from langchain.schema.output import LLMResult

# TODO If used by two LLM runs in parallel this won't work as expected


class AsyncIteratorCallbackHandler(AsyncCallbackHandler):
    """Callback handler that returns an async iterator."""

    queue: asyncio.Queue[str]

    done: asyncio.Event

    @property
    def always_verbose(self) -> bool:
        return True

    def __init__(self) -> None:
        self.queue = asyncio.Queue()
        self.done = asyncio.Event()

So far I've only traced the code down to this layer. I'm not sure whether the problem lies in Langchain or in FastChat, or how it should be fixed; I have no ideas for a fix at the moment, so any pointers from the experts here would be much appreciated.
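The TODO in the snippet above is the key hint: the handler keeps a single queue and a single done event as instance state, so if one handler is shared across concurrent LLM runs, their tokens interleave into the same queue and one run's "done" ends the other's stream. Below is a minimal sketch in plain asyncio (no langchain; all class and function names are illustrative, not project code) of the safe pattern, one handler instance per request:

```python
import asyncio


class AsyncIteratorHandler:
    """Simplified stand-in for AsyncIteratorCallbackHandler:
    one token queue and one done-event per instance."""

    def __init__(self) -> None:
        self.queue = asyncio.Queue()
        self.done = asyncio.Event()

    async def on_new_token(self, token: str) -> None:
        await self.queue.put(token)

    async def on_end(self) -> None:
        self.done.set()

    async def aiter(self):
        # Drain tokens until the producer signals done AND the queue is empty.
        while not (self.done.is_set() and self.queue.empty()):
            try:
                yield await asyncio.wait_for(self.queue.get(), timeout=0.1)
            except asyncio.TimeoutError:
                continue


async def fake_llm_run(handler: AsyncIteratorHandler, tokens: list) -> None:
    # Stand-in for a streaming model call feeding its callback handler.
    for t in tokens:
        await handler.on_new_token(t)
    await handler.on_end()


async def serve_request(tokens: list) -> list:
    handler = AsyncIteratorHandler()  # fresh handler per request, never shared
    producer = asyncio.create_task(fake_llm_run(handler, tokens))
    out = [tok async for tok in handler.aiter()]
    await producer
    return out


async def main() -> list:
    # Two concurrent "requests": with per-request handlers their token
    # streams cannot leak into each other's queues.
    return await asyncio.gather(
        serve_request(["a1", "a2"]),
        serve_request(["b1", "b2"]),
    )


results = asyncio.run(main())
print(results)
```

If the service instead reuses one handler object across requests, the failure mode matches this thread: streams stall or end early under concurrency, and pending `Queue.get()` tasks get destroyed when a request is cancelled.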

lemonit-eric-mao avatar Nov 10 '23 02:11 lemonit-eric-mao

Seeing the same problem: first, it shows up when many users are online; second, the error also appears suddenly after the model has been running for a while.

tommy3266 avatar Dec 01 '23 09:12 tommy3266

(quoting lemonit-eric-mao's streaming_aiter.py comment above)

Another issue, https://github.com/chatchat-space/Langchain-Chatchat/issues/2684, says that around 10 concurrent users is the limit. Did you ever find a solution? If my concurrency requirements are low, could I just delay the API requests and execute them sequentially?
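For the low-concurrency fallback suggested here, the usual asyncio tool is a semaphore rather than literally delaying requests: gate the inference call so at most N run at once and the rest wait in line instead of timing out. A hypothetical sketch (names are illustrative, not project code; `asyncio.sleep` stands in for the real model call):

```python
import asyncio


async def call_model(gate: asyncio.Semaphore, prompt: str, state: dict) -> str:
    # All requests funnel through the gate; excess ones wait here.
    async with gate:
        state["in_flight"] += 1
        state["peak"] = max(state["peak"], state["in_flight"])
        await asyncio.sleep(0.01)  # stand-in for the real inference call
        state["in_flight"] -= 1
        return "answer:" + prompt


async def main(max_concurrent: int = 2, n_users: int = 10):
    gate = asyncio.Semaphore(max_concurrent)
    state = {"in_flight": 0, "peak": 0}
    # 10 "users" fire simultaneously, but at most 2 calls run at a time.
    answers = await asyncio.gather(
        *(call_model(gate, "q%d" % i, state) for i in range(n_users))
    )
    return answers, state["peak"]


answers, peak = asyncio.run(main())
print(peak)  # never exceeds max_concurrent
```

Setting `max_concurrent = 1` gives the strictly sequential behavior asked about, at the cost of latency for waiting users; server-side timeouts then need to be raised accordingly.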

Chrosea avatar Mar 04 '24 02:03 Chrosea

Seeing the same problem: first, it shows up when many users are online; second, the error also appears suddenly after the model has been running for a while.

How did you all handle this problem? Is it solved now?

jiusi9 avatar Jun 05 '24 01:06 jiusi9

(quoting lemonit-eric-mao's streaming_aiter.py comment above)

How was this handled in the end?

jiusi9 avatar Jun 05 '24 02:06 jiusi9