
API server blocks while one request is in progress

Open: HaiHui886 opened this issue 1 year ago • 1 comment

This issue needs more testing.

HaiHui886, May 09 '24 01:05

Reference: https://github.com/ray-project/ray/issues/20169. It looks like the API server needs to be started with multiple replicas.

HaiHui886, May 10 '24 03:05
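
To illustrate why a single replica can block and why more replicas may help, here is a minimal standard-library sketch. It is not Ray Serve code and not the llm-inference implementation: each thread in the pool stands in for one hypothetical replica, and `time.sleep` stands in for a blocking inference call. The names `peak_concurrency` and `handle_request` are made up for this example. In Ray Serve the equivalent setting would presumably be the `num_replicas` option of a deployment, but whether llm-inference exposes it has not been checked here.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(num_replicas: int, num_requests: int = 4,
                     work_seconds: float = 0.1) -> int:
    """Return the highest number of requests that were in flight at once."""
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def handle_request(_: int) -> None:
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(work_seconds)  # stand-in for a blocking inference call
        with lock:
            in_flight -= 1

    # Each worker thread plays the role of one hypothetical replica.
    with ThreadPoolExecutor(max_workers=num_replicas) as pool:
        list(pool.map(handle_request, range(num_requests)))
    return peak


if __name__ == "__main__":
    for replicas in (1, 2):
        print(f"replicas={replicas} peak in-flight requests={peak_concurrency(replicas)}")
```

With one simulated replica, every request waits for the previous one to finish, which matches the blocking described in this issue. With two, requests overlap. Whether the real server behaves the same way still needs to be confirmed by testing.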