llm-inference
API server blocked while one request is in process
More testing is needed for this issue.
Refer: https://github.com/ray-project/ray/issues/20169. It looks like the API server needs to be started with multiple replicas.
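If the server runs on Ray Serve, one way to get multiple replicas is through a Serve config file. Below is a minimal sketch, assuming the API server is wrapped as a Serve deployment named `ApiServer` and importable as `app:deployment` (both names are hypothetical placeholders, not from this repo):

```yaml
# Hypothetical Serve config sketch: raise num_replicas so one
# long-running request does not block the whole API server.
applications:
  - name: llm_app
    import_path: app:deployment   # placeholder module:attribute path
    deployments:
      - name: ApiServer           # placeholder deployment name
        num_replicas: 2           # >1 replica allows concurrent requests
```

With more than one replica, a long-running request ties up only a single replica while the others keep serving incoming traffic.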