
Fix #320

metacryptom opened this pull request • 4 comments

The client-disconnect check is never executed, causing the scheduler to get stuck

metacryptom avatar Jun 30 '23 17:06 metacryptom

```python
try:
    async for request_output in results_generator:
        if await request.is_disconnected():
            ...
```

The `await request.is_disconnected()` check is never executed if an error happens (for example, the input length exceeds the maximum), so the request never exits, which causes the scheduler to get stuck.
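To make the failure concrete, here is a self-contained sketch of the pattern being described; `results_generator`, `is_disconnected`, and `handler` are illustrative stand-ins, not vLLM's actual code:

```python
import asyncio
from typing import AsyncIterator

async def results_generator() -> AsyncIterator[str]:
    # Stand-in for the engine's output stream: it fails before yielding
    # anything, e.g. because the prompt exceeds the model's max length.
    raise ValueError("prompt is longer than the maximum model length")
    yield  # unreachable, but makes this an async generator

async def is_disconnected() -> bool:
    # Stand-in for FastAPI's request.is_disconnected().
    return True

async def handler() -> None:
    try:
        async for _ in results_generator():
            # The only disconnect check sits inside the loop body, so it
            # never runs when the generator raises before its first yield.
            if await is_disconnected():
                print("aborting request")
                return
    except ValueError as exc:
        # Without an explicit abort here, the engine keeps the request
        # queued and the scheduler stays stuck.
        print(f"generator failed before the disconnect check ran: {exc}")

asyncio.run(handler())
```

Because the generator raises before its first yield, the loop body, and the disconnect check inside it, never runs.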

metacryptom avatar Jun 30 '23 17:06 metacryptom

See issue #320.

metacryptom avatar Jun 30 '23 17:06 metacryptom

@metacryptom Thanks for your contribution! Please correct me if I'm wrong. If this is merely fixing the `ValueError` that happens when the input is too long, #273 already fixes it. If this is fixing the case where a client disconnects during streaming, I believe FastAPI automatically shuts down the coroutine and calls the following background task to abort the request:

https://github.com/vllm-project/vllm/blob/598dc4b79a0078fe14e0c134d0b53b4842a8b227/vllm/entrypoints/api_server.py#L50-L52
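For reference, a minimal sketch of the FastAPI background-task pattern the permalink points to; the route and `abort_request` are illustrative stand-ins, not the exact vLLM code:

```python
from fastapi import BackgroundTasks, FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.post("/generate")
async def generate() -> StreamingResponse:
    request_id = "example-request"  # hypothetical id, for illustration

    async def abort_request() -> None:
        # Stand-in for engine.abort(request_id) in vLLM's api_server.py.
        print(f"aborting {request_id}")

    async def stream_results():
        for i in range(3):
            yield f"token {i}\n"

    background_tasks = BackgroundTasks()
    # Starlette runs this task once the streaming response finishes,
    # including when the client disconnects mid-stream and the response
    # coroutine is cancelled; this is the behavior relied on above.
    background_tasks.add_task(abort_request)
    return StreamingResponse(stream_results(), background=background_tasks)
```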

If both of these cases are already handled effectively, as I understand, is this PR still necessary?

zhuohan123 avatar Jun 30 '23 23:06 zhuohan123

It's not just the case where the input is too long. When a request can't be executed and added to the swap queue, newly arriving requests can't be executed either. I think the main problem is that in non-streaming mode (the code you pasted above only works in streaming mode), the disconnect check can't run until the generator yields some output; this can cause issues beyond the long-input case. A better approach is to check for disconnection inside the generator loop, as sketched below.
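A self-contained sketch of that approach, assuming a polling interval; `consume_with_disconnect_check`, `is_disconnected`, and `abort` are hypothetical names, not vLLM's API:

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable

async def consume_with_disconnect_check(
    results_generator: AsyncIterator[Any],
    is_disconnected: Callable[[], Awaitable[bool]],
    abort: Callable[[], Awaitable[None]],
    poll_interval: float = 1.0,
) -> Any:
    # Drain the generator while polling for client disconnects, so the
    # check runs even before the engine yields its first output.
    iterator = results_generator.__aiter__()
    final_output = None
    while True:
        next_item = asyncio.ensure_future(iterator.__anext__())
        while not next_item.done():
            if await is_disconnected():
                next_item.cancel()
                await abort()  # free the request in the engine
                return None
            # Wait briefly for the next output, then re-check.
            await asyncio.wait({next_item}, timeout=poll_interval)
        try:
            final_output = next_item.result()
        except StopAsyncIteration:
            return final_output
```

Any error the generator raises (such as the long-input `ValueError`) propagates out of `next_item.result()`, so the caller can still abort the request in that path as well.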

metacryptom avatar Jul 01 '23 08:07 metacryptom

Yes, I found that decoding some special tokens can also cause the engine to get stuck; all other requests are then blocked.

Ted8000 avatar Jul 14 '23 09:07 Ted8000

Closing this PR since we recently did a major refactor of `async_llm_engine.py`, and this issue should have been fixed. Feel free to open a new PR if this issue persists.

zhuohan123 avatar Sep 12 '23 22:09 zhuohan123