metacryptom

Results 6 comments of metacryptom

![image](https://github.com/vllm-project/vllm/assets/98044045/8e4dab93-e517-4294-abcd-31445be8aab7)

Token indices sequence length is longer than the specified maximum sequence length for this model (2620 > 2048). Running this sequence through the model will result in indexing errors INFO...

This also causes a resource leak on the server.

`try: async for request_output in results_generator: if await request.is_disconnected():` The `await request.is_disconnected()` check is never executed if an error occurs (perhaps the input length exceeds the maximum), so the request never exits, which causes the...
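One way to picture the leak described in this comment, and a possible guard against it, is the minimal sketch below. It is not vLLM's code: `FakeEngine`, its `generate` and `abort` methods, the `fail_after` parameter, and the `stream` helper are all hypothetical stand-ins. The only idea it illustrates is that cleanup placed in a `finally` block still runs when the generator raises before the disconnect check is reached.

```python
import asyncio


class FakeEngine:
    """Hypothetical stand-in for an inference engine (not vLLM's API)."""

    def __init__(self):
        self.active = set()  # request ids that still hold resources

    async def generate(self, request_id, fail_after=None):
        self.active.add(request_id)
        for step in range(3):
            if fail_after is not None and step == fail_after:
                # Simulates an error such as an over-long prompt.
                raise ValueError("prompt too long")
            await asyncio.sleep(0)
            yield f"token-{step}"

    def abort(self, request_id):
        self.active.discard(request_id)


async def stream(engine, request_id, is_disconnected, fail_after=None):
    """Collect outputs; always release the request, even on error."""
    outputs = []
    try:
        async for out in engine.generate(request_id, fail_after):
            if await is_disconnected():
                break
            outputs.append(out)
    finally:
        # Runs on normal completion, on client disconnect, and on exceptions.
        engine.abort(request_id)
    return outputs
```

Without the `finally` block, an exception raised inside the loop would skip both the disconnect check and the cleanup, leaving the request id in `active`.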

[Issue #320](https://github.com/vllm-project/vllm/issues/320)

This is not limited to inputs that are too long: when a request can't be executed and is added to the swap queue, newly arriving requests can't be executed either. I...