OpenLLM
How to stop a generation stream?
Is there any way to stop a generation stream on the model side when it's no longer needed? For example, when the client disconnects or presses stop.
You can pass a `stop` argument with the request to specify the token at which generation should stop.
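For example, a minimal sketch using the OpenAI-compatible endpoint that OpenLLM exposes (the base URL, port, and model name below are assumptions, not part of the original answer):

```python
from openai import OpenAI

# Assumed: an OpenLLM server running locally with an OpenAI-compatible API.
client = OpenAI(base_url="http://localhost:3000/v1", api_key="na")

resp = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",  # placeholder model name
    messages=[{"role": "user", "content": "Write a short poem."}],
    stop=["\n\n"],  # generation halts as soon as this sequence is produced
)
print(resp.choices[0].message.content)
```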
I'm talking about a situation where the stream is already generating and the client disconnects or presses the stop button.
I'm not sure I fully understand, but if the client disconnects, the request will be cancelled with the vLLM backend.
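To illustrate from the client side, here is a minimal sketch under the same assumptions as above: closing the stream drops the HTTP connection, which is what lets the server-side backend cancel the in-flight request.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/v1", api_key="na")

stream = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",  # placeholder model name
    messages=[{"role": "user", "content": "Tell me a very long story."}],
    stream=True,
)

for i, chunk in enumerate(stream):
    print(chunk.choices[0].delta.content or "", end="", flush=True)
    if i >= 20:         # simulate the user pressing "stop" mid-stream
        stream.close()  # closes the connection; the server can then abort
        break
```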