h2ogpt How to Stop Generating through Curl + Sockets

Open planetMatrix opened this issue 1 year ago • 3 comments

Hello H2OGPT Team,

Many thanks for this wonderful implementation!

As the title suggests, I need to find a way to stop generation via Sockets + Curl.

How exactly does the Stop button work in the UI?

Apologies if this is a silly question.

Jan 15 '24 12:01 planetMatrix

@pseudotensor please guide

Jan 16 '24 03:01 planetMatrix

Hi, the Stop button in the UI works by stopping the async call, but the thread doing the generation still continues. In general, an actual stop is challenging. It is not possible for vLLM, OpenAI, or other platforms either. The same goes for GGUF, AWQ, etc.: there is no way to stop them.
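To illustrate the behavior described above, here is a minimal, standard-library-only sketch. It is not h2oGPT's code: `blocking_generate` is a hypothetical stand-in for a blocking model call, and the events exist only to make the demo deterministic. It shows that cancelling the awaiting task ends the async call while the worker thread still runs to completion.

```python
import asyncio
import threading


def blocking_generate(out, started, release, done, n_tokens=5):
    """Hypothetical stand-in for a blocking generation call."""
    started.set()            # generation has begun
    release.wait(timeout=5)  # simulate slow work until the demo lets us proceed
    for i in range(n_tokens):
        out.append(f"tok{i}")
    done.set()               # the thread finished all of its work


async def cancel_but_thread_continues():
    out = []
    started, release, done = (threading.Event() for _ in range(3))

    # The "async call" that a Stop button would cancel.
    task = asyncio.create_task(
        asyncio.to_thread(blocking_generate, out, started, release, done)
    )

    # Wait until the worker thread is actually running, then cancel the task.
    await asyncio.to_thread(started.wait, 5)
    task.cancel()
    try:
        await task
        cancelled = False
    except asyncio.CancelledError:
        cancelled = True

    # The caller has stopped waiting, but the thread is still alive:
    # once released, it produces every token anyway.
    release.set()
    await asyncio.to_thread(done.wait, 5)
    return cancelled, out


cancelled, tokens = asyncio.run(cancel_but_thread_continues())
```

In this sketch the caller sees the cancellation immediately, yet `tokens` still fills up afterwards, which is why cancelling the call alone does not free the compute.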

For torch itself, I can hack something like this: https://github.com/h2oai/h2ogpt/issues/298 . But it's not general.

However, I could potentially apply it to the streaming case, when streaming is used, and that would work for all (or most) model types.
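One way to picture why the streaming case is more tractable: with streaming, control returns to the server between chunks, so a stop flag can be checked there regardless of the backend. The sketch below is an assumption about the general idea, not h2oGPT's implementation; `stream_tokens` and the token list are hypothetical.

```python
import threading
from typing import Iterable, Iterator


def stream_tokens(tokens: Iterable[str], stop_event: threading.Event) -> Iterator[str]:
    """Yield tokens one at a time, checking a stop flag between tokens."""
    for tok in tokens:
        if stop_event.is_set():
            # Stop was requested: end the stream early.
            break
        yield tok


stop = threading.Event()
received = []
for tok in stream_tokens(["a", "b", "c", "d", "e"], stop):
    received.append(tok)
    if len(received) == 2:
        # Simulates a client asking to stop after two tokens.
        stop.set()
```

Note that this only stops forwarding output at a chunk boundary. Whether the underlying model also stops computing depends on the backend.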

Jan 16 '24 04:01 pseudotensor

> However, I could potentially apply it to the streaming case, when streaming is used, and that would work for all (or most) model types.

That would be wonderful and very helpful!

Where can I find documentation for the CLI arguments? I'd like to know what --early_stopping does.

Jan 17 '24 16:01 planetMatrix
