How to Stop Generating through Curl + Sockets
Hello H2OGPT Team,
Many thanks for this wonderful implementation!
As the title suggests, I need to find a way to stop generation via sockets + curl.
How exactly does the Stop button work in the UI?
Apologies if this is a silly question.
@pseudotensor please guide
Hi, the Stop button in the UI works by cancelling the async call, but the thread doing the generation still continues. A true stop is challenging in general. It's not possible for vLLM, OpenAI, or other platforms either, and the same goes for GGUF, AWQ, etc. There is no way to stop them.
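In rough terms, the pattern looks like this (a sketch of the general behavior, not h2ogpt's actual code): cancelling the asyncio side only abandons the await; the thread running the blocking generation keeps going until it finishes on its own.

```python
import asyncio
import time

def generate_blocking():
    # stands in for a long-running model.generate() call
    time.sleep(5)
    print("generation thread finished anyway")
    return "generated text"

async def handle_request():
    loop = asyncio.get_running_loop()
    fut = loop.run_in_executor(None, generate_blocking)
    await asyncio.sleep(1)
    fut.cancel()  # the UI "Stop": the caller stops awaiting the result...
    print("async call cancelled")
    # ...but the executor thread is still inside generate_blocking()

asyncio.run(handle_request())
```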
For torch itself, I can hack something like this: https://github.com/h2oai/h2ogpt/issues/298, but it's not general.
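For reference, a torch-level stop hook could look roughly like the sketch below (an assumed approach, not necessarily what issue #298 implements): a custom `StoppingCriteria` that watches a `threading.Event`, so a socket/HTTP handler can flip the event to abort `generate()` mid-run.

```python
import threading

import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class EventStoppingCriteria(StoppingCriteria):
    """Return True (stop) as soon as the shared event is set."""
    def __init__(self, stop_event: threading.Event):
        self.stop_event = stop_event

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        return self.stop_event.is_set()

def generate_with_stop(model, inputs, stop_event: threading.Event):
    # hypothetical helper: pass the event-backed criteria into generate()
    criteria = StoppingCriteriaList([EventStoppingCriteria(stop_event)])
    return model.generate(**inputs, max_new_tokens=512, stopping_criteria=criteria)
```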
However, I could potentially apply something like that for the streaming case, if streaming is used, and make it work for all (or most) model types.
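For the streaming case, one way this could look (a sketch assuming Hugging Face transformers' `TextIteratorStreamer`; names like `stop_event` are illustrative, not h2ogpt's API):

```python
import threading

from transformers import TextIteratorStreamer

def stream_response(model, tokenizer, inputs, stop_event: threading.Event):
    # run generate() in a background thread and consume tokens as they arrive
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
    kwargs = dict(**inputs, streamer=streamer, max_new_tokens=512)
    threading.Thread(target=model.generate, kwargs=kwargs).start()
    for text in streamer:
        if stop_event.is_set():
            break          # stop forwarding output to the client
        yield text
```

Note the `break` only stops forwarding tokens to the client; the underlying `generate()` thread still runs to completion unless it is also given a stopping criterion like the one above.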
That would be wonderful and very helpful!
Where can I find documentation of the CLI arguments? I'm interested to know what --early_stopping does.