How could we add streaming support to improve the output experience?

Open Zephyruswind opened this issue 2 years ago • 8 comments

How could we add streaming support to improve the output experience?

Zephyruswind avatar Oct 01 '23 03:10 Zephyruswind

I completely agree. The user experience is not satisfactory when the response is returned all at once. Would it be possible to support a streaming approach where we can see tokens as they are produced? It would greatly enhance the experience. The native Llama 2 seems to support this, but I'm not sure how localGPT, which runs over the user's own documents, would implement it.

wangchao-sh avatar Oct 06 '23 08:10 wangchao-sh

It's easy: pass callback_manager as a parameter to LlamaCpp (not to RetrievalQA) and set the LLM to verbose.

See: https://github.com/Rufus31415/local-documents-gpt/blob/1f48f6884ae1d1fe474d6aa8e44b83480dfca30b/chat.py#L74C7-L74C7

from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import LlamaCpp

kwargs = {
    # ... your other LlamaCpp arguments (model_path, n_ctx, ...) ...
    "verbose": True,
}
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
llm = LlamaCpp(callback_manager=callback_manager, **kwargs)
llm.streaming = True

Rufus31415 avatar Nov 07 '23 09:11 Rufus31415
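To make the "not in RetrievalQA" point concrete, here is a minimal sketch of the streaming LlamaCpp instance wired into a RetrievalQA chain; the retriever is illustrative (any LangChain retriever works), not localGPT's actual setup:

from langchain.chains import RetrievalQA

# The streaming callbacks live on the LLM itself; the chain needs nothing extra.
qa = RetrievalQA.from_chain_type(
    llm=llm,              # the streaming LlamaCpp instance built above
    chain_type="stuff",
    retriever=retriever,  # e.g. a Chroma vector store's .as_retriever()
)

# Tokens are printed to stdout as they are generated; the full answer is
# still returned once generation finishes.
result = qa("What does the document say about streaming?")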

Would it be possible with Mistral? @Rufus31415

DrekoDev avatar Jan 15 '24 18:01 DrekoDev

Would it be possible with Mistral? @Rufus31415

Yes, try this: https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF

Rufus31415 avatar Jan 22 '24 06:01 Rufus31415
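For anyone trying this end to end, here is a minimal sketch of loading one of those Mistral GGUF files into the same streaming setup; the quantization filename below is an assumption, so use whichever file you actually download from the repo:

from huggingface_hub import hf_hub_download
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import LlamaCpp

# Fetch a quantized Mistral GGUF file (filename assumed; check the repo listing).
model_path = hf_hub_download(
    repo_id="TheBloke/Mistral-7B-v0.1-GGUF",
    filename="mistral-7b-v0.1.Q4_K_M.gguf",
)

llm = LlamaCpp(
    model_path=model_path,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    streaming=True,
    verbose=True,
)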

@Rufus31415, will this also work with Streamlit?

Satyam7166-tech avatar Jan 23 '24 06:01 Satyam7166-tech
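For the Streamlit case, the usual pattern is a custom callback handler that appends each token to a placeholder as it arrives; a minimal sketch, not localGPT code:

import streamlit as st
from langchain.callbacks.base import BaseCallbackHandler

class StreamlitStreamHandler(BaseCallbackHandler):
    # Writes every generated token into a Streamlit placeholder as it arrives.
    def __init__(self, container):
        self.container = container
        self.text = ""

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        self.text += token
        self.container.markdown(self.text)

# Usage: pass this handler instead of StreamingStdOutCallbackHandler, e.g.
# placeholder = st.empty()
# llm = LlamaCpp(callback_manager=CallbackManager([StreamlitStreamHandler(placeholder)]), **kwargs)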

Would it be possible with Mistral? @Rufus31415

Yes, try this: https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF

Hello, when I run chat.py I run into a problem like this: [screenshot attached, not reproduced here]

What can I do?

Suiji12 avatar Jun 11 '24 15:06 Suiji12

@Rufus31415, sorry to revive this old thread, but I was wondering the exact same thing: how to implement streaming. I looked at the code snippet you provided and the current source code, but I don't see where this would go. Can you provide any further details?

sorhtyre avatar Jul 02 '24 15:07 sorhtyre
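For anyone with the same "where does this go" question: the snippet belongs wherever the LlamaCpp object is constructed, before any chain wraps it. A rough sketch using a hypothetical load_model() helper (the actual function and argument names in localGPT may differ):

from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import LlamaCpp

def load_model(model_path: str) -> LlamaCpp:
    # Attach the streaming handler at construction time, not on the
    # RetrievalQA chain that wraps the model later.
    return LlamaCpp(
        model_path=model_path,
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
        streaming=True,
        verbose=True,
    )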

Sorry, I don't develop this repo; please contact the localGPT developer directly. In my previous comment I just said that I had played with LlamaCpp and explained how to simply stream to standard output.

Rufus31415 avatar Jul 03 '24 13:07 Rufus31415