chat-langchain
Can't display LangChain streaming response in Streamlit (python app framework)
I'm trying to display the streaming output from the ChatGPT API in Streamlit (a Python web app framework). The goal is to make it feel like the bot is typing back to us. However, langchain.chat_models.ChatOpenAI does not return a generator that we can consume in Streamlit. Is there any way I can achieve this effect in Streamlit using LangChain?
This is what I tried with LangChain. The call llm_ChatOpenAI(messages) in the snippet below produces the streaming effect in the background (the Python console), but not in the Streamlit UI.
```python
import streamlit as st
from langchain.chat_models import ChatOpenAI
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.schema import HumanMessage

OPENAI_API_KEY = 'XXX'
model_name = "gpt-4-0314"
user_text = "Tell me about Seattle in 10 words."

llm_ChatOpenAI = ChatOpenAI(
    streaming=True,
    verbose=True,
    temperature=0.0,
    model=model_name,
    openai_api_key=OPENAI_API_KEY,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)

messages = [HumanMessage(content=user_text)]

# [Doesn't work] Looping over the response for "streaming effect"
for resp in llm_ChatOpenAI(messages):
    st.write(resp)
```
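The reason this fails: in these LangChain versions, calling the chat model blocks until generation finishes and returns a single AIMessage; the tokens only stream through the callback handler (here, to stdout). A quick check, reusing the objects above:

```python
# The call blocks until the full response is ready and returns one
# message object, not a token generator.
resp = llm_ChatOpenAI(messages)
print(type(resp))    # roughly: <class 'langchain.schema.AIMessage'>
print(resp.content)  # the complete text, all at once
```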
Since the code above does not work, I have to fall back to plain vanilla openai, whose streaming response is a generator, without the LangChain wrapper. The code below displays the streaming effect in Streamlit:
```python
import openai

openai.api_key = OPENAI_API_KEY

llm_direct = openai.ChatCompletion.create(
    model=model_name,
    messages=[{"role": "user", "content": user_text}],
    temperature=0.0,
    max_tokens=50,
    stream=True,
)

tokens = []
# Works well -- looping over the response for the "streaming effect"
for resp in llm_direct:
    if resp.get("choices") and resp["choices"][0].get("delta") and resp["choices"][0]["delta"].get("content"):
        tokens.append(resp["choices"][0]["delta"]["content"])
        result = "".join(tokens)
        st.write(result)
```
Below is the full Streamlit code that you can run and experiment with:
- save the file below as `main.py`
- install dependencies: `pip install openai streamlit langchain`
- run the app: `streamlit run main.py`
```python
# main.py
OPENAI_API_KEY = 'YOUR API KEY'
user_text = "Tell me about Seattle in 10 words."

def render():
    import streamlit as st

    st.set_page_config(layout="wide")
    st.write("Welcome to GPT Applications!")

    model_name = st.radio("Choose a model", ["text-davinci-003", "gpt-4-0314", "gpt-3.5-turbo"])
    api_choice = st.radio("Choose an API", ["openai.ChatCompletion", "langchain.llms.OpenAI", "langchain.chat_models.ChatOpenAI"])
    res_box = st.empty()

    if st.button("Run", type='primary'):
        if api_choice == "openai.ChatCompletion":
            import openai

            openai.api_key = OPENAI_API_KEY
            llm_direct = openai.ChatCompletion.create(
                model=model_name,
                messages=[{"role": "user", "content": user_text}],
                temperature=0.0,
                max_tokens=50,
                stream=True,
            )

            tokens = []
            # Works well -- looping over the response for the "streaming effect"
            for resp in llm_direct:
                if resp.get("choices") and resp["choices"][0].get("delta") and resp["choices"][0]["delta"].get("content"):
                    tokens.append(resp["choices"][0]["delta"]["content"])
                    result = "".join(tokens)
                    res_box.write(result)

        elif api_choice == "langchain.llms.OpenAI":
            from langchain.llms import OpenAI

            llm_OpenAI = OpenAI(
                streaming=True,
                verbose=True,
                temperature=0.0,
                model_name=model_name,
                openai_api_key=OPENAI_API_KEY,
            )

            # Returns a generator. However, it doesn't work with model_name='gpt-4'
            response = llm_OpenAI.stream(user_text)
            tokens = []
            for resp in response:
                tokens.append(resp["choices"][0]["text"])
                result = "".join(tokens)
                res_box.write(result)

        elif api_choice == "langchain.chat_models.ChatOpenAI":
            from langchain.chat_models import ChatOpenAI
            from langchain.callbacks.base import CallbackManager
            from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
            from langchain.schema import HumanMessage

            llm_ChatOpenAI = ChatOpenAI(
                streaming=True,
                verbose=True,
                temperature=0.0,
                model=model_name,
                openai_api_key=OPENAI_API_KEY,
                callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
            )
            messages = [HumanMessage(content=user_text)]

            # [Doesn't work] Looping over the response for "streaming effect"
            for resp in llm_ChatOpenAI(messages):
                res_box.write(resp)

if __name__ == '__main__':
    render()
```
I solved it!
```python
from langchain.callbacks.base import BaseCallbackHandler
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
import streamlit as st

class StreamHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # "/" is a marker to show the difference; you don't need it
        self.text += token + "/"
        self.container.markdown(self.text)

query = st.text_input("input your query", value="Tell me a joke")
ask_button = st.button("ask")

st.markdown("### streaming box")
# Here is the key: set up an empty container first
chat_box = st.empty()
stream_handler = StreamHandler(chat_box)
chat = ChatOpenAI(max_tokens=25, streaming=True, callbacks=[stream_handler])

st.markdown("### together box")
if query and ask_button:
    response = chat([HumanMessage(content=query)])
    llm_response = response.content
    st.markdown(llm_response)
```
Make a custom handler, pass a Streamlit container to it, and write markdown inside that container.
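A small variation on the same idea (a sketch, assuming the same LangChain version): pass the handler per call instead of in the constructor, so each run gets a fresh container and an empty text buffer:

```python
# Variation (assumption: same LangChain version as above): a fresh
# handler and container per run, passed via the per-call callbacks kwarg.
chat = ChatOpenAI(max_tokens=25, streaming=True)

if query and ask_button:
    handler = StreamHandler(st.empty())
    response = chat([HumanMessage(content=query)], callbacks=[handler])
    st.markdown(response.content)
```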
Could you make this work in streamlit-chat?
@schwarbf
This is still a bit difficult for me; I am still learning. I am more used to this display layout:
- a human input box
- a box for the AI's current response
- all chat logs in chronological order (a sketch of this layout follows below)

You can see the demo: https://advisors-alliance.streamlit.app/ (It is still only in Chinese; I will add support for other languages later. After all, it is not difficult to support multiple languages with the help of ChatGPT.)
BTW, streaming text-to-speech is also supported now: https://gist.github.com/goldengrape/84ce3624fd5be8bc14f9117c3e6ef81a It's a real pain to have to automate the voice on Streamlit Cloud.
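A minimal sketch of that three-box layout (assumptions: streamlit >= 1.24 for st.chat_message / st.chat_input, and the StreamHandler class from above, minus the "/" marker):

```python
import streamlit as st
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

# StreamHandler is the callback class defined earlier in this thread.

if "history" not in st.session_state:
    st.session_state.history = []  # list of (role, text) tuples

# All chat logs in chronological order
for role, text in st.session_state.history:
    with st.chat_message(role):
        st.markdown(text)

# Human input box
if prompt := st.chat_input("Say something"):
    with st.chat_message("user"):
        st.markdown(prompt)
    # The AI's current response, streamed token by token into an empty container
    with st.chat_message("assistant"):
        handler = StreamHandler(st.empty())
        llm = ChatOpenAI(streaming=True, callbacks=[handler])
        response = llm([HumanMessage(content=prompt)])
    st.session_state.history.append(("user", prompt))
    st.session_state.history.append(("assistant", response.content))
```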
This came up on Twitter; here's an example using the new Streamlit native chat UI with the custom handler provided above:
https://langchain-streaming-example.streamlit.app/
https://github.com/langchain-ai/streamlit-agent/blob/main/streamlit_agent/basic_streaming.py
We just shipped a native integration for LangChain agents. It won't yet support this basic use case, but we will look at how to add it. Hope that helps! https://python.langchain.com/docs/modules/callbacks/integrations/streamlit
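From the linked page, the wiring looks roughly like this (a sketch; `agent` is assumed to be built elsewhere, e.g. with initialize_agent):

```python
import streamlit as st
from langchain.callbacks import StreamlitCallbackHandler

# Assumption: `agent` is constructed elsewhere, e.g. via initialize_agent(...)
if prompt := st.chat_input():
    # Renders the agent's thoughts and tool calls into the given container
    st_callback = StreamlitCallbackHandler(st.container())
    response = agent.run(prompt, callbacks=[st_callback])
    st.write(response)
```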
@sfc-gh-jcarroll Great additions. It would be awesome if RetrievalQA and ConversationalRetrievalChain became compatible as well.
Additional examples have been added to https://github.com/langchain-ai/streamlit-agent, including a Retrieval chain (document upload), a SQL agent, and a Pandas agent. Enjoy!
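For anyone wiring this up by hand, one way to stream a RetrievalQA answer with the StreamHandler from earlier in this thread (a sketch, not the linked repo's code; `retriever` is assumed to come from an existing vector store):

```python
import streamlit as st
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

# StreamHandler is the callback class defined earlier in this thread;
# assumption: `retriever` comes from an existing vector store, e.g. FAISS.
handler = StreamHandler(st.empty())
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(streaming=True, callbacks=[handler]),
    retriever=retriever,
)
qa.run("Summarize the uploaded document in one sentence.")
```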
@sfc-gh-jcarroll God bless you guys! I'd been cracking my head for over two hours trying to stream over Streamlit; I used the retrieval chain example and it solved the issue!
Hey all,
Wanted to share that there's a new command, `st.write_stream`, out in the latest 1.31.0 release to conveniently handle generators and streamed responses for your chat apps. Check it out in the docs. 🤩
Thanks again for sharing your input to help shape the roadmap and make Streamlit better! Feel free to let others know about this new update, and if you build something cool with it, let us know on the Forum!
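For example, a minimal sketch pairing st.write_stream with the current OpenAI client (assumptions: openai >= 1.0 and streamlit >= 1.31):

```python
import streamlit as st
from openai import OpenAI

client = OpenAI(api_key="YOUR API KEY")

if prompt := st.chat_input("Ask something"):
    with st.chat_message("user"):
        st.write(prompt)
    stream = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    with st.chat_message("assistant"):
        # st.write_stream consumes the generator and renders tokens as they arrive;
        # delta.content can be None on some chunks, hence the `or ""`.
        st.write_stream(chunk.choices[0].delta.content or "" for chunk in stream)
```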