langchain ConversationRetrievalChain with memory

Hello!

I am building an AI assistant with LangChain's ConversationalRetrievalChain. I built a FastAPI endpoint where users can ask the AI questions, and I store the previous messages in my DB. My code:

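    # Imports (langchain 0.0.x module paths, current when this was posted)
    from langchain.chains import ConversationalRetrievalChain
    from langchain.chat_models import ChatOpenAI
    from langchain.document_loaders import DirectoryLoader
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.text_splitter import CharacterTextSplitter
    from langchain.vectorstores import FAISS

    def create_chat_agent():
        llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
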
        # Data Ingestion
        word_loader = DirectoryLoader(DOCUMENTS_DIRECTORY, glob="*.docx")
        documents = []
        documents.extend(word_loader.load())
        # Chunk and Embeddings
        text_splitter = CharacterTextSplitter(chunk_size=800, chunk_overlap=0)
        documents = text_splitter.split_documents(documents)
        embeddings = OpenAIEmbeddings()
        vectorstore = FAISS.from_documents(documents, embeddings)

        # Initialise Langchain - Conversation Retrieval Chain
        return ConversationalRetrievalChain.from_llm(llm, vectorstore.as_retriever())


    def askAI(cls, prompt: str, id: str):
        qa = cls.create_chat_agent()

        chat_history = []
        previousMessages = UserController.get_previous_messages_by_user_id(id)
        for message in previousMessages:
            messageObject = (message['user'], message['ai'])
            chat_history.append(messageObject)

        response = qa({"question": prompt, "chat_history": chat_history})
        cls.update_previous_messages(userId=id, prompt=prompt, response=response["answer"])

        return response

I always get back an answer, and most of the time it is very specific. However, sometimes it answers the wrong question: one I asked a few prompts earlier rather than the current one. I don't know what is wrong here. Can somebody help me? Thank you in advance!

Apr 26 '23 09:04 pelyhe

Well, how are the messages in your DB ordered? Given the way you build the chat history, you might be injecting it in reverse order. Have you inspected the chat history while debugging?
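
One way to rule out an ordering problem is to sort the stored messages explicitly before building `chat_history`, and to print the result while debugging. This is only a minimal sketch: it assumes each stored message also carries a `created_at` value, which is a hypothetical field (the original post only shows the `user` and `ai` keys), and `build_chat_history` is an illustrative helper name.

```python
from typing import Dict, List, Tuple


def build_chat_history(messages: List[Dict]) -> List[Tuple[str, str]]:
    """Return (user, ai) pairs ordered oldest first.

    `created_at` is an assumed field name; use whatever column your DB
    actually orders messages by.
    """
    ordered = sorted(messages, key=lambda m: m["created_at"])
    return [(m["user"], m["ai"]) for m in ordered]


if __name__ == "__main__":
    # Rows as they might come back from the DB, newest first.
    rows = [
        {"user": "second question", "ai": "second answer", "created_at": 2},
        {"user": "first question", "ai": "first answer", "created_at": 1},
    ]
    # Print each turn to check the order before passing it to the chain.
    for turn in build_chat_history(rows):
        print(turn)
```

The returned list has the same shape as the `chat_history` built in the original `askAI`, so it could be passed to the chain unchanged.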

Additionally, you should probably initialize the vectorstore in a separate function (unless the documents differ per user); as I read your code, you regenerate the embeddings for every single user question.

Apr 26 '23 20:04 jphme

@pelyhe did you solve this? I am also using this chain for document QA, and sometimes if I give it a nonsense input, e.g. 'hhhhjj', the response is the answer to the previous question.

Jun 21 '23 07:06 jasan-s

Hi, @pelyhe! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, the issue you reported concerns the ConversationalRetrievalChain in the provided code: the assistant sometimes answers a question asked a few prompts earlier instead of the current one. jphme suggested checking the order of messages in the database and initializing the vectorstore in a separate function, and jasan-s reported similar behavior with nonsense input.

Before we proceed, we would like to confirm if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your contribution to the LangChain repository!

Sep 20 '23 16:09 dosubot[bot]
