
ConversationalRetrievalChain doesn't work with memory

Open DhavalThkkar opened this issue 1 year ago • 5 comments

System Info

langchain version==0.0.169
python=3.10.10
platform=dev_containers

The code given below is unable to utilise memory when answering questions with references.

Who can help?

@hwchase17 @agola11

Information

  • [ ] The official example notebooks/scripts
  • [X] My own modified scripts

Related Components

  • [X] LLMs/Chat Models
  • [ ] Embedding Models
  • [ ] Prompts / Prompt Templates / Prompt Selectors
  • [ ] Output Parsers
  • [ ] Document Loaders
  • [X] Vector Stores / Retrievers
  • [X] Memory
  • [ ] Agents / Agent Executors
  • [ ] Tools / Toolkits
  • [X] Chains
  • [ ] Callbacks/Tracing
  • [ ] Async

Reproduction

Use the following code with the necessary changes on your end to replicate:

from dotenv import load_dotenv, find_dotenv
from qdrant_client import QdrantClient
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.vectorstores import Qdrant
from langchain.embeddings import OpenAIEmbeddings
from langchain.memory import ConversationBufferMemory, RedisChatMessageHistory

import os
from loguru import logger

import redis

# Load environment variables from .env file
load_dotenv(find_dotenv("../app/.env"))
url = os.environ.get("QDRANT_URL")
collection_name = os.environ.get("QDRANT_COLLECTION_NAME")
openai_api_key = os.environ.get("OPENAI_API_KEY")
redis_host = os.environ.get("REDIS_HOST")
redis_port = os.environ.get("REDIS_PORT")

# Initialize Qdrant client and vector database
if url is not None and collection_name is not None:
    client = QdrantClient(url=url, prefer_grpc=True)
    embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
    vectordb = Qdrant(client, collection_name, embeddings.embed_query)
else:
    logger.error("Qdrant URL or Collection Name not set in environment variables")

# Initialize the LLM
if openai_api_key is not None:
    llm = ChatOpenAI(openai_api_key=openai_api_key, temperature=0, model_name="gpt-3.5-turbo")
else:
    logger.error("OpenAI API key not set in environment variables")

memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer"
)

def get_chat_history(inputs) -> str:
    # With return_messages=True the memory yields BaseMessage objects rather
    # than dicts, so handle both shapes when flattening the history to a string
    res = []
    for message in inputs:
        if isinstance(message, dict) and "content" in message:
            res.append(message["content"])
        else:
            res.append(message.content)
    return "\n".join(res)

from langchain.prompts import PromptTemplate

template = """Answer the question in your own words as truthfully as possible from the context given to you.
If you do not know the answer to the question, simply respond with "I don't know. Can you ask another question".
If questions are asked where there is no relevant context available, simply respond with "I don't know. Please ask a question relevant to the documents"
Context: {context}


{chat_history}
Human: {question}
Assistant:"""

prompt = PromptTemplate(
    input_variables=["context", "chat_history", "question"], template=template
)

# Create the custom chain
if llm is not None and vectordb is not None:
    chain = ConversationalRetrievalChain.from_llm(
        llm=llm, retriever=vectordb.as_retriever(), memory=memory,
        get_chat_history=get_chat_history, return_source_documents=True,
        combine_docs_chain_kwargs={'prompt': prompt})
else:
    logger.error("LLM or Vector Database not initialized")

# Initialize Redis connection
if redis_host is not None and redis_port is not None:
    redis_client = redis.Redis(host=redis_host, port=int(redis_port))
else:
    logger.error("Redis host or port not set in environment variables")

session_id = "sample_id"
# RedisChatMessageHistory is keyed by session_id and persists its messages in
# Redis, so constructing it is enough to pick up any existing history
chat_history = RedisChatMessageHistory(session_id, url=f"redis://{redis_host}:{redis_port}")

# Query the chain; the follow-up question should be answered using the history
chain({"question": "Who is Harry Potter?", "chat_history": chat_history.messages})
chain({"question": "What are his qualities?", "chat_history": chat_history.messages})

Expected behavior

"What are his qualities?" should return Harry Potter's qualities, not "I don't know. Please ask a question relevant to the documents."

DhavalThkkar avatar May 15 '23 11:05 DhavalThkkar

Hi @DhavalThkkar, I'm facing the same issue. Did you find any solution?

ali-faiz-brainx avatar Jun 06 '23 06:06 ali-faiz-brainx

Same here

zigax1 avatar Jun 09 '23 22:06 zigax1

I found a solution regarding the chat history.

You need to pass the chat_memory field to ConversationBufferMemory before passing it to any chain.

Note: I have used MongoDB; please see the code below.

import os
from langchain.chains import load_qa_with_sources_chain
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory, MongoDBChatMessageHistory

# session_id and prompt come from the surrounding application
mongo_history = MongoDBChatMessageHistory(
    connection_string=os.getenv("MONGO_URI"),
    session_id=session_id,
)

memory = ConversationBufferMemory(memory_key="chat_history", chat_memory=mongo_history, input_key="question")

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff", prompt=prompt, memory=memory)

I have used load_qa_with_sources_chain, but I believe it'll also work with other chains (see the sketch below).

PS: You don't need to manage get_chat_history yourself to retrieve the history messages. ConversationBufferMemory will do it for you.
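
A minimal sketch of the same idea applied to the ConversationalRetrievalChain from the original issue, assuming the llm, vectordb, and mongo_history objects defined earlier; any chat-message-history backend should work the same way:

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# Wire the persistent history into the memory object; the chain then loads
# and saves chat_history itself, so no get_chat_history callable is needed
memory = ConversationBufferMemory(
    memory_key="chat_history",
    chat_memory=mongo_history,
    return_messages=True,
    output_key="answer",
)

chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectordb.as_retriever(),
    memory=memory,
    return_source_documents=True,
)

# History is read from and written back to MongoDB automatically
chain({"question": "Who is Harry Potter?"})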

ali-faiz-brainx avatar Jun 12 '23 06:06 ali-faiz-brainx

@ali-faiz-brainx so you store the chat history for a specific collection/index in the MongoDB database and load it into memory before querying the LLM. Am I right?

zigax1 avatar Jun 15 '23 22:06 zigax1

@zigax1 Yes, right. I'm just passing chat_memory to ConversationBufferMemory and LangChain handles it for you. But if you want to do it yourself, you can also pass a get_chat_history method to the chain.

ali-faiz-brainx avatar Jun 16 '23 04:06 ali-faiz-brainx

@ali-faiz-brainx In what format are you passing chat_memory? Can you give me a minimal reproducible example?

zigax1 avatar Jun 19 '23 17:06 zigax1

Sure, here is the code for this:

import os
from langchain.chains import load_qa_with_sources_chain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationSummaryMemory, MongoDBChatMessageHistory
from langchain.prompts import PromptTemplate

# session_id, template, docs, and query come from the surrounding application
mongo_history = MongoDBChatMessageHistory(
    connection_string="mongodb+srv://<username>:<password>@cluster0.eq70b.mongodb.net",
    session_id=session_id,
)

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")

memory = ConversationSummaryMemory(
    llm=llm,
    memory_key="chat_history",
    chat_memory=mongo_history,
    input_key="question",
    # max_token_limit=5
)

prompt = PromptTemplate(template=template, input_variables=["chat_history", "question", "summaries"])

chain = load_qa_with_sources_chain(
    llm=llm,
    chain_type="stuff",
    prompt=prompt,
    memory=memory,
)

chain({"input_documents": docs, "question": query}, return_only_outputs=True)

This will create a message_store collection under the chat_history database in your MongoDB instance.
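
A hypothetical inspection snippet, assuming LangChain's defaults (database "chat_history", collection "message_store") and the same connection string, to see what actually gets stored:

from pymongo import MongoClient

client = MongoClient("mongodb+srv://<username>:<password>@cluster0.eq70b.mongodb.net")
# Each stored turn is one document holding the session id and a JSON-encoded message
for doc in client["chat_history"]["message_store"].find({"SessionId": session_id}):
    print(doc["History"])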

ali-faiz-brainx avatar Jun 20 '23 07:06 ali-faiz-brainx


Is there any way to control mongo_history's token limit? I set max_token_limit in the memory, but it keeps showing a "model's maximum context length..." error when my history grows.

ximnet-huisheng avatar Jun 20 '23 09:06 ximnet-huisheng

@ximnet-huisheng I'm also stuck on the same issue. It accepts the max_token_limit attribute, but I think the attribute has no effect.
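
A possible workaround, not confirmed in this thread: ConversationSummaryBufferMemory does enforce max_token_limit, summarizing the oldest turns once the buffer exceeds it. A sketch assuming the llm and mongo_history objects from the example above:

from langchain.memory import ConversationSummaryBufferMemory

memory = ConversationSummaryBufferMemory(
    llm=llm,
    memory_key="chat_history",
    chat_memory=mongo_history,
    input_key="question",
    max_token_limit=1000,  # turns beyond this token budget are folded into a summary
)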

ali-faiz-brainx avatar Jun 21 '23 06:06 ali-faiz-brainx

Hi, @DhavalThkkar! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue you reported was about the ConversationalRetrievalChain not utilizing memory for answering questions with references. It seems that ali-faiz-brainx and zigax1 also faced the same issue. However, ali-faiz-brainx found a solution by passing the chat_memory field to ConversationBufferMemory before passing it to any chain. They even provided a code example using MongoDB. There was also a discussion about controlling the token limit in mongo_history.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain repository!

dosubot[bot] avatar Sep 21 '23 16:09 dosubot[bot]