ConversationalRetrievalChain doesn't work with memory
System Info
- langchain version: 0.0.169
- python: 3.10.10
- platform: dev_containers
The code given below is not able to utilise memory for answering questions with references
Who can help?
@hwchase17 @agola11
Information
- [ ] The official example notebooks/scripts
- [X] My own modified scripts
Related Components
- [X] LLMs/Chat Models
- [ ] Embedding Models
- [ ] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [ ] Document Loaders
- [X] Vector Stores / Retrievers
- [X] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [X] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
Use the following code with the necessary changes on your end to replicate:
```python
from dotenv import load_dotenv, find_dotenv
from qdrant_client import QdrantClient
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.vectorstores import Qdrant
from langchain.embeddings import OpenAIEmbeddings
from langchain.memory import ConversationBufferMemory, RedisChatMessageHistory
from langchain.prompts import PromptTemplate
import os
from loguru import logger
import redis

# Load environment variables from .env file
load_dotenv(find_dotenv("../app/.env"))

url = os.environ.get("QDRANT_URL")
collection_name = os.environ.get("QDRANT_COLLECTION_NAME")
openai_api_key = os.environ.get("OPENAI_API_KEY")
redis_host = os.environ.get("REDIS_HOST")
redis_port = os.environ.get("REDIS_PORT")

# Initialize Qdrant client and vector database
if url is not None and collection_name is not None:
    client = QdrantClient(url=url, prefer_grpc=True)
    embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
    vectordb = Qdrant(client, collection_name, embeddings.embed_query)
else:
    logger.error("Qdrant URL or Collection Name not set in environment variables")

# Initialize the LLM
if openai_api_key is not None:
    llm = ChatOpenAI(openai_api_key=openai_api_key, temperature=0, model_name="gpt-3.5-turbo")
else:
    logger.error("OpenAI API key not set in environment variables")

memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer",
)

def get_chat_history(inputs) -> str:
    res = []
    for message in inputs:
        if isinstance(message, dict) and "content" in message:
            res.append(message["content"])
    return "\n".join(res)

template = """Answer the question in your own words as truthfully as possible from the context given to you.
If you do not know the answer to the question, simply respond with "I don't know. Can you ask another question".
If questions are asked where there is no relevant context available, simply respond with "I don't know. Please ask a question relevant to the documents"
Context: {context}
{chat_history}
Human: {question}
Assistant:"""

prompt = PromptTemplate(
    input_variables=["context", "chat_history", "question"], template=template
)

# Create the custom chain
if llm is not None and vectordb is not None:
    chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectordb.as_retriever(),
        memory=memory,
        get_chat_history=get_chat_history,
        return_source_documents=True,
        combine_docs_chain_kwargs={"prompt": prompt},
    )
else:
    logger.error("LLM or Vector Database not initialized")

# Initialize Redis connection
if redis_host is not None and redis_port is not None:
    redis_client = redis.Redis(host=redis_host, port=redis_port)
else:
    logger.error("Redis host or port not set in environment variables")

session_id = "sample_id"

# Retrieve chat history for session from Redis
chat_history = redis_client.get(session_id)
if chat_history is None:
    # If chat history does not exist, create a new one
    chat_history = RedisChatMessageHistory(session_id, url=f"redis://{redis_host}:{redis_port}")
else:
    # If chat history exists, deserialize it from Redis
    chat_history = RedisChatMessageHistory.deserialize(chat_history, url=f"redis://{redis_host}:{redis_port}")

# Retrieve answer from chain
chain({"question": "Who is Harry Potter?", "chat_history": chat_history.messages})
chain({"question": "What are his qualities?", "chat_history": chat_history.messages})
```
Expected behavior
"What are his qualities?" should return Harry Potter's qualities and not "I don't know. Please ask a question relevant to the documents."
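For reference, my reading of the chain's contract (an assumption on my part, not confirmed upstream) is that once `memory=` is attached, `ConversationalRetrievalChain` loads and saves `chat_history` itself, so the calls would pass only the question instead of also injecting `chat_history` from Redis. A minimal sketch against the reproduction above:

```python
# Sketch: with `memory=` attached, the chain supplies `chat_history` itself,
# so only the question is passed in.
result = chain({"question": "Who is Harry Potter?"})
result = chain({"question": "What are his qualities?"})

# The turns accumulated by the chain can be inspected on the memory object.
print(memory.chat_memory.messages)
```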
Hi @DhavalThkkar, I'm facing the same issue. Did you find any solution?
Same here
I found a solution regarding the chat history: you need to pass the `chat_memory` field to `ConversationBufferMemory` before passing it to any chain.
Note: I have used MongoDB; please see the code below.
```python
import os

from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory, MongoDBChatMessageHistory
from langchain.chains.qa_with_sources import load_qa_with_sources_chain

mongo_history = MongoDBChatMessageHistory(
    connection_string=os.getenv("MONGO_URI"),
    session_id=session_id,
)
memory = ConversationBufferMemory(memory_key="chat_history", chat_memory=mongo_history, input_key="question")
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff", prompt=prompt, memory=memory)
```
I have used `load_qa_with_sources_chain`, but I believe it'll also work with other chains.
PS: You don't need to manage `get_chat_history` yourself to assemble the history messages; `ConversationBufferMemory` will do it for you.
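Applying the same `chat_memory` idea back to the original reproduction, here is a minimal sketch (my own adaptation, untested in this thread) that wires the Redis-backed history into the memory before building the `ConversationalRetrievalChain`:

```python
from langchain.memory import ConversationBufferMemory, RedisChatMessageHistory

# Back the buffer memory with the Redis-persisted history so the chain
# reads earlier turns from Redis and appends new turns to the same store.
redis_history = RedisChatMessageHistory(session_id, url=f"redis://{redis_host}:{redis_port}")
memory = ConversationBufferMemory(
    memory_key="chat_history",
    chat_memory=redis_history,
    return_messages=True,
    output_key="answer",
)
```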
@ali-faiz-brainx so you store the chat memory for a specific collection/index in the MongoDB database, and load it into the memory before querying the LLM. Am I right?
@zigax1 Yes, right. I'm just passing the `chat_memory` in the `ConversationBufferMemory` and LangChain handles it for you. But if you want to format the history yourself, you can also pass a `get_chat_history` method to the chain.
@ali-faiz-brainx In what format are you passing the `chat_memory`? Can you give me a minimal reproducible example?
Sure, here is the code for this:
```python
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationSummaryMemory, MongoDBChatMessageHistory
from langchain.prompts import PromptTemplate
from langchain.chains.qa_with_sources import load_qa_with_sources_chain

mongo_history = MongoDBChatMessageHistory(
    connection_string="mongodb+srv://<username>:<password>@cluster0.eq70b.mongodb.net",
    session_id=session_id,
)

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")
memory = ConversationSummaryMemory(
    llm=llm,
    memory_key="chat_history",
    chat_memory=mongo_history,
    input_key="question",
    # max_token_limit=5
)

prompt = PromptTemplate(template=template, input_variables=["chat_history", "question", "summaries"])

chain = load_qa_with_sources_chain(
    llm=llm,
    chain_type="stuff",
    prompt=prompt,
    memory=memory,
)

chain({"input_documents": docs, "question": query}, return_only_outputs=True)
```
This will create a `message_store` collection under the `chat_history` database in your MongoDB instance.
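To sanity-check what was persisted, the history object can be read back; `.messages` returns the stored `HumanMessage`/`AIMessage` objects:

```python
# Print every turn persisted for this session in MongoDB.
for message in mongo_history.messages:
    print(f"{message.type}: {message.content}")
```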
Is there any way to control `mongo_history`'s token limit? I set `max_token_limit` in the memory, but it keeps showing a "model's maximum context length..." error when my history grows.
@ximnet-huisheng I'm also stuck on the same issue. The memory takes a `max_token_limit` attribute, but I think the attribute has no effect.
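One possible workaround (an assumption of mine, not something verified in this thread, and I have not tested it against a MongoDB-backed history): `ConversationSummaryBufferMemory` does act on `max_token_limit`, keeping recent messages verbatim and summarizing older ones once the buffer exceeds the limit:

```python
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationSummaryBufferMemory

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")

# ConversationSummaryBufferMemory prunes the buffer to max_token_limit,
# folding the pruned messages into a running summary.
memory = ConversationSummaryBufferMemory(
    llm=llm,
    memory_key="chat_history",
    chat_memory=mongo_history,  # the MongoDB-backed history from above
    input_key="question",
    max_token_limit=1000,
)
```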
Hi, @DhavalThkkar! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue you reported was about the `ConversationalRetrievalChain` not utilizing memory for answering questions with references. It seems that ali-faiz-brainx and zigax1 also faced the same issue. However, ali-faiz-brainx found a solution by passing the `chat_memory` field in `ConversationBufferMemory` before passing it to any chain. They even provided a code example using MongoDB. There was also a discussion about controlling the token limit in `mongo_history`.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your contribution to the LangChain repository!