How do i add memory to RetrievalQA.from_chain_type? or, how do I add a custom prompt to ConversationalRetrievalChain?
For the past 2 weeks I've been trying to make a chatbot that can chat over documents (so not just semantic search/QA, but with memory), and also with a custom prompt. I've tried every combination of all the chains, and so far the closest I've gotten is ConversationalRetrievalChain, but without custom prompts, and RetrievalQA.from_chain_type, but without memory.
^ I'm also struggling to tweak the custom prompt in CHAT_CONVERSATIONAL_REACT_DESCRIPTION
It's not exactly clear whether we can extend a class and implement our own, or override the prompts. Do drop us noobs some tips, thanks!
@ThatDevHuddy Maybe this can help you
from langchain.chains import ConversationalRetrievalChain
from langchain.prompts import PromptTemplate

template = """Answer the question in your own words as truthfully as possible from the context given to you.
If you do not know the answer to the question, simply respond with "I don't know. Can you ask another question".
If questions are asked where there is no relevant context available, simply respond with "I don't know. Please ask a question relevant to the documents"
Context: {context}
{chat_history}
Human: {question}
Assistant:"""

prompt = PromptTemplate(
    input_variables=["context", "chat_history", "question"], template=template
)

# Create the custom chain (llm, vectordb, memory, get_chat_history and logger are defined elsewhere)
if llm is not None and vectordb is not None:
    chain = ConversationalRetrievalChain.from_llm(
        llm=llm, retriever=vectordb.as_retriever(), memory=memory,
        get_chat_history=get_chat_history, return_source_documents=True,
        combine_docs_chain_kwargs={'prompt': prompt})
else:
    logger.error("LLM or Vector Database not initialized")
Note: `combine_docs_chain_kwargs={'prompt': prompt}` is the part of key importance. Let me know if it helps.
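For reference, the snippet above assumes that `memory` and `get_chat_history` are defined elsewhere; the original code does not show them, so the following is only an illustrative sketch:
# Illustrative sketch only - these objects are assumed, not shown, in the snippet above.
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    memory_key="chat_history",   # must match the {chat_history} variable in the prompt
    return_messages=True,
    output_key="answer",         # needed because return_source_documents=True adds extra output keys
)

def get_chat_history(history) -> str:
    # Optional formatter for the stored turns; LangChain falls back to a built-in default if omitted.
    return "\n".join(str(turn) for turn in history)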
🤖 AI-generated response by SteerCode (chat with the LangChain codebase). Disclaimer: SteerCode Chat may provide inaccurate information about the LangChain codebase.
To add a custom prompt to ConversationalRetrievalChain, you can pass a custom PromptTemplate to the from_llm method when creating the ConversationalRetrievalChain instance.
- Create a custom prompt template:
from langchain.prompts import PromptTemplate
custom_prompt = PromptTemplate(input_variables=["question", "chat_history"], template="{chat_history}\n{question}")
- Pass the custom prompt template when creating the ConversationalRetrievalChain instance:
conversational_chain = ConversationalRetrievalChain.from_llm(llm=my_language_model, retriever=my_retriever, condense_question_prompt=custom_prompt)
Now, you should have a conversational retrieval chain with memory and a custom prompt.
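A caveat worth adding here (my note, not part of the reply above): as far as I understand, condense_question_prompt only controls the step that rewrites the follow-up question into a standalone one, while the prompt used to answer over the retrieved context is configured separately via combine_docs_chain_kwargs. A sketch showing both knobs, reusing the assumed my_language_model and my_retriever from above:
from langchain.chains import ConversationalRetrievalChain
from langchain.prompts import PromptTemplate

# Rewrites the follow-up question into a standalone question.
condense_prompt = PromptTemplate.from_template(
    "Given the conversation:\n{chat_history}\n"
    "Rephrase the follow up question as a standalone question: {question}"
)

# Answers the (rewritten) question over the retrieved documents.
qa_prompt = PromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)

conversational_chain = ConversationalRetrievalChain.from_llm(
    llm=my_language_model,
    retriever=my_retriever,
    condense_question_prompt=condense_prompt,
    combine_docs_chain_kwargs={"prompt": qa_prompt},
)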
Same issue here
Also very curious!
I got my answer from this thread, hope it helps!
The code example I got working was the following:
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True, output_key='answer')
retriever = your_vector_store.as_retriever()

# Create the multipurpose chain
qachat = ConversationalRetrievalChain.from_llm(
    llm=llm,
    memory=memory,
    retriever=retriever,
    return_source_documents=True
)

qachat("Ask your question here...")
Hey guys, maybe this could give us a clue: in FlowiseAI you can add a system prompt and other things to a ConversationalQA chain.
Found something useful here:
https://python.langchain.com/docs/use_cases/question_answering/integrations/openai_functions_retrieval_qa
Scrolling down, they give an example of a custom prompt with ConversationalRetrievalChain.
I got it working for Python, but what about JavaScript? When I try to adapt this to JavaScript it doesn't work.
I am able to successfully implement the QA chain; however, ConversationalRetrievalQAChain fails to provide the conversation history.
@tmk1221 The js implementation is pretty well described here: https://js.langchain.com/docs/modules/chains/popular/chat_vector_db
Here is how I achieved memory with a custom prompt template for ConversationalRetrievalChain.
### build memory
import pprint as pp

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate

memory = ConversationBufferMemory(
    memory_key="chat_history",
    max_len=50,
    return_messages=True,
)

prompt_template = '''
You are a Bioinformatics expert with immense knowledge and experience in the field. Your name is Dr. Fanni.
Answer my questions based on your knowledge and our older conversation. Do not make up answers.
If you do not know the answer to a question, just say "I don't know".
Given the following conversation and a follow up question, answer the question.
{chat_history}
question: {question}
'''

PROMPT = PromptTemplate.from_template(
    template=prompt_template
)

# chat_model, retriever and q1 are assumed to be defined earlier
chain = ConversationalRetrievalChain.from_llm(
    chat_model,
    retriever,
    memory=memory,
    condense_question_prompt=PROMPT
)

pp.pprint(chain({'question': q1, 'chat_history': memory.chat_memory.messages}))
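A possible follow-up turn (my addition; `q2` is a hypothetical second question): because `memory=memory` is set, the chain supplies `chat_history` by itself, so only the new question needs to be passed:
# Hypothetical follow-up question; memory fills in chat_history automatically.
q2 = "Which of those algorithms is used for sequence alignment?"
pp.pprint(chain({'question': q2}))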
Inferred from the documentation.
Memory can be passed to the RetrievalQA chain just like to any other chain.
from langchain.chains import RetrievalQA
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    memory_key="chat_history",
    max_len=50,
    return_messages=True,
)

chain_type_kwargs = {'prompt': PROMPT}

chain = RetrievalQA.from_chain_type(
    llm=chat_model,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs=chain_type_kwargs,
    memory=memory
)
The chat history / stored memory can be viewed with `memory.chat_memory.messages`. Here, PROMPT is the custom prompt template.
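The post does not show PROMPT itself; with `chain_type="stuff"` the prompt normally receives `{context}` and `{question}`, so one possible (assumed) shape would be:
from langchain.prompts import PromptTemplate

# Assumed example prompt - not the one from the original post.
PROMPT = PromptTemplate(
    template=(
        "Use the context below to answer the question.\n"
        "Context: {context}\n"
        "Question: {question}\n"
        "Answer:"
    ),
    input_variables=["context", "question"],
)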
I implemented the chat history with RetrievalQAWithSourcesChain, following the approach below.
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate

prompt_template = '''
You are a Bioinformatics expert with immense knowledge and experience in the field.
Answer my questions based on your knowledge and our older conversation. Do not make up answers.
If you do not know the answer to a question, just say "I don't know".
{context}
Given the following conversation and a follow up question, answer the question.
{chat_history}
question: {question}
'''

PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "chat_history", "question"]
)

memory = ConversationBufferMemory(
    memory_key="chat_history",
    max_len=50,
    return_messages=True,
    output_key='answer'
)
# this time it was required to specify output_key in memory.

# chat_model, retriever, q1 and pp are defined above
chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=chat_model,
    chain_type="stuff",
    retriever=retriever,
    memory=memory,
)

# testing the responses
pp.pprint(chain(PROMPT.format(question=q1, chat_history=memory.chat_memory.messages, context='')))
### Answer
Some famous algorithms in bioinformatics include BLAST (Basic Local Alignment Search Tool), dynamic programming for sequence alignment, and phylogeny reconstruction algorithms......
pp.pprint(chain(PROMPT.format(question=q2, chat_history=memory.chat_memory.messages, context='')))
### Answer
The third algorithm mentioned is phylogeny reconstruction algorithms. Phylogeny reconstruction algorithms are used to infer the evolutionary relationships among a set of organisms ......
The response for q2 was exactly what it should be.
In this approach I passed the chat history in the prompt. I think this approach is not efficient and may cause a token-limit error. Using ConversationBufferWindowMemory could be a solution to prevent this.
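For what it's worth, here is a minimal sketch of the ConversationBufferWindowMemory alternative mentioned above; it keeps only the last k exchanges, which bounds the prompt size:
from langchain.memory import ConversationBufferWindowMemory

# Keeps only the last k question/answer turns in chat_history.
memory = ConversationBufferWindowMemory(
    k=3,
    memory_key="chat_history",
    return_messages=True,
    output_key="answer",
)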
Friends got a better way!
This works! Thank you!
Hi, thanks for the answer. Small question: where is get_chat_history defined?
Hi, may I please ask where you have defined the get_chat_history variable?
@damithsenanayake I did not use get_chat_history in my code. It seems to work fine without get_chat_history as long as you include memory=memory. Please correct me if that's not the case.
I believe there's been a change in the LangChain API. When `memory=memory` is used, I get the following error:
pydantic.error_wrappers.ValidationError: 1 validation error for ConversationalRetrievalChain
memory
  value is not a valid dict (type=type_error.dict)
I'm on langchain==0.0.352.
Edit/Update: I've found out the memory arg needs to be a child of BaseChatMemory. It was not LangChain API related. This is working for me now.
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    memory_key="chat_history",
    chat_memory=chat_history,  # this is your persistence strategy, a subclass of `BaseChatMessageHistory`
    output_key="answer",
    return_messages=True
)

qa = ConversationalRetrievalChain.from_llm(
    llm=llm,
    memory=memory,
    retriever=docsearch.as_retriever(),
    return_source_documents=True,
    return_generated_question=True,
    verbose=True
)
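For anyone wondering what to pass as `chat_memory`, a minimal (non-persistent) example is the built-in ChatMessageHistory; any other `BaseChatMessageHistory` subclass (e.g. a Redis- or file-backed one) can be swapped in the same way:
from langchain.memory import ChatMessageHistory

# Simple in-memory history; replace with a persistent BaseChatMessageHistory subclass as needed.
chat_history = ChatMessageHistory()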
@damithsenanayake, I have not used the updated LangChain API yet, but get_chat_history was a built-in method. And @eliujl is right about memory.
Thanks for all the examples above! I was able to build a RAG that works with PDFs and a customized prompt using the following code:
template = """Answer the question in your own words as truthfully as possible from the context given to you.
If you do not know the answer to the question, simply respond with "I don't know. Can you ask another question".
If questions are asked where there is no relevant context available, simply respond with "I don't know. Please ask a question relevant to the documents"
Chat history: {chat_history}
User question: {question}
Assistant: """
def get_context(pdf_docs):
text = ""
for pdf in pdf_docs:
pdf_reader = PdfReader(pdf)
for page in pdf_reader.pages:
text += page.extract_text()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = text_splitter.split_text(text)
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L12-v2")
vectordb = FAISS.from_texts(texts=chunks, embedding=embeddings)
return vectordb
def get_response(query, chat_history, vectordb):
prompt = PromptTemplate.from_template(template=template)
memory = ConversationBufferMemory(
memory_key="chat_history", max_len=50, return_messages=True, output_key="answer"
)
llm = ChatOpenAI(temperature=0.1)
chain = ConversationalRetrievalChain.from_llm(
llm=llm,
retriever=vectordb.as_retriever(),
memory=memory,
return_source_documents=True,
condense_question_prompt=prompt,
)
return chain({"question": query, "chat_history": chat_history})["answer"]
Hope it helps!
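A hypothetical driver loop (file name and questions are placeholders) showing how `chat_history` could be threaded between calls to the functions above:
# Placeholder file and questions - adjust to your own data.
vectordb = get_context(["report.pdf"])
chat_history = []

q1 = "What is the document about?"
a1 = get_response(q1, chat_history, vectordb)
chat_history.append((q1, a1))

q2 = "Can you give more detail on that?"
a2 = get_response(q2, chat_history, vectordb)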
Is passing in `combine_docs_chain_kwargs={'prompt': prompt}` the same as `condense_question_prompt=prompt` when calling `ConversationalRetrievalChain.from_llm()`?