
How do I add memory to RetrievalQA.from_chain_type? Or, how do I add a custom prompt to ConversationalRetrievalChain?



For the past two weeks I've been trying to build a chatbot that can chat over documents (not just semantic search/QA, so with memory) and also use a custom prompt. I've tried every combination of the chains, and so far the closest I've gotten is ConversationalRetrievalChain, but without custom prompts, and RetrievalQA.from_chain_type, but without memory.

etkinhud-mvla avatar May 13 '23 02:05 etkinhud-mvla

^ I'm also struggling to tweak the custom prompt in CHAT_CONVERSATIONAL_REACT_DESCRIPTION

It's not exactly clear whether we can extend a class and implement our own, or override the prompts. Do drop us noobs some tips, thanks!

hrishMoola avatar May 13 '23 07:05 hrishMoola

@ThatDevHuddy Maybe this can help you

from langchain.prompts import PromptTemplate
from langchain.chains import ConversationalRetrievalChain

# llm, vectordb, memory, get_chat_history and logger are assumed to be initialized elsewhere

template = """Answer the question in your own words as truthfully as possible from the context given to you.
If you do not know the answer to the question, simply respond with "I don't know. Can you ask another question".
If questions are asked where there is no relevant context available, simply respond with "I don't know. Please ask a question relevant to the documents"
Context: {context}


{chat_history}
Human: {question}
Assistant:"""

prompt = PromptTemplate(
    input_variables=["context", "chat_history", "question"], template=template
)

# Create the custom chain
if llm is not None and vectordb is not None:
    chain = ConversationalRetrievalChain.from_llm(
        llm=llm, retriever=vectordb.as_retriever(), memory=memory,
        get_chat_history=get_chat_history, return_source_documents=True,
        combine_docs_chain_kwargs={'prompt': prompt})
else:
    logger.error("LLM or Vector Database not initialized")

Note: `combine_docs_chain_kwargs={'prompt': prompt}` is the part of key importance. Let me know if it helps.
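
To see it end to end, here is a minimal usage sketch (the question text is a placeholder; since return_source_documents=True makes the chain return several keys, this assumes the memory above was created with output_key='answer'):

result = chain({"question": "What does the document say about X?"})
print(result["answer"])            # the model's reply
print(result["source_documents"])  # retrieved docs that filled {context}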

DhavalThkkar avatar May 15 '23 11:05 DhavalThkkar

🤖 AI-generated response by SteerCode (chat with the LangChain codebase). Disclaimer: SteerCode Chat may provide inaccurate information about the LangChain codebase.

To add a custom prompt to ConversationalRetrievalChain, you can pass a custom PromptTemplate to the from_llm method when creating the ConversationalRetrievalChain instance.

1. Create a custom prompt template:

from langchain.prompts import PromptTemplate

custom_prompt = PromptTemplate(input_variables=["question", "context"], template="{question}\n{context}")

2. Pass the custom prompt template when creating the ConversationalRetrievalChain instance:

conversational_chain = ConversationalRetrievalChain.from_llm(llm=my_language_model, retriever=my_retriever, condense_question_prompt=custom_prompt)

Now, you should have a conversational retrieval chain with memory and a custom prompt.


votrumar avatar May 15 '23 11:05 votrumar

Same issue here

pablobots avatar Jun 04 '23 21:06 pablobots

Also very curious!

mr-spaghetti-code avatar Jun 06 '23 18:06 mr-spaghetti-code

I got my answer from this thread, hope it helps!

The code example I got working was the following

from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")

memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True, output_key='answer')
retriever = your_vector_store.as_retriever()

# Create the multipurpose chain
qachat = ConversationalRetrievalChain.from_llm(
    llm=llm,
    memory=memory,
    retriever=retriever,
    return_source_documents=True
)

qachat("Ask your question here...")
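
As a quick check that the memory is actually carried over, a follow-up call can refer back to the previous answer (the question text is just a placeholder), and the buffer can be inspected directly:

qachat("Can you summarize your previous answer in one sentence?")

# Inspect what the memory has accumulated so far
print(memory.chat_memory.messages)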

coreation avatar Jul 03 '23 13:07 coreation

Hey guys, maybe this could give us a clue: in FlowiseAI you can add a system prompt and other options to a ConversationalQA Chain (screenshot omitted).

itsjustmeemman avatar Jul 04 '23 00:07 itsjustmeemman

Found something useful here:

https://python.langchain.com/docs/use_cases/question_answering/integrations/openai_functions_retrieval_qa

Scrolling down, there is an example of a custom prompt with ConversationalRetrievalChain.

BoccheseGiacomo avatar Aug 04 '23 19:08 BoccheseGiacomo

I got it working for Python, but what about JavaScript?

When I try to adapt this to JavaScript it doesn't work.

I am able to successfully implement the QA chain; however, ConversationalRetrievalQAChain fails to provide the conversation history.

tmk1221 avatar Sep 02 '23 20:09 tmk1221

@tmk1221 The js implementation is pretty well described here: https://js.langchain.com/docs/modules/chains/popular/chat_vector_db

MichaelRinger avatar Sep 04 '23 08:09 MichaelRinger

Here is how I achieved memory with a custom prompt template for ConversationalRetrievalChain.

from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from langchain.chains import ConversationalRetrievalChain

### build memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    max_len=50,
    return_messages=True,
)

prompt_template = '''
You are a Bioinformatics expert with immense knowledge and experience in the field. Your name is Dr. Fanni.
Answer my questions based on your knowledge and our older conversation. Do not make up answers.
If you do not know the answer to a question, just say "I don't know".

Given the following conversation and a follow up question, answer the question.

{chat_history}

question: {question}
'''

PROMPT = PromptTemplate.from_template(template=prompt_template)

# chat_model, retriever, q1 and pp (pprint) are assumed to be defined elsewhere
chain = ConversationalRetrievalChain.from_llm(
    chat_model,
    retriever,
    memory=memory,
    condense_question_prompt=PROMPT,
)

pp.pprint(chain({'question': q1, 'chat_history': memory.chat_memory.messages}))

Inferred from the documentation.

Irfan-Ahmad-byte avatar Oct 17 '23 20:10 Irfan-Ahmad-byte

Memory can be passed to the RetrievalQA chain just as to any other chain.


from langchain.memory import ConversationBufferMemory
from langchain.chains import RetrievalQA

memory = ConversationBufferMemory(
    memory_key="chat_history",
    max_len=50,
    return_messages=True,
)

chain_type_kwargs = {'prompt': PROMPT}

chain = RetrievalQA.from_chain_type(
    llm=chat_model,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs=chain_type_kwargs,
    memory=memory,
)

The chat history/ stored memory can be viewed with

memory.chat_memory.messages

Here PROMPT is the custom prompt template.
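
For reference, a minimal sketch of what such a PROMPT could look like for the stuff chain (the wording is just an assumption; {chat_history} matches the memory_key above, while {context} and {question} are the stuff chain's usual inputs):

from langchain.prompts import PromptTemplate

PROMPT = PromptTemplate(
    input_variables=["context", "chat_history", "question"],
    template=(
        "Use the context and the conversation so far to answer the question.\n\n"
        "Context: {context}\n\n"
        "Chat history: {chat_history}\n\n"
        "Question: {question}\n"
        "Answer:"
    ),
)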

Irfan-Ahmad-byte avatar Oct 18 '23 03:10 Irfan-Ahmad-byte

I implemented the chat history with RetrievalQAWithSourcesChain, following the approach below.

prompt_template = '''
You are a Bioinformatics expert with immense knowledge and experience in the field.
Answer my questions based on your knowledge and our older conversation. Do not make up answers.
If you do not know the answer to a question, just say "I don't know".

{context}

Given the following conversation and a follow up question, answer the question.

{chat_history}

question: {question}
'''

PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "chat_history", "question"],
)

memory = ConversationBufferMemory(
    memory_key="chat_history",
    max_len=50,
    return_messages=True,
    output_key='answer',
)
# this time it was required to specify output_key in memory

chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=chat_model,
    chain_type="stuff",
    retriever=retriever,
    memory=memory,
)

# testing the responses
pp.pprint(chain(PROMPT.format(question=q1, chat_history=memory.chat_memory.messages, context='')))

### Answer
Some famous algorithms in bioinformatics include BLAST (Basic Local Alignment Search Tool), dynamic programming for sequence alignment, and phylogeny reconstruction algorithms......

pp.pprint(chain(PROMPT.format(question=q2, chat_history=memory.chat_memory.messages, context='')))

### Answer
The third algorithm mentioned is phylogeny reconstruction algorithms. Phylogeny reconstruction algorithms are used to infer the evolutionary relationships among a set of organisms ......

The response for q2 was exactly what it should be.

In this approach I passed the chat history in the prompt. I think this approach is not efficient and may cause a token-limit error. Using ConversationBufferWindowMemory can be a solution to prevent this; a minimal sketch follows.
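
A minimal sketch of that idea, assuming the same memory_key and output_key as above (ConversationBufferWindowMemory only keeps the last k exchanges, so the history passed into the prompt stops growing without bound):

from langchain.memory import ConversationBufferWindowMemory

# Keep only the last 5 exchanges in the buffer instead of the full history
memory = ConversationBufferWindowMemory(
    k=5,
    memory_key="chat_history",
    return_messages=True,
    output_key="answer",
)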

Friends, got a better way? Please share!

Irfan-Ahmad-byte avatar Oct 18 '23 06:10 Irfan-Ahmad-byte

This works! Thank you!

(quoting DhavalThkkar's ConversationalRetrievalChain example with combine_docs_chain_kwargs from earlier in the thread)

eliujl avatar Dec 07 '23 01:12 eliujl

Hi, thanks for the answer. Small question: where is get_chat_history defined?

damithsenanayake avatar Dec 14 '23 23:12 damithsenanayake

(quoting DhavalThkkar's ConversationalRetrievalChain example with combine_docs_chain_kwargs from earlier in the thread)

Hi, may I please ask where you have defined the get_chat_history variable?

damithsenanayake avatar Dec 14 '23 23:12 damithsenanayake

@damithsenanayake I did not use get_chat_history in my code. It seems to work fine without get_chat_history as long as you include memory=memory. Please correct me if that's not the case.

eliujl avatar Dec 16 '23 03:12 eliujl

@damithsenanayake I did not use get_chat_history in my code. It seems to work fine without get_chat_history as long as you include memory=memory. Please correct me if that's not the case.

I believe there's been a change in the LangChain API. When memory=memory is used, I get the following error:

pydantic.error_wrappers.ValidationError: 1 validation error for ConversationalRetrievalChain
memory
  value is not a valid dict (type=type_error.dict)

I'm on langchain==0.0.352

Edit/Update

I've found out that the memory arg needs to be a child of BaseChatMemory. It was not LangChain API related. This is working for me now.

from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

memory = ConversationBufferMemory(
    memory_key="chat_history",
    chat_memory=chat_history,  # this is your persistence strategy subclass of `BaseChatMessageHistory`
    output_key="answer",
    return_messages=True
)

qa = ConversationalRetrievalChain.from_llm(
    llm=llm,
    memory=memory,
    retriever=docsearch.as_retriever(), 
    return_source_documents=True,
    return_generated_question=True,
    verbose=True
)
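
For anyone wondering what chat_history can be here: a minimal sketch using the simplest in-memory ChatMessageHistory (any BaseChatMessageHistory subclass, such as a Redis- or file-backed one, could be swapped in as the persistence strategy):

from langchain.memory import ChatMessageHistory

# Simplest persistence strategy: an in-process message list
chat_history = ChatMessageHistory()
chat_history.add_user_message("Hi there")
chat_history.add_ai_message("Hello! How can I help?")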

geeknam avatar Dec 27 '23 04:12 geeknam

@damithsenanayake, I have not used the updated LangChain API yet, but get_chat_history is a built-in parameter of ConversationalRetrievalChain (a callable used to format the stored history into a string). And @eliujl is right about memory.
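
For completeness, a minimal sketch of what a custom get_chat_history callable could look like (the formatting itself is just an assumption; the chain hands it the stored turns and expects a single string back):

def get_chat_history(chat_turns) -> str:
    # Turn stored history (either (human, ai) tuples or message objects) into one string
    lines = []
    for turn in chat_turns:
        if isinstance(turn, tuple):
            human, ai = turn
            lines.append(f"Human: {human}\nAssistant: {ai}")
        else:
            lines.append(f"{turn.type}: {turn.content}")
    return "\n".join(lines)

# chain = ConversationalRetrievalChain.from_llm(..., get_chat_history=get_chat_history)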

Irfan-Ahmad-byte avatar Dec 28 '23 07:12 Irfan-Ahmad-byte

Thanks for all the examples above! I was able to build a RAG that works with PDFs and a customized prompt using the following code:

template = """Answer the question in your own words as truthfully as possible from the context given to you.
            If you do not know the answer to the question, simply respond with "I don't know. Can you ask another question".
            If questions are asked where there is no relevant context available, simply respond with "I don't know. Please ask a question relevant to the documents"

            Chat history: {chat_history}

            User question: {question}
            Assistant: """

def get_context(pdf_docs):
    text = ""
    for pdf in pdf_docs:
        pdf_reader = PdfReader(pdf)
        for page in pdf_reader.pages:
            text += page.extract_text()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = text_splitter.split_text(text)
    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L12-v2")
    vectordb = FAISS.from_texts(texts=chunks, embedding=embeddings)
    return vectordb


def get_response(query, chat_history, vectordb):
    prompt = PromptTemplate.from_template(template=template)
    memory = ConversationBufferMemory(
        memory_key="chat_history", max_len=50, return_messages=True, output_key="answer"
    )
    llm = ChatOpenAI(temperature=0.1)
    chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectordb.as_retriever(),
        memory=memory,
        return_source_documents=True,
        condense_question_prompt=prompt,
    )
    return chain({"question": query, "chat_history": chat_history})["answer"]
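
A hypothetical way to wire the two functions together (file names and the question are placeholders; the chat history is kept as a plain list of (question, answer) tuples):

vectordb = get_context(["paper1.pdf", "paper2.pdf"])
chat_history = []

question = "What is the main finding of the paper?"
answer = get_response(question, chat_history, vectordb)
chat_history.append((question, answer))
print(answer)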

Hope it helps!

duoduofly666 avatar Mar 04 '24 01:03 duoduofly666

(quoting duoduofly666's PDF RAG example from the previous comment)

Is passing in combine_docs_chain_kwargs={'prompt': prompt} the same as condense_question_prompt=prompt when calling ConversationalRetrievalChain.from_llm()?

rchen19 avatar Jun 07 '24 00:06 rchen19