
Is there any better way to summarize the question from chat history?


We have a QA system built on ConversationalRetrievalChain. It works well, but it could perform better in the first step: summarizing a standalone question from the chat history.

The original prompt to condense the question:

    Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, if the follow up question is already a standalone question, just return the follow up question.

    Chat History:
    {chat_history}
    Follow Up Question: {question}
    Standalone question:

Most of the time it works well, but sometimes it does not. For example, when the input is a greeting, we may still get a question as the output.

CreationLee avatar Apr 21 '23 08:04 CreationLee

I have gotten rid of this step altogether, as it doesn't really model true chat history and often leads to ambiguous standalone questions that fail to capture the full intent of the follow-up question. Instead I have done the following:

  • added {chat_history} to the combine_docs_chain prompt, which is the final prompt sent to the LLM (see the prompt sketch after the code below).
  • for the question_generator, I created a no-op chain that does nothing but forward the question to the combine_docs_chain as-is. It makes sense for the question generator to be optional.
from langchain import LLMChain, PromptTemplate
from langchain.llms import OpenAI


class NoOpLLMChain(LLMChain):
    """No-op LLM chain that passes the question through unchanged."""

    def __init__(self):
        """Initialize with a placeholder LLM and an empty prompt; neither is ever called."""
        super().__init__(llm=OpenAI(), prompt=PromptTemplate(template="", input_variables=[]))

    def run(self, question: str, *args, **kwargs) -> str:
        # Sync path: forward the question instead of calling the LLM.
        return question

    async def arun(self, question: str, *args, **kwargs) -> str:
        # Async path: same pass-through behavior.
        return question

It also has the benefit of saving costs by making fewer requests to the LLM.
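For step 1 above, here is a minimal sketch of how such a combine_docs_chain could be built so that its prompt receives the raw chat history; the template text, variable names, and the load_qa_chain wiring are illustrative assumptions, not code from this thread:

    from langchain.chains.question_answering import load_qa_chain
    from langchain.llms import OpenAI
    from langchain.prompts import PromptTemplate

    # Hypothetical QA prompt: the retrieved context and the raw chat history
    # both flow into the final prompt, so no standalone-question rewriting is needed.
    qa_template = """Use the following context and chat history to answer the question.

    Context: {context}

    Chat History: {chat_history}

    Question: {question}
    Answer:"""

    prompt = PromptTemplate(
        template=qa_template,
        input_variables=["context", "chat_history", "question"],
    )
    doc_chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff", prompt=prompt)

In the LangChain versions of that era, ConversationalRetrievalChain copied the formatted chat history into the inputs it sent to combine_docs_chain, so the {chat_history} variable is filled automatically.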

manibatra avatar Apr 21 '23 16:04 manibatra

@manibatra How did you pass the NoOp chain into the question_generator? I can see that ConversationalRetrievalChain.from_llm() takes a condense_question_prompt parameter, but not the actual chain.

Did you inherit from ConversationalRetrievalChain and override the from_llm() method where you set condense_question_chain = NoOpLLMChain()?

AlexZhara avatar Jun 06 '23 18:06 AlexZhara

@AlexZhara I am initialising the chain as follows:

    qa = ConversationalRetrievalChain(
        retriever=vectorstore_retriever,
        combine_docs_chain=doc_chain,
        question_generator=NoOpLLMChain(),
        callback_manager=manager,
        memory=memory,
        get_chat_history=get_chat_history,
    )

manibatra avatar Jun 07 '23 11:06 manibatra

@manibatra Can you also please tell us how you are getting:

  1. combine_docs_chain
  2. callback_manager
  3. memory

in the above code? It would be very helpful. Thanks in advance.

pratikthakkar avatar Jun 08 '23 15:06 pratikthakkar
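For reference, a minimal sketch of how those pieces were commonly constructed with the LangChain APIs of that era; the class choices and the get_chat_history formatting are assumptions, not @manibatra's actual code (doc_chain is sketched under step 1 above):

    from langchain.callbacks.manager import CallbackManager
    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
    from langchain.memory import ConversationBufferMemory

    # Conversation memory; memory_key must match the {chat_history} variable
    # in the combine_docs_chain prompt, and return_messages=True keeps the
    # turns as HumanMessage/AIMessage objects.
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

    # Optional callback manager, e.g. to stream tokens to stdout.
    manager = CallbackManager([StreamingStdOutCallbackHandler()])

    def get_chat_history(messages) -> str:
        # Flatten the stored message objects into a single string.
        return "\n".join(f"{m.type}: {m.content}" for m in messages)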

@manibatra it would be super helpful if you could share any guidance on how to achieve a no-op chain in the JS version

jasan-s avatar Jun 22 '23 09:06 jasan-s

Is there any way to pass chat_history as a list of conversation turns (["human", "ai", "human", "ai"]), so it can be sent to OpenAI in the format the chat completion API requires?

kesun-ds avatar Jul 14 '23 11:07 kesun-ds

@kesun-ds I'd love to know this too. I don't understand why we're trying to condense the conversation history into a standalone question, which often fails, when OpenAI provides an API that lets us pass the full chat history.

Did you figure it out?

brett-matson avatar Jul 23 '23 03:07 brett-matson
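One way to achieve this with the chat-model APIs of the time is to skip the condensing step and keep the history as actual messages, using ConversationBufferMemory(return_messages=True) together with a MessagesPlaceholder in a chat prompt. A minimal sketch under those assumptions, not an answer taken from this thread:

    from langchain.chains import LLMChain
    from langchain.chat_models import ChatOpenAI
    from langchain.memory import ConversationBufferMemory
    from langchain.prompts import (
        ChatPromptTemplate,
        HumanMessagePromptTemplate,
        MessagesPlaceholder,
    )

    # The placeholder injects the stored human/ai turns as real messages,
    # so OpenAI receives a proper multi-turn message list instead of one
    # condensed standalone question.
    prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="chat_history"),
        HumanMessagePromptTemplate.from_template("{question}"),
    ])

    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
    chain = LLMChain(llm=ChatOpenAI(), prompt=prompt, memory=memory)
    chain.run(question="Hello!")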

Hi everyone, I have figured out a workaround for sending the entire concatenated memory into a ConversationalRetrievalChain and bypassing the question-condensing chain. This workaround builds on @manibatra's answer. You can find my steps here.

hemanthkrishna1298 avatar Aug 14 '23 13:08 hemanthkrishna1298

> @manibatra it would be super helpful if you could share any guidance on how to achieve a no-op chain in the JS version

I already answered it here; please have a look: https://github.com/langchain-ai/langchain/issues/6879#issuecomment-1685277865

iPanchalShubham avatar Aug 20 '23 13:08 iPanchalShubham

Hi, @CreationLee! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue is about improving the first step of a QA system built on ConversationalRetrievalChain: summarizing a standalone question from the chat history. The current approach struggles with inputs, such as greetings, that are not questions. User "manibatra" suggested removing the question-condensing step entirely and using a no-op chain, and user "hemanthkrishna1298" provided a workaround for sending the entire concatenated memory into a ConversationalRetrievalChain.

Before we close this issue, we wanted to check if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your contribution!

dosubot[bot] avatar Nov 19 '23 16:11 dosubot[bot]