langchainjs
[ConversationalRetrievalQAChain] Reusing LLM for questionGeneratorChain causes unwanted stream data
Problem
If we set streaming: true for ConversationalRetrievalQAChain.fromLLM, the question generated from questionGeneratorChain will be streamed to the frontend.
Expected behavior
We actually only want the stream data from combineDocumentsChain.
Why does this problem exist
This is because the model parameter is passed down and reused for both loadQAStuffChain and questionGeneratorChain.
See: https://github.com/hwchase17/langchainjs/blob/main/langchain/src/chains/conversational_retrieval_chain.ts#L149
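A minimal sketch of how the symptom shows up, assuming an existing retriever and the newer callbacks-style handler API (older releases used callbackManager instead); import paths vary by langchain version:

import { ChatOpenAI } from "langchain/chat_models/openai";
import { ConversationalRetrievalQAChain } from "langchain/chains";

// Take the retriever type from fromLLM's own signature to avoid guessing an import path.
type Retriever = Parameters<typeof ConversationalRetrievalQAChain.fromLLM>[1];

export function buildChain(retriever: Retriever) {
  const model = new ChatOpenAI({
    streaming: true,
    callbacks: [
      {
        handleLLMNewToken(token: string) {
          // This fires for every LLM call the chain makes, so tokens from the
          // rephrased standalone question (questionGeneratorChain) arrive here
          // as well as the final answer tokens.
          process.stdout.write(token);
        },
      },
    ],
  });
  // The same `model` instance is reused for questionGeneratorChain and the
  // combineDocumentsChain, which is why both calls stream.
  return ConversationalRetrievalQAChain.fromLLM(model, retriever);
}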
Fix
Allow a custom questionGeneratorLLM to be passed via the options field.
const { questionGeneratorTemplate, qaTemplate, questionGeneratorLLM, ...rest } = options;
// ...
const questionGeneratorChain = new LLMChain({
  prompt: question_generator_prompt,
  llm: questionGeneratorLLM || new OpenAIChat({ temperature: 0 }),
});
// ...
return instance;
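If such an option were added, usage could look roughly like this (questionGeneratorLLM is only the proposed option name, not part of the released API, and vectorStore is assumed to exist):

// Hypothetical usage of the proposed option: the streaming model produces the
// final answer, while a separate non-streaming model rephrases the question.
const chain = ConversationalRetrievalQAChain.fromLLM(
  new ChatOpenAI({ temperature: 0, streaming: true }),
  vectorStore.asRetriever(),
  {
    // Proposed option, not in the released API.
    questionGeneratorLLM: new ChatOpenAI({ temperature: 0 }),
  }
);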
Seems like the same issue is present for ChatVectorDBQAChain as well.
The immediate fix for this issue is to do the following:
// construct your chain as before
const chain = ConversationalRetrievalQAChain.fromLLM(new ChatOpenAI({ streaming: true, ... }), ...)
// after creating the chain, override the LLM in the inner `questionGeneratorChain`
chain.questionGeneratorChain.llm = new ChatOpenAI()
// use the chain
We're working on a better solution.
Hi, @colintoh! I'm helping the LangChain team manage their backlog and I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue you reported was related to the behavior of streaming: true in ConversationalRetrievalQAChain.fromLLM. It seems that the question generated from questionGeneratorChain is also being streamed to the frontend, which is not the expected behavior. There is a proposed fix to allow a custom questionGeneratorLLM to be passed via the options field.
I noticed that there is a comment from delivery stating that the same issue is present for ChatVectorDBQAChain as well. nfcampos suggests a temporary fix by overriding the LLM in the inner questionGeneratorChain and provides an example code snippet.
Before we proceed, could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your contribution to LangChain!
With the current version of langchain, a separate LLM can be passed for question generation. Please have a look:
const chain = ConversationalRetrievalQAChain.fromLLM(
  // Streaming LLM to stream the final result.
  streamingModel,
  (await vectorStore).asRetriever(),
  {
    qaChainOptions: {
      prompt: customPromptTemplate,
      type: "stuff",
      verbose: true,
    },
    verbose: true,
    memory: new BufferMemory({
      returnMessages: true,
      // k: 1, // remember the most recent conversation
      chatHistory: new ChatMessageHistory(getPastMessages(pastMessages)),
      memoryKey: "chat_history", // must be set to "chat_history"
      inputKey: "question", // the key for the input to the chain
      outputKey: "text", // the key for the final conversational output of the chain
    }),
    returnSourceDocuments: true,
    questionGeneratorChainOptions: {
      // Non-streaming LLM for question generation.
      llm: nonStreamingModel,
    },
  }
);
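For reference, the streamingModel and nonStreamingModel used above could be set up roughly as follows (a sketch: sendTokenToFrontend is a hypothetical transport helper, and the exact callback API depends on your langchain version):

// Streaming model: only the final answer passes through this handler,
// because question generation uses `nonStreamingModel` below.
const streamingModel = new ChatOpenAI({
  temperature: 0,
  streaming: true,
  callbacks: [
    {
      handleLLMNewToken(token: string) {
        sendTokenToFrontend(token); // hypothetical helper that forwards tokens to the client
      },
    },
  ],
});

// Plain model used only to rephrase the follow-up into a standalone question.
const nonStreamingModel = new ChatOpenAI({ temperature: 0 });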
fromLLM checks under the hood whether a separate LLM has been provided for question generation (reference: https://github.com/hwchase17/langchainjs/blob/525fe837b4e67e5247038e25e8a261740e182a23/langchain/src/chains/conversational_retrieval_chain.ts#L226). If it has, it uses that LLM; if not, it falls back to the one you passed for streaming the final result.
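Paraphrasing the linked source, the relevant fallback looks roughly like this (a sketch, not a verbatim copy):

// Inside fromLLM: the question generator only falls back to the main `llm`
// when no dedicated model is supplied via questionGeneratorChainOptions.
const questionGeneratorChain = new LLMChain({
  prompt: questionGeneratorChainPrompt,
  llm: questionGeneratorChainOptions?.llm ?? llm,
});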
@nfcampos Could you please help @iPanchalShubham with this issue? They have provided a temporary fix for the behavior of streaming: true in ConversationalRetrievalQAChain.fromLLM and have also shared a code snippet. It would be great if you could review their solution and provide any further guidance. Thank you!
Hi, @colintoh! I'm Dosu, and I'm here to help the langchainjs team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue you reported is related to the behavior of streaming: true in ConversationalRetrievalQAChain.fromLLM. It seems that when this is set, the question generated from questionGeneratorChain is also streamed to the frontend, which is not the expected behavior.
There have been some suggestions in the comments to address this issue. nfcampos suggests a temporary fix by overriding the LLM in the inner questionGeneratorChain and provides an example code snippet. iPanchalShubham mentions that with the current version of langchain, separate LLMs can be passed for question generation.
Before we proceed further, could you please confirm if this issue is still relevant to the latest version of the langchainjs repository? If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the langchainjs project!