langchainjs
[ConversationalRetrievalQAChain] Reusing LLM for questionGeneratorChain causes unwanted stream data
Problem
If we set streaming: true for ConversationalRetrievalQAChain.fromLLM, the question generated from questionGeneratorChain will be streamed to the frontend.
Expected behavior
We actually only want the stream data from combineDocumentsChain.
Why does this problem exist
This is because the model parameter is passed down and reused for both loadQAStuffChain and questionGeneratorChain.
See: https://github.com/hwchase17/langchainjs/blob/main/langchain/src/chains/conversational_retrieval_chain.ts#L149
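A minimal sketch of how the symptom shows up, assuming an existing retriever and the newer callbacks-style handler API (older releases used callbackManager instead); import paths vary by langchain version:

import { ChatOpenAI } from "langchain/chat_models/openai";
import { ConversationalRetrievalQAChain } from "langchain/chains";

// Take the retriever type from fromLLM's own signature to avoid guessing an import path.
type Retriever = Parameters<typeof ConversationalRetrievalQAChain.fromLLM>[1];

export function buildChain(retriever: Retriever) {
  const model = new ChatOpenAI({
    streaming: true,
    callbacks: [
      {
        handleLLMNewToken(token: string) {
          // This fires for every LLM call the chain makes, so tokens from the
          // rephrased standalone question (questionGeneratorChain) arrive here
          // as well as the final answer tokens.
          process.stdout.write(token);
        },
      },
    ],
  });
  // The same `model` instance is reused for questionGeneratorChain and the
  // combineDocumentsChain, which is why both calls stream.
  return ConversationalRetrievalQAChain.fromLLM(model, retriever);
}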
Fix
Allow a custom questionGeneratorLLM to be passed via the options field.
const { questionGeneratorTemplate, qaTemplate, questionGeneratorLLM, ...rest } = options;
// ...
const questionGeneratorChain = new LLMChain({
  prompt: question_generator_prompt,
  llm: questionGeneratorLLM || new OpenAIChat({ temperature: 0 }),
});
// ...
return instance;
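If such an option were added, usage could look roughly like this (questionGeneratorLLM is only the proposed option name, not part of the released API, and vectorStore is assumed to exist):

// Hypothetical usage of the proposed option: the streaming model produces the
// final answer, while a separate non-streaming model rephrases the question.
const chain = ConversationalRetrievalQAChain.fromLLM(
  new ChatOpenAI({ temperature: 0, streaming: true }),
  vectorStore.asRetriever(),
  {
    // Proposed option, not in the released API.
    questionGeneratorLLM: new ChatOpenAI({ temperature: 0 }),
  }
);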
Seems like the same issue is present for ChatVectorDBQAChain as well.
The immediate fix for this issue is to do the following:
// construct your chain as before
const chain = ConversationalRetrievalQAChain.fromLLM(new ChatOpenAI({ streaming: true, ... }), ...)
// after creating the chain, override the LLM in the inner `questionGeneratorChain`
chain.questionGeneratorChain.llm = new ChatOpenAI()
// use the chain
We're working on a better solution.
Hi, @colintoh! I'm helping the LangChain team manage their backlog and I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue you reported was related to the behavior of streaming: true in ConversationalRetrievalQAChain.fromLLM. It seems that the question generated from questionGeneratorChain is also being streamed to the frontend, which is not the expected behavior. There is a proposed fix to allow a custom questionGeneratorLLM to be passed via the options field.
I noticed that there is a comment from delivery stating that the same issue is present for ChatVectorDBQAChain as well. nfcampos suggests a temporary fix by overriding the LLM in the inner questionGeneratorChain and provides an example code snippet.
Before we proceed, could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your contribution to LangChain!
With the current version of langchain, a separate LLM can be passed for question generation. Please have a look:
const chain = ConversationalRetrievalQAChain.fromLLM(
  // Streaming LLM to stream the final result.
  streamingModel,
  (await vectorStore).asRetriever(),
  {
    qaChainOptions: {
      prompt: customPromptTemplate,
      type: "stuff",
      verbose: true,
    },
    verbose: true,
    memory: new BufferMemory({
      returnMessages: true,
      // k: 1, // remember the most recent conversation
      chatHistory: new ChatMessageHistory(getPastMessages(pastMessages)),
      memoryKey: "chat_history", // must be set to "chat_history"
      inputKey: "question", // the key for the input to the chain
      outputKey: "text", // the key for the final conversational output of the chain
    }),
    returnSourceDocuments: true,
    questionGeneratorChainOptions: {
      // Non-streaming LLM for question generation.
      llm: nonStreamingModel,
    },
  }
);
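For reference, the streamingModel and nonStreamingModel used above could be set up roughly as follows (a sketch: sendTokenToFrontend is a hypothetical transport helper, and the exact callback API depends on your langchain version):

// Streaming model: only the final answer passes through this handler,
// because question generation uses `nonStreamingModel` below.
const streamingModel = new ChatOpenAI({
  temperature: 0,
  streaming: true,
  callbacks: [
    {
      handleLLMNewToken(token: string) {
        sendTokenToFrontend(token); // hypothetical helper that forwards tokens to the client
      },
    },
  ],
});

// Plain model used only to rephrase the follow-up into a standalone question.
const nonStreamingModel = new ChatOpenAI({ temperature: 0 });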
fromLLM checks under the hood whether a separate LLM has been provided for question generation (reference: https://github.com/hwchase17/langchainjs/blob/525fe837b4e67e5247038e25e8a261740e182a23/langchain/src/chains/conversational_retrieval_chain.ts#L226). If it has, it uses that LLM; if not, it falls back to the one you passed for streaming the final result.
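Paraphrasing the linked source, the relevant fallback looks roughly like this (a sketch, not a verbatim copy):

// Inside fromLLM: the question generator only falls back to the main `llm`
// when no dedicated model is supplied via questionGeneratorChainOptions.
const questionGeneratorChain = new LLMChain({
  prompt: questionGeneratorChainPrompt,
  llm: questionGeneratorChainOptions?.llm ?? llm,
});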
@nfcampos Could you please help @iPanchalShubham with this issue? They have provided a temporary fix for the behavior of streaming: true in ConversationalRetrievalQAChain.fromLLM and have also shared a code snippet. It would be great if you could review their solution and provide any further guidance. Thank you!
Hi, @colintoh! I'm Dosu, and I'm here to help the langchainjs team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue you reported is related to the behavior of streaming: true in ConversationalRetrievalQAChain.fromLLM. It seems that when this is set, the question generated from questionGeneratorChain is also streamed to the frontend, which is not the expected behavior.
There have been some suggestions in the comments to address this issue. nfcampos suggests a temporary fix by overriding the LLM in the inner questionGeneratorChain and provides an example code snippet. iPanchalShubham mentions that with the current version of langchain, separate LLMs can be passed for question generation.
Before we proceed further, could you please confirm if this issue is still relevant to the latest version of the langchainjs repository? If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the langchainjs project!