map_reduce and refine not working with RetrievalQA chain
System Info
- LangChain version: 0.0.201
- Platform: Windows 11
- Python: 3.10.11
Who can help?
@hwchase17 @agola11
Information
- [ ] The official example notebooks/scripts
- [ ] My own modified scripts
Related Components
- [ ] LLMs/Chat Models
- [ ] Embedding Models
- [ ] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [ ] Document Loaders
- [ ] Vector Stores / Retrievers
- [ ] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [X] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
- Load some text documents into a vector store (I used Deep Lake).
- Load the DB.
- Call the function summarizer(db, "Summarize the mentions of google according to their AI program") (defined in the attached file).
- Run with chain_type "stuff" and it works; with "map_reduce" it fails in the RetrievalQA bot (main.zip attached).
Expected behavior
It should work for all chain types and return results.
Answer generated by a 🤖
Answer
I understand that you're having trouble with the map_reduce and refine functions when working with the RetrievalQA chain in LangChain. You've mentioned that the issue arises with these chain types, while "stuff" works as expected.
From my understanding, the problem might be related to the chain_type parameter in the RetrievalQA.from_chain_type() method. The RetrievalQA chain is designed to work with specific chain types that are compatible with its functionality. If "map_reduce" and "refine" are not being passed in a form the loader accepts, this could be the cause of the failure you're experiencing.
To address this issue, I suggest ensuring that you're using a valid chain type for the RetrievalQA chain. If you're unsure about the valid chain types, I recommend referring to the LangChain documentation or the source code of the RetrievalQA chain.
Here's an example of how you might use the RetrievalQA.from_chain_type() method with a valid chain type:
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.vectorstores import Chroma
# Assuming docsearch is a valid Chroma instance
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="valid_chain_type",
    retriever=docsearch.as_retriever(),
)
In this example, replace "valid_chain_type" with one of the chain types the loader supports ("stuff", "map_reduce", "refine", or "map_rerank"). If the issue persists, it could be due to other factors, such as the configuration of the RetrievalQA chain or the data you're using.
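The check the loader performs can be mirrored in a few lines. This is a hedged sketch based on the chain types LangChain 0.0.x accepts; the tuple and helper names below are illustrative, not LangChain API:

```python
# Chain types accepted by load_qa_chain / RetrievalQA.from_chain_type in
# LangChain 0.0.x. Verify against your installed version.
SUPPORTED_CHAIN_TYPES = ("stuff", "map_reduce", "refine", "map_rerank")

def check_chain_type(chain_type: str) -> str:
    """Illustrative helper mirroring the loader's chain_type validation."""
    if chain_type not in SUPPORTED_CHAIN_TYPES:
        raise ValueError(
            f"Got unsupported chain type: {chain_type}. "
            f"Should be one of {SUPPORTED_CHAIN_TYPES}"
        )
    return chain_type
```

Any other string raises a ValueError before a chain is ever built, so an early check like this separates "unsupported chain type" failures from prompt-configuration failures.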
Please note that this is just one potential solution based on the information provided. If you believe this is a bug that could impact other users, feel free to make a pull request with a proposed fix. If my suggestion contradicts your experience, I recommend upgrading to the latest version of LangChain to ensure you have the most recent updates and bug fixes.
Sources
- docs/snippets/modules/chains/popular/vector_db_qa.mdx
- docs/snippets/modules/chains/popular/chat_vector_db.mdx
This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
👎, my query still isn't answered.
@KillerStrike17 Can you share error logs?
(deeplakelangchain) D:\Github\Video-Summarizer-with-LLMs>python main.py
D:\Anaconda\envs\deeplakelangchain\lib\site-packages\deeplake\util\check_latest_version.py:32: UserWarning: A newer version of deeplake (3.6.8) is available. It's recommended that you update to the latest version using pip install -U deeplake.
warnings.warn(
This dataset can be visualized in Jupyter Notebook by ds.visualize() or at https://app.activeloop.ai/megatron17/video_summarizer_with_llms
hub://megatron17/video_summarizer_with_llms loaded successfully.
Deep Lake Dataset in hub://megatron17/video_summarizer_with_llms already exists, loading from the storage Dataset(path='hub://megatron17/video_summarizer_with_llms', tensors=['embedding', 'ids', 'metadata', 'text'])
tensor     htype    shape       dtype    compression
embedding  generic  (88, 1536)  float32  None
ids        text     (88, 1)     str      None
metadata   json     (88, 1)     str      None
text       text     (88, 1)     str      None
Error in sys.excepthook:
Traceback (most recent call last):
  File "D:\Anaconda\envs\deeplakelangchain\lib\site-packages\humbug\report.py", line 540, in _hook
    self.error_report(error=exception_instance, tags=tags, publish=publish)
  File "D:\Anaconda\envs\deeplakelangchain\lib\site-packages\humbug\report.py", line 274, in error_report
    traceback.format_exception(
TypeError: format_exception() got an unexpected keyword argument 'etype'
Original exception was:
Traceback (most recent call last):
File "D:\Github\Video-Summarizer-with-LLMs\main.py", line 27, in
Answer generated by a 🤖
Answer
I understand that you're encountering an issue when using the map_reduce and refine chain types with the RetrievalQA chain in LangChain. The error message suggests that the MapReduceDocumentsChain class is not expecting a prompt argument, which is being passed in your current implementation.
To address this issue, you should modify your code to avoid passing the prompt argument when the chain_type is "map_reduce". Here's how you can do it:
# Assuming llm and vectorstore are already defined
from langchain.chains import RetrievalQA
from langchain.chains.question_answering import load_qa_chain
# Do not pass the prompt argument when chain_type is "map_reduce"
doc_chain = load_qa_chain(llm, chain_type="map_reduce")
chain = RetrievalQA(
    retriever=vectorstore.as_retriever(),
    combine_documents_chain=doc_chain,
)
If you want more control over the parameters, you can load the chain yourself and pass it to the RetrievalQA chain via the combine_documents_chain parameter. For example:
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
from langchain.chains.question_answering import load_qa_chain
qa_chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")
qa = RetrievalQA(combine_documents_chain=qa_chain, retriever=docsearch.as_retriever())
Please note that the _load_map_reduce_chain function does not take a prompt argument. Instead, it takes question_prompt, combine_prompt, and collapse_prompt arguments. If you want to customize the prompts used in the MapReduceDocumentsChain, you should pass these arguments to the load_qa_chain function instead of prompt.
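To make the map/reduce split concrete, here is a minimal pure-Python sketch of the pattern the "map_reduce" chain type implements. The llm function is a stand-in for a real model call, and the prompt wording is illustrative, not LangChain's default templates:

```python
# Illustrative sketch of the map_reduce document-combining pattern.
def llm(prompt: str) -> str:
    # Stand-in for an actual LLM call; echoes the first prompt line.
    return f"[answer derived from: {prompt.splitlines()[0]}]"

def map_reduce_qa(docs: list[str], question: str) -> str:
    # Map step (question_prompt): answer the question against each
    # document independently.
    partial_answers = [
        llm(f"Context: {doc}\nQuestion: {question}") for doc in docs
    ]
    # Reduce step (combine_prompt): merge the per-document answers
    # (the 'summaries') into one final answer.
    summaries = "\n".join(partial_answers)
    return llm(f"Summaries: {summaries}\nQuestion: {question}")
```

This is why map_reduce takes a question_prompt (used per document in the map step) and a combine_prompt (used once in the reduce step) rather than a single prompt.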
This solution should help resolve your issue. However, if you believe this is a bug that could impact other users, feel free to make a pull request with the necessary changes. If the issue persists after trying this solution, it might be helpful to ensure you're using the latest version of LangChain.
Sources
- docs/extras/ecosystem/integrations/vectara/vectara_chat.ipynb
- docs/snippets/modules/chains/popular/vector_db_qa.mdx
- langchain/chains/qa_with_sources/loading.py
- langchain/chains/question_answering/__init__.py
I was able to fix this. For each chain type (stuff, refine, map_reduce & map_rerank), you need to look up the correct input variables for each prompt.
Check the attached file, where I described the issue in detail and how I figured it out by looking at the LangChain source code for the original/default prompt templates for each chain type.
Hi Braun-demv, I tried following your approach for refine. But I still get the following error.
ValidationError: 1 validation error for RefineDocumentsChain
prompt
extra fields not permitted (type=value_error.extra)
Any clue?
prompt is not a parameter of the refine chain; I think you need to use refine_prompt as the parameter.
Also, you need to build two chains, as described in the file I uploaded.
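To make this concrete, here is a hedged sketch of which prompt keyword arguments each chain type's loader accepts and which input variables its default templates use, as read from the LangChain 0.0.x question-answering loader (langchain/chains/question_answering/__init__.py). The dict and helper are illustrative, so verify the names against your installed version:

```python
# Prompt kwargs accepted by load_qa_chain per chain type, with the input
# variables each default template expects (LangChain 0.0.x; verify against
# your version). Passing `prompt` to "refine" triggers the pydantic
# "extra fields not permitted" ValidationError seen above.
EXPECTED_PROMPTS = {
    "stuff": {"prompt": ["context", "question"]},
    "map_reduce": {
        "question_prompt": ["context", "question"],
        "combine_prompt": ["summaries", "question"],
    },
    "refine": {
        "question_prompt": ["context_str", "question"],
        "refine_prompt": ["context_str", "existing_answer", "question"],
    },
    "map_rerank": {"prompt": ["context", "question"]},
}

def prompt_kwargs_for(chain_type: str) -> dict:
    """Illustrative helper: fail early with a clear message instead of
    hitting a pydantic 'extra fields' error deep inside the chain."""
    try:
        return EXPECTED_PROMPTS[chain_type]
    except KeyError:
        raise ValueError(f"Unknown chain type: {chain_type}") from None
```

Under this reading, the refine chain wants question_prompt and refine_prompt (never a bare prompt), which matches the error adityarawat11 reported.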
Hi, @KillerStrike17! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you were experiencing an issue with the map_reduce and refine functions not working with the RetrievalQA chain in LangChain. You provided system information and steps to reproduce the issue. Another user, braun-demv, provided a fix for the issue and shared a file describing the problem in detail. However, another user, adityarawat11, reported that they are still encountering an error with the refine function.
Before we close this issue, we wanted to check if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!