
load_qa_chain with map_rerank by local huggingface model

flaviadeutsch opened this issue 1 year ago • 8 comments

I use a HuggingFace model locally and run the following code:

chain = load_qa_chain(llm=chatglm, chain_type="map_rerank", return_intermediate_steps=True, prompt=PROMPT)
chain({"input_documents": search_docs_Documents, "question": query}, return_only_outputs=True)

The error is as follows:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /tmp/ipykernel_274378/983731820.py:2 in <module>                                                 │
│                                                                                                  │
│ [Errno 2] No such file or directory: '/tmp/ipykernel_274378/983731820.py'                        │
│                                                                                                  │
│ /tmp/ipykernel_274378/14951549.py:11 in answer_docs                                              │
│                                                                                                  │
│ [Errno 2] No such file or directory: '/tmp/ipykernel_274378/14951549.py'                         │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/base.py:116 in   │
│ __call__                                                                                         │
│                                                                                                  │
│   113 │   │   │   outputs = self._call(inputs)                                                   │
│   114 │   │   except (KeyboardInterrupt, Exception) as e:                                        │
│   115 │   │   │   self.callback_manager.on_chain_error(e, verbose=self.verbose)                  │
│ ❱ 116 │   │   │   raise e                                                                        │
│   117 │   │   self.callback_manager.on_chain_end(outputs, verbose=self.verbose)                  │
│   118 │   │   return self.prep_outputs(inputs, outputs, return_only_outputs)                     │
│   119                                                                                            │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/base.py:113 in   │
│ __call__                                                                                         │
│                                                                                                  │
│   110 │   │   │   verbose=self.verbose,                                                          │
│   111 │   │   )                                                                                  │
│   112 │   │   try:                                                                               │
│ ❱ 113 │   │   │   outputs = self._call(inputs)                                                   │
│   114 │   │   except (KeyboardInterrupt, Exception) as e:                                        │
│   115 │   │   │   self.callback_manager.on_chain_error(e, verbose=self.verbose)                  │
│   116 │   │   │   raise e                                                                        │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_document │
│ s/base.py:75 in _call                                                                            │
│                                                                                                  │
│    72 │   │   docs = inputs[self.input_key]                                                      │
│    73 │   │   # Other keys are assumed to be needed for LLM prediction                           │
│    74 │   │   other_keys = {k: v for k, v in inputs.items() if k != self.input_key}              │
│ ❱  75 │   │   output, extra_return_dict = self.combine_docs(docs, **other_keys)                  │
│    76 │   │   extra_return_dict[self.output_key] = output                                        │
│    77 │   │   return extra_return_dict                                                           │
│    78                                                                                            │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_document │
│ s/map_rerank.py:97 in combine_docs                                                               │
│                                                                                                  │
│    94 │   │                                                                                      │
│    95 │   │   Combine by mapping first chain over all documents, then reranking the results.     │
│    96 │   │   """                                                                                │
│ ❱  97 │   │   results = self.llm_chain.apply_and_parse(                                          │
│    98 │   │   │   # FYI - this is parallelized and so it is fast.                                │
│    99 │   │   │   [{**{self.document_variable_name: d.page_content}, **kwargs} for d in docs]    │
│   100 │   │   )                                                                                  │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/llm.py:192 in    │
│ apply_and_parse                                                                                  │
│                                                                                                  │
│   189 │   ) -> Sequence[Union[str, List[str], Dict[str, str]]]:                                  │
│   190 │   │   """Call apply and then parse the results."""                                       │
│   191 │   │   result = self.apply(input_list)                                                    │
│ ❱ 192 │   │   return self._parse_result(result)                                                  │
│   193 │                                                                                          │
│   194 │   def _parse_result(                                                                     │
│   195 │   │   self, result: List[Dict[str, str]]                                                 │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/llm.py:198 in    │
│ _parse_result                                                                                    │
│                                                                                                  │
│   195 │   │   self, result: List[Dict[str, str]]                                                 │
│   196 │   ) -> Sequence[Union[str, List[str], Dict[str, str]]]:                                  │
│   197 │   │   if self.prompt.output_parser is not None:                                          │
│ ❱ 198 │   │   │   return [                                                                       │
│   199 │   │   │   │   self.prompt.output_parser.parse(res[self.output_key]) for res in result    │
│   200 │   │   │   ]                                                                              │
│   201 │   │   else:                                                                              │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/llm.py:199 in    │
│ <listcomp>                                                                                       │
│                                                                                                  │
│   196 │   ) -> Sequence[Union[str, List[str], Dict[str, str]]]:                                  │
│   197 │   │   if self.prompt.output_parser is not None:                                          │
│   198 │   │   │   return [                                                                       │
│ ❱ 199 │   │   │   │   self.prompt.output_parser.parse(res[self.output_key]) for res in result    │
│   200 │   │   │   ]                                                                              │
│   201 │   │   else:                                                                              │
│   202 │   │   │   return result                                                                  │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/output_parsers/regex.py │
│ :28 in parse                                                                                     │
│                                                                                                  │
│   25 │   │   │   return {key: match.group(i + 1) for i, key in enumerate(self.output_keys)}      │
│   26 │   │   else:                                                                               │
│   27 │   │   │   if self.default_output_key is None:                                             │
│ ❱ 28 │   │   │   │   raise ValueError(f"Could not parse output: {text}")                         │
│   29 │   │   │   else:                                                                           │
│   30 │   │   │   │   return {                                                                    │
│   31 │   │   │   │   │   key: text if key == self.default_output_key else ""                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Could not parse output: 

flaviadeutsch avatar May 02 '23 11:05 flaviadeutsch

I wanted to share that I am also encountering the same issue with the load_qa_chain function when using the map_rerank chain type with a local HuggingFace model. I'm waiting for a fix from the developers.

aravind-selvam avatar May 07 '23 07:05 aravind-selvam

Experiencing the same issue. The local HuggingFace embedding model used is 'sentence-transformers/all-mpnet-base-v2' and the base model is Dolly.

cmazzoni87 avatar May 15 '23 17:05 cmazzoni87

same for me cmazzoni87

keanduffey avatar May 16 '23 00:05 keanduffey

@keanduffey it's exactly the same lines as the ones from @flaviadeutsch

cmazzoni87 avatar May 16 '23 04:05 cmazzoni87

I'm also facing the exact same issue when using load_qa_with_sources_chain with map_rerank and an OpenAI model. Do let me know if anyone knows how to resolve it.

Raji635 avatar Jun 05 '23 08:06 Raji635

Having the same issue too... I tried an OutputParser from LangChain but still wasn't able to resolve the issue.
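
For reference, here's roughly what I tried (a sketch; the regex and key names mirror the default map_rerank prompt, so verify against your version). Setting default_output_key stops the parser from raising, but the score comes back empty, so the rerank step still can't sort the results:

from langchain.output_parsers import RegexParser

# With default_output_key set, an unmatched completion is returned as
# {"answer": <raw text>, "score": ""} instead of raising ValueError
# (see regex.py lines 27-31 in the traceback above).
lenient_parser = RegexParser(
    regex=r"(.*?)\nScore: (\d.*)",
    output_keys=["answer", "score"],
    default_output_key="answer",
)
# Pass lenient_parser as output_parser when constructing the PROMPT template.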

andysingal avatar Jun 11 '23 13:06 andysingal

I can run load_qa_chain with map_rerank on Google Colab, but it fails in my local Jupyter notebook.

The error msg: ValueError: Could not parse output: AI assistants can help provide personalized offerings and tailored messaging to customers, enrich service interactions, and predict customer needs based on profile data. Score: 100
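
Interestingly, the completion does contain a score. I tried reproducing the match outside the chain (a sketch; the pattern is what the default map_rerank RegexParser uses in my install and may differ in yours). It looks like the default regex requires a newline before "Score:", so a completion that puts the score on the same line as the answer fails to parse (assuming the score really was inline and the newline wasn't just lost in the paste):

import re

# Pattern from the default map_rerank RegexParser (illustrative, version-dependent).
pattern = r"(.*?)\nScore: (\d.*)"

# My model's completion, with the score inline rather than on its own line.
text = ("AI assistants can help provide personalized offerings ... "
        "predict customer needs based on profile data. Score: 100")

# regex.py (see the traceback above) uses re.search; no match means
# "Could not parse output".
print(re.search(pattern, text))  # None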

Does anyone know the solution?

anggoro-yn avatar Jun 15 '23 16:06 anggoro-yn

With map_reduce, it works perfectly. Somehow it fails on map_rerank.

anggoro-yn avatar Jun 15 '23 16:06 anggoro-yn

I recently got a similar "Could not parse output" error trying to implement an agent, again using the Dolly v2 LLM. After reading up on that issue, it seems this may just be related to lower-powered LLMs that can't produce the text format the prompt expects, and I'm wondering if that's all that's going on with map_rerank too. Although if that were all it is, I would have thought @Raji635, using OpenAI's model, wouldn't have hit the same issue.
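
One way to check might be to call the underlying LLMChain with plain apply(), which skips the output parser, and eyeball the raw completions (a sketch; the llm_chain attribute comes from the traceback above, and "context" is the document variable the default QA map_rerank prompt uses, so adjust both if yours differ):

# Inspect raw completions without parsing; "text" is LLMChain's default output key.
raw = chain.llm_chain.apply(
    [{"context": d.page_content, "question": query} for d in search_docs_Documents]
)
for r in raw:
    print(repr(r["text"]))  # does each completion end with "\nScore: <n>"?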

keanduffey avatar Jun 18 '23 14:06 keanduffey

Can confirm same issue with Falcon-7b-instruct on Jupyter Notebook

Object-Oriented101 avatar Jul 01 '23 21:07 Object-Oriented101

Happening with Flan-T5-large too

rjtmehta99 avatar Aug 30 '23 09:08 rjtmehta99

Hi, @flaviadeutsch! I'm Dosu, and I'm here to help the LangChain team manage our backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, the issue you reported is that load_qa_chain with chain_type="map_rerank" raises ValueError: Could not parse output when the model's completion doesn't match the format the prompt's RegexParser expects. Several other users, including aravind-selvam, cmazzoni87, and Raji635, have encountered the same issue with different models. Some users attempted to resolve it with an OutputParser from LangChain, but the issue persists. Additionally, users like anggoro-yn and keanduffey have shared their experiences and possible explanations for the error.

Before we proceed, I wanted to confirm if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding, and please don't hesitate to reach out if you have any further questions or concerns.

Best regards, Dosu

dosubot[bot] avatar Nov 29 '23 16:11 dosubot[bot]