ValueError: invalid literal for int() with base 10: '0<|im_end|>' thrown when using map_rerank
System Info
macOS
Who can help?
No response
Information
- [ ] The official example notebooks/scripts
- [ ] My own modified scripts
Related Components
- [x] LLMs/Chat Models
- [ ] Embedding Models
- [ ] Prompts / Prompt Templates / Prompt Selectors
- [x] Output Parsers
- [ ] Document Loaders
- [ ] Vector Stores / Retrievers
- [ ] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [ ] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
```python
llm = AzureOpenAI(deployment_name=deployment, model_name="text-davinci-003", temperature=0, max_tokens=500)
llm_chain = load_qa_chain(llm, verbose=True, chain_type="map_rerank")
```

When I run

```python
ch = llm_chain.run(input_documents=context, question=question)
```

it throws an exception:

```
zip(typed_results, docs), key=lambda x: -int(x[0][self.rank_key])
ValueError: invalid literal for int() with base 10: '0<|im_end|>'
```
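For context, map_rerank ranks the per-document answers by casting each parsed score to an integer. A minimal sketch of what goes wrong (the dict below is illustrative, but the failing literal is taken verbatim from the traceback):

```python
# Minimal sketch of the failing sort inside map_rerank's _process_results.
# Azure's completion ends with the ChatML end-of-message token, so the
# parsed score string is "0<|im_end|>" instead of "0".
typed_results = [{"answer": "I don't know.", "score": "0<|im_end|>"}]  # illustrative
sorted_results = sorted(typed_results, key=lambda x: -int(x["score"]))
# ValueError: invalid literal for int() with base 10: '0<|im_end|>'
```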
Expected behavior
The chain should run and return an answer instead of raising. This is the same setup as in the Reproduction section above; when I change chain_type to "stuff" it works, but I want to use map_rerank. Can anyone help?
Generated by a 🤖
Answer
The issue seems to be related to the parsing of the output from the language model. The output contains a special token, `<|im_end|>` (visible in the error message), which is not stripped before the score is converted to an integer, so the `int()` call fails.
This response is meant to be helpful and save users time. It is not meant to be a precise solution.
Please leave a 👍 if this is helpful and 👎 if it is irrelevant
This token is returned by AzureOpenAI; it is documented in the Azure docs:
https://learn.microsoft.com/en-us/azure/cognitive-services/openai/chatgpt-quickstart?tabs=command-line&pivots=programming-language-studio
It happens whenever we use map_rerank, so I think this is a common issue.
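One possible workaround (a sketch, not a verified fix: whether `model_kwargs` forwards `stop` to the completions API depends on the langchain version) is to tell the endpoint to stop before it emits the ChatML token:

```python
from langchain.llms import AzureOpenAI

# Sketch: "stop" is a standard OpenAI completions parameter. Passing it via
# model_kwargs assumes this langchain version forwards extra kwargs unchanged.
# With the stop sequence set, the completion should end before "<|im_end|>"
# ever reaches the output parser that map_rerank uses.
llm = AzureOpenAI(
    deployment_name=deployment,  # same deployment variable as in the repro above
    model_name="text-davinci-003",
    temperature=0,
    max_tokens=500,
    model_kwargs={"stop": ["<|im_end|>"]},
)
```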
@zhaoxin-jia-tfs @devstein any update on this?
I'm also experiencing the same issue when I use chain types other than 'stuff'; it only happens with non-stuff chain types. It would be helpful if this got some attention, as it appears to be a common issue. @hwchase17 @eyurtsev @nfcampos
```
File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/langchain/chains/combine_documents/map_rerank.py:192, in MapRerankDocumentsChain._process_results

ValueError: invalid literal for int() with base 10:
```
For some days now I've been trying to solve this issue when using map_rerank: I can't retrieve the confidence score for the answer. It looks like it's trying to convert the string to an integer, and I get an error. I couldn't find a way to get the raw string back.
So I tried to use RegexParser with a custom prompt as shown in the documentation, but I still get the same error. Here is the example:
```python
from langchain.output_parsers import RegexParser
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

output_parser = RegexParser(
    regex=r"(.*?)\nScore: (.*)",
    output_keys=["answer", "score"],
)

prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

In addition to giving an answer, also return a score of how fully it answered the user's question. This should be in the following format:

Question: [question here]
Helpful Answer In Italian: [answer here]
Score: [score between 0 and 100]

Begin!

Context:
{context}

Question: {question}
Helpful Answer In Italian:"""

PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "question"],
    output_parser=output_parser,
)

chain = load_qa_chain(
    OpenAI(temperature=0),
    chain_type="map_rerank",
    return_intermediate_steps=True,
    prompt=PROMPT,
)
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
```
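The `(.*)` in that regex captures everything to the end of the line, including the trailing `<|im_end|>` token, which is why the same `int()` error comes back. Two hedged variations that should keep the token out of the score (the `StrippingRegexParser` subclass is my own sketch, not a langchain class):

```python
from langchain.output_parsers import RegexParser

# Variation 1: capture digits only, so a trailing "<|im_end|>" token
# can never end up inside the "score" group.
output_parser = RegexParser(
    regex=r"(.*?)\nScore: (\d+)",
    output_keys=["answer", "score"],
)

# Variation 2: a hypothetical subclass that removes the ChatML token
# from the raw completion before the regex runs.
class StrippingRegexParser(RegexParser):
    def parse(self, text: str) -> dict:
        return super().parse(text.replace("<|im_end|>", "").strip())
```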
https://python.langchain.com/docs/use_cases/question_answering/how_to/question_answering
Hi, @zhaoxin-jia-tfs,
I'm helping the LangChain team manage their backlog and am marking this issue as stale. The issue you raised involves a ValueError when using the "map_rerank" chain_type in the AzureOpenAI function. Other users have also reported experiencing the same issue, and there has been some discussion around a potential explanation related to the parsing of the output from the language model. As of now, the issue remains unresolved and is awaiting further updates from the maintainers.
Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, kindly let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and cooperation.