
llm-math raising an issue

Open cailynyongyong opened this issue 1 year ago • 9 comments

I'm testing out the tutorial code for Agents:

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("What was the high temperature in SF yesterday in Fahrenheit? What is that number raised to the .023 power?")

And so far it generates the result:

> Entering new AgentExecutor chain...
 I need to find the temperature first, then use the calculator to raise it to the .023 power.
Action: Search
Action Input: "High temperature in SF yesterday"
Observation: High: 60.8ºf @3:10 PM Low: 48.2ºf @2:05 AM Approx.
Thought: I need to convert the temperature to a number
Action: Calculator
Action Input: 60.8

But then it raises an error instead of calculating 60.8^.023:

raise ValueError(f"unknown format from LLM: {llm_output}")
ValueError: unknown format from LLM: This is not a math problem and cannot be solved using the numexpr library.

What's the reason behind this error?
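(For reference, the step that failed is just a power expression; evaluating it directly in Python shows the answer the agent should have produced. A quick sketch, outside the chain:)

```python
# The math step the agent should hand to the calculator.
# numexpr also accepts this form, since ** is a supported operator.
high_temp_f = 60.8
print(high_temp_f ** 0.023)  # ≈ 1.0991
```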

cailynyongyong avatar Apr 18 '23 06:04 cailynyongyong

Hi, as in issue #3071, I think it might be because there are two temperatures, one low and one high, so the LLM got confused about which one to perform the math operation on and gave this error.

kanukolluGVT avatar Apr 18 '23 08:04 kanukolluGVT

hmm.. but in the example answer they give in the documentation:

> Entering new AgentExecutor chain...
 I need to find the temperature first, then use the calculator to raise it to the .023 power.
Action: Search
Action Input: "High temperature in SF yesterday"
Observation: San Francisco Temperature Yesterday. Maximum temperature yesterday: 57 °F (at 1:56 pm) Minimum temperature yesterday: 49 °F (at 1:56 am) Average temperature ...
Thought: I now have the temperature, so I can use the calculator to raise it to the .023 power.
Action: Calculator
Action Input: 57^.023
Observation: Answer: 1.0974509573251117

It works fine there even though they also get two temp values, high and low....

cailynyongyong avatar Apr 18 '23 09:04 cailynyongyong

I am having the same issue; the correct answer only appears from time to time. I'll leave my traces here in case someone can use them for debugging.

I am using these tools and prompts:

agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

agent.run("Who is the current leader of Japan? What is the largest prime number that is smaller that their age? Just say the number.")

Correct:

> Entering new AgentExecutor chain...
 I need to search for who the leader of Japan is and then find the prime number
Action: Search
Action Input: "Current Leader of Japan"
Observation: Fumio Kishida
Thought: I now need to find the largest prime number smaller than their age
Action: Calculator
Action Input: Kishida's age (68)
Observation: Answer: 68
Thought: I now know the final answer
Final Answer: 67

Incorrect:

> Entering new AgentExecutor chain...
 I need to figure out who the leader of Japan is, then use the Calculator to find the largest primer number that is smaller than their age
Action: Search
Action Input: Who is the current leader of Japan?
Observation: Fumio Kishida
Thought: I need to use the Calculator to find the largest primer number that is smaller than Fumio Kishida's age
Action: Calculator
Action Input: Largest primer number smaller that 68
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.9/dist-packages/langchain/chains/llm_math/base.py in _evaluate_expression(self, expression)
     58             output = str(
---> 59                 numexpr.evaluate(
     60                     expression.strip(),

17 frames
<expr> in <module>

TypeError: 'VariableNode' object is not callable

Another incorrect answer:

> Entering new AgentExecutor chain...
 I will need to search for the leader of Japan and then figure out a way to calculate the prime number smaller than their age.
Action: Search
Action Input: Leader of Japan
Observation: Fumio Kishida
Thought: I need to find out their age
Action: Search
Action Input: Age of Fumio Kishida
Observation: 65 years
Thought: I now need to use a calculator to find out the largest prime number smaller than 65
Action: Calculator
Action Input: 65
Observation: Answer: 65
Thought: That was not the answer I was looking for, I need to find the largest prime number smaller than 65
Action: Calculator
Action Input: Largest prime number smaller than 65
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.9/dist-packages/langchain/chains/llm_math/base.py in _evaluate_expression(self, expression)
     58             output = str(
---> 59                 numexpr.evaluate(
     60                     expression.strip(),

17 frames
<expr> in <module>

TypeError: 'VariableNode' object is not callable

Another one:

> Entering new AgentExecutor chain...
 I need to find out who the leader is, as well as their age.
Action: Search
Action Input: 'current leader of Japan'
Observation: Fumio Kishida
Thought: I need to find their age.
Action: Search
Action Input: 'Fumio Kishida age'
Observation: 65 years
Thought: I need to find the largest prime number that is smaller than 65.
Action: Calculator
Action Input: 'prime numbers smaller than 65'
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-42-08418b732b7d> in <cell line: 1>()
----> 1 agent.run("Who is the current leader of Japan? What is the largest prime number that is smaller that their age? Just say the number.")

12 frames
/usr/local/lib/python3.9/dist-packages/langchain/chains/llm_math/base.py in _process_llm_result(self, llm_output)
     84             answer = "Answer: " + llm_output.split("Answer:")[-1]
     85         else:
---> 86             raise ValueError(f"unknown format from LLM: {llm_output}")
     87         return {self.output_key: answer}
     88 

ValueError: unknown format from LLM: This question does not have a single line mathematical expression that can be executed using Python's numexpr library.
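As an aside, the lookup itself is trivial outside the agent; a hypothetical plain-Python helper (the name is mine, not part of LangChain) gives the answers these traces were after:

```python
def largest_prime_below(n: int) -> int:
    """Return the largest prime strictly smaller than n."""
    def is_prime(k: int) -> bool:
        if k < 2:
            return False
        # Trial division up to sqrt(k) is plenty for numbers this small.
        return all(k % d for d in range(2, int(k ** 0.5) + 1))

    for candidate in range(n - 1, 1, -1):
        if is_prime(candidate):
            return candidate
    raise ValueError(f"no prime below {n}")

print(largest_prime_below(68))  # 67, matching the 'Correct' run above
print(largest_prime_below(65))  # 61
```

This kind of function cannot be expressed as a single numexpr expression, which is why the Calculator tool keeps failing on it.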

gustavovargas avatar Apr 18 '23 15:04 gustavovargas

Ran into this issue as well. Resolved by changing the temperature from 0 to 0.9.

chasen-bettinger avatar Apr 22 '23 20:04 chasen-bettinger

I'm getting the same issue as well, not sure why or what the solution is. It's almost 100% consistent.

justjoehere avatar May 05 '23 16:05 justjoehere

I also have the same issue:

llm = OpenAI(temperature=0,
             openai_api_key=OPENAI_API_KEY,
             openai_api_base=OPENAI_API_BASE)

buildin_tools = load_tools(["llm-math"], llm=llm)
tools = [PreOwnedHouseHistoryDealsTool()] + buildin_tools
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, max_iterations=5)

Thought: I need to translate this into a human-readable format
Action: Calculator
Action Input: Round off 29077.702127659573 to the nearest integer
[2023-05-06 18:01:15,574] ERROR in app: Exception on /d4r_chat/api/v1.0/agent [POST]
Traceback (most recent call last):
  File "D:\SynologyDrive\Drive\desk\d4r-kb-indexer\venv\lib\site-packages\langchain\chains\llm_math\base.py", line 80, in _evaluate_expression
    numexpr.evaluate(
  File "D:\SynologyDrive\Drive\desk\d4r-kb-indexer\venv\lib\site-packages\numexpr\necompiler.py", line 817, in evaluate
    _names_cache[expr_key] = getExprNames(ex, context)
  File "D:\SynologyDrive\Drive\desk\d4r-kb-indexer\venv\lib\site-packages\numexpr\necompiler.py", line 704, in getExprNames
    ex = stringToExpression(text, {}, context)
  File "D:\SynologyDrive\Drive\desk\d4r-kb-indexer\venv\lib\site-packages\numexpr\necompiler.py", line 289, in stringToExpression
    ex = eval(c, names)
  File "<expr>", line 1, in <module>
TypeError: 'VariableNode' object is not callable

itrowa avatar May 06 '23 10:05 itrowa

Ran into this issue as well. Resolved by changing the temperature from 0 to 0.9.

It works

XinSong avatar May 14 '23 09:05 XinSong

Changing temperature from 0 to something else resolves the issue.

hossein-taghizad avatar May 20 '23 16:05 hossein-taghizad

For me it doesn't work with the GPT-4 model no matter how I adjust the temperature. Code:

gpt4 = ChatOpenAI(temperature=0.5,model_name="gpt-4")
agent_004 = initialize_agent(tools, gpt4, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
with get_openai_callback() as cb:
    # output = agent_004.run("What is 32000 multiplied by 650 in the order of magnitude?")
    # since the previous output will throw an error at rounding, I test the following specifically
    output = agent_004.run(f"Round this number to an integer: {7.318063349}")
    print(cb)

Console:

> Entering new AgentExecutor chain...
I need to round the number to the nearest integer.
Action: Calculator
Action Input: round(7.318063349)

Error: ValueError: LLMMathChain._evaluate("round(7.318063349)") raised error: 'VariableNode' object is not callable. Please try again with a valid numerical expression
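One possible workaround (my own suggestion, not verified against the chain itself): numexpr's supported-functions list does include floor and ceil, so asking for floor(x + 0.5) instead of round(x) expresses round-half-up in terms numexpr accepts. The equivalence, sketched in plain Python:

```python
import math

def round_via_floor(x: float) -> int:
    # floor(x + 0.5) expresses round-half-up using only functions that
    # numexpr supports (round() is not among them, floor() is).
    return math.floor(x + 0.5)

print(round_via_floor(7.318063349))  # 7
```

Half-way cases differ from Python's round(), which rounds half to even, but for a calculator tool that is usually acceptable.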

TX1999-pro avatar Jun 02 '23 20:06 TX1999-pro

When dealing with math questions, you want to set the temperature close to 0 to prevent hallucination. Instead, clarify in the tool description that the model should only input math expressions:

llm_math_chain = LLMMathChain(llm=llm, verbose=True)
math_tool = Tool.from_function(
        func=llm_math_chain.run,
        name="Calculator",
        description="Useful for when you need to answer questions about math. This tool is only for math questions and nothing else. Only input math expressions.",
    )

For the exception TypeError: 'VariableNode' object is not callable: the LLM-math chain uses numexpr to evaluate the expression. numexpr doesn't support all operations yet; unfortunately, round() is one of the missing ones: https://numexpr.readthedocs.io/en/latest/user_guide.html#supported-functions

Instead I created a custom tool that evaluates the expression with eval:

from langchain.tools import BaseTool

class CalculatorTool(BaseTool):
    name = "CalculatorTool"

    description = """
    Useful for when you need to answer questions about math.
    This tool is only for math questions and nothing else.
    Formulate the input as python code.
    """

    def _run(self, question: str):
        return eval(question)

    def _arun(self, question: str):
        raise NotImplementedError("This tool does not support async")
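Bare eval on model output executes arbitrary code, which is exactly the prompt-injection risk mentioned later in this thread. A somewhat safer sketch (my own variant, not a LangChain API) restricts evaluation to arithmetic via the ast module and whitelists a few functions, including the round() that numexpr lacks:

```python
import ast
import math
import operator

# Whitelisted operators; anything else (attributes, names, imports) is rejected.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}
# A few whitelisted functions, including round(), which numexpr does not support.
_FUNCS = {"round": round, "floor": math.floor, "ceil": math.ceil, "sqrt": math.sqrt}

def safe_eval(expr: str):
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCS):
            return _FUNCS[node.func.id](*[walk(a) for a in node.args])
        raise ValueError(f"disallowed expression: {ast.dump(node)}")
    return walk(ast.parse(expr, mode="eval"))
```

Dropping safe_eval in place of eval in a tool's _run keeps the flexibility while refusing anything that isn't plain arithmetic.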

Niwood avatar Jun 04 '23 09:06 Niwood

Faced the same issue with Cohere. I was trying to learn about agents; here's my code:

from langchain.llms import Cohere
from langchain.agents import load_tools, initialize_agent

llm = Cohere(cohere_api_key=cohere_api_key, temperature=0.1)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)

agent = initialize_agent(tools=tools, llm=llm, agent="zero-shot-react-description", verbose=True)
agent.run("what is the name of the person who played the character of 'skyler white' in breaking bad and what are some other movies by the same actor. Also, how much time in years is 1 billion seconds?")

OUTPUT error:

ValueError: unknown format from LLM: 
60 * 60 * 24 * 365

...numexpr.evaluate("60 * 60 * 24 * 365")

Had to modify the question a few times for it to arrive at the "right thought"; when it finally did, it ended up with the error above. Initially I had not set any temperature (the default for Cohere is 0.75, per their documentation).

I'm totally lost, as it just doesn't arrive at the right answer to the first question itself: instead of "getting" that the role was played by Anna Gunn, it keeps throwing out weird info about the show and its episodes, and never reaches the second part of the question (at least not before I exhaust my free-trial usage).

Modified the question to something simpler, agent.run("how many children did 'skyler white' have in breaking bad. Also, how much time in years is 1 billion seconds?"), and now it returns irrelevant information:

> Finished chain.

Skyler White (née Lambert) is a fictional character in Breaking Bad, portrayed by Anna Gunn.

nothing about the billion-seconds-in-years part (besides answering something irrelevant instead of the number of children in the show)

pleb21 avatar Jun 10 '23 15:06 pleb21

I gave up on the LangChain math tool because I think it is not easily tunable, making agents very difficult to 'tame'. It would be better to write my own math tools to solve very specific problems and execute them myself. However, that would be dangerous if any of the answers were prompt-injected. The problem is that the LLM has returned an expression the numexpr package does not support; that's an example of an inherent limitation, namely hallucination. Setting a higher temperature won't do us any good, as we would just be hoping the model jumps around until it happens to write a correct expression.

I will try this implementation with a customised tool.

When dealing with math questions, you want to set the temperature close to 0 to prevent hallucination. Instead clarify in the tool description that the model only should input math expressions:

llm_math_chain = LLMMathChain(llm=llm, verbose=True)
math_tool = Tool.from_function(
        func=llm_math_chain.run,
        name="Calculator",
        description="Useful for when you need to answer questions about math. This tool is only for math questions and nothing else. Only input math expressions.",
    )

For exception TypeError: 'VariableNode' object is not callable: LLM-math chain uses numexpr to evaluate the expression. numexpr doesn't support all operations yet, unfortunaltely round() is one among others: https://numexpr.readthedocs.io/en/latest/user_guide.html#supported-functions

Instead I created a custom tool that evaluates the expression with eval:

class CalculatorTool(BaseTool):
    name = "CalculatorTool"
    
    description = """
    Useful for when you need to answer questions about math.
    This tool is only for math questions and nothing else.
    Formulate the input as python code.
    """

    def _run(self, question: str):
        return eval(question)
    
    def _arun(self, value: Union[int, float]):
        raise NotImplementedError("This tool does not support async")

TX1999-pro avatar Jun 10 '23 23:06 TX1999-pro

I gave up on the LangChain math tool because I think it is not easily tunable, making agents very difficult to 'tame'. It would be better to write my own math tools to solve very specific problems and execute them myself. However, that would be dangerous if any of the answers were prompt-injected.

Too early for me to give up on it, I barely understand the details right now. Thanks though.

The problem is that the LLM has returned an expression the numexpr package does not support; that's an example of an inherent limitation, namely hallucination. Setting a higher temperature won't do us any good, as we would just be hoping the model jumps around until it happens to write a correct expression.

thanks for breaking it down, I get it now. kind of.

I will try this implementation with a customised tool.

When dealing with math questions, you want to set the temperature close to 0 to prevent hallucination. Instead clarify in the tool description that the model only should input math expressions:

llm_math_chain = LLMMathChain(llm=llm, verbose=True)
math_tool = Tool.from_function(
        func=llm_math_chain.run,
        name="Calculator",
        description="Useful for when you need to answer questions about math. This tool is only for math questions and nothing else. Only input math expressions.",
    )

For exception TypeError: 'VariableNode' object is not callable: LLM-math chain uses numexpr to evaluate the expression. numexpr doesn't support all operations yet, unfortunaltely round() is one among others: https://numexpr.readthedocs.io/en/latest/user_guide.html#supported-functions Instead I created a custom tool that evaluates the expression with eval:

class CalculatorTool(BaseTool):
    name = "CalculatorTool"
    
    description = """
    Useful for when you need to answer questions about math.
    This tool is only for math questions and nothing else.
    Formulate the input as python code.
    """

    def _run(self, question: str):
        return eval(question)
    
    def _arun(self, value: Union[int, float]):
        raise NotImplementedError("This tool does not support async")

hopefully, will explore more on this. As of now, I was just experimenting (read: trying to mash up various things without knowing what I'm doing most of the time)

pleb21 avatar Jun 11 '23 00:06 pleb21

In my example, the agent somehow confused the first step (Calculator instead of Search) and then just failed. Sometimes, though, the Calculator alone is sufficient with the same prompt. Why did the agent skip Search in this run?

llm = OpenAI(openai_api_key=os.environ["OPENAI_API_KEY"], temperature=0.7)

tools = load_tools(["serpapi", "llm-math"], llm=llm)

agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

agent.run("My monthly salary is 10000 KES, if i work for 10 months. How much is my total salary in USD in those 10 months.")

> Entering new AgentExecutor chain...
 I need to convert KES to USD
Action: Calculator
Action Input: 10000 KES to USD 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[9], line 1
----> 1 agent.run("My monthly salary is 10000 KES, if i work for 10 months. How much is my total salary in USD in those 10 months.")

File /opt/homebrew/lib/python3.10/site-packages/langchain/chains/base.py:451, in Chain.run(self, callbacks, tags, metadata, *args, **kwargs)
    449     if len(args) != 1:
    450         raise ValueError("`run` supports only one positional argument.")
--> 451     return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
    452         _output_key
    453     ]
    455 if kwargs and not args:
    456     return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
    457         _output_key
    458     ]

File /opt/homebrew/lib/python3.10/site-packages/langchain/chains/base.py:258, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    256 except (KeyboardInterrupt, Exception) as e:
    257     run_manager.on_chain_error(e)
--> 258     raise e
    259 run_manager.on_chain_end(outputs)
    260 final_outputs: Dict[str, Any] = self.prep_outputs(
    261     inputs, outputs, return_only_outputs
    262 )

File /opt/homebrew/lib/python3.10/site-packages/langchain/chains/base.py:252, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    246 run_manager = callback_manager.on_chain_start(
    247     dumpd(self),
    248     inputs,
    249 )
    250 try:
    251     outputs = (
--> 252         self._call(inputs, run_manager=run_manager)
    253         if new_arg_supported
    254         else self._call(inputs)
    255     )
    256 except (KeyboardInterrupt, Exception) as e:
    257     run_manager.on_chain_error(e)

File /opt/homebrew/lib/python3.10/site-packages/langchain/agents/agent.py:1029, in AgentExecutor._call(self, inputs, run_manager)
   1027 # We now enter the agent loop (until it returns something).
   1028 while self._should_continue(iterations, time_elapsed):
-> 1029     next_step_output = self._take_next_step(
   1030         name_to_tool_map,
   1031         color_mapping,
   1032         inputs,
   1033         intermediate_steps,
   1034         run_manager=run_manager,
   1035     )
   1036     if isinstance(next_step_output, AgentFinish):
   1037         return self._return(
   1038             next_step_output, intermediate_steps, run_manager=run_manager
   1039         )

File /opt/homebrew/lib/python3.10/site-packages/langchain/agents/agent.py:890, in AgentExecutor._take_next_step(self, name_to_tool_map, color_mapping, inputs, intermediate_steps, run_manager)
    888         tool_run_kwargs["llm_prefix"] = ""
    889     # We then call the tool on the tool input to get an observation
--> 890     observation = tool.run(
    891         agent_action.tool_input,
    892         verbose=self.verbose,
    893         color=color,
    894         callbacks=run_manager.get_child() if run_manager else None,
    895         **tool_run_kwargs,
    896     )
    897 else:
    898     tool_run_kwargs = self.agent.tool_run_logging_kwargs()

File /opt/homebrew/lib/python3.10/site-packages/langchain/tools/base.py:349, in BaseTool.run(self, tool_input, verbose, start_color, color, callbacks, tags, metadata, **kwargs)
    347 except (Exception, KeyboardInterrupt) as e:
    348     run_manager.on_tool_error(e)
--> 349     raise e
    350 else:
    351     run_manager.on_tool_end(
    352         str(observation), color=color, name=self.name, **kwargs
    353     )

File /opt/homebrew/lib/python3.10/site-packages/langchain/tools/base.py:321, in BaseTool.run(self, tool_input, verbose, start_color, color, callbacks, tags, metadata, **kwargs)
    318 try:
    319     tool_args, tool_kwargs = self._to_args_and_kwargs(parsed_input)
    320     observation = (
--> 321         self._run(*tool_args, run_manager=run_manager, **tool_kwargs)
    322         if new_arg_supported
    323         else self._run(*tool_args, **tool_kwargs)
    324     )
    325 except ToolException as e:
    326     if not self.handle_tool_error:

File /opt/homebrew/lib/python3.10/site-packages/langchain/tools/base.py:491, in Tool._run(self, run_manager, *args, **kwargs)
    488 """Use the tool."""
    489 new_argument_supported = signature(self.func).parameters.get("callbacks")
    490 return (
--> 491     self.func(
    492         *args,
    493         callbacks=run_manager.get_child() if run_manager else None,
    494         **kwargs,
    495     )
    496     if new_argument_supported
    497     else self.func(*args, **kwargs)
    498 )

File /opt/homebrew/lib/python3.10/site-packages/langchain/chains/base.py:451, in Chain.run(self, callbacks, tags, metadata, *args, **kwargs)
    449     if len(args) != 1:
    450         raise ValueError("`run` supports only one positional argument.")
--> 451     return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
    452         _output_key
    453     ]
    455 if kwargs and not args:
    456     return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
    457         _output_key
    458     ]

File /opt/homebrew/lib/python3.10/site-packages/langchain/chains/base.py:258, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    256 except (KeyboardInterrupt, Exception) as e:
    257     run_manager.on_chain_error(e)
--> 258     raise e
    259 run_manager.on_chain_end(outputs)
    260 final_outputs: Dict[str, Any] = self.prep_outputs(
    261     inputs, outputs, return_only_outputs
    262 )

File /opt/homebrew/lib/python3.10/site-packages/langchain/chains/base.py:252, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    246 run_manager = callback_manager.on_chain_start(
    247     dumpd(self),
    248     inputs,
    249 )
    250 try:
    251     outputs = (
--> 252         self._call(inputs, run_manager=run_manager)
    253         if new_arg_supported
    254         else self._call(inputs)
    255     )
    256 except (KeyboardInterrupt, Exception) as e:
    257     run_manager.on_chain_error(e)

File /opt/homebrew/lib/python3.10/site-packages/langchain/chains/llm_math/base.py:149, in LLMMathChain._call(self, inputs, run_manager)
    143 _run_manager.on_text(inputs[self.input_key])
    144 llm_output = self.llm_chain.predict(
    145     question=inputs[self.input_key],
    146     stop=["```output"],
    147     callbacks=_run_manager.get_child(),
    148 )
--> 149 return self._process_llm_result(llm_output, _run_manager)

File /opt/homebrew/lib/python3.10/site-packages/langchain/chains/llm_math/base.py:112, in LLMMathChain._process_llm_result(self, llm_output, run_manager)
    110     answer = "Answer: " + llm_output.split("Answer:")[-1]
    111 else:
--> 112     raise ValueError(f"unknown format from LLM: {llm_output}")
    113 return {self.output_key: answer}

ValueError: unknown format from LLM: This question cannot be solved using the numexpr library, as it is not a mathematical expression.
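The branch raising this error is visible in the traceback above; a simplified sketch of the parsing logic (reconstructed from the quoted lines of llm_math/base.py, with the fenced-expression branch omitted):

```python
def process_llm_result(llm_output: str) -> str:
    # Simplified from the _process_llm_result branch shown in the traceback:
    # only output containing an "Answer:" line (or a fenced text expression,
    # omitted in this sketch) is accepted; anything else raises the
    # ValueError seen throughout this thread.
    if "Answer:" in llm_output:
        return "Answer: " + llm_output.split("Answer:")[-1]
    raise ValueError(f"unknown format from LLM: {llm_output}")
```

So any LLM reply that explains instead of computing, like "This question cannot be solved using the numexpr library", falls straight through to the ValueError.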

kirdin1 avatar Aug 08 '23 08:08 kirdin1

Hi, @cailynyongyong! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, the issue is related to encountering an error when trying to calculate a math problem using the numexpr library. There have been suggestions from users to change the temperature from 0 to 0.9 as a workaround, as well as creating a custom tool that evaluates the expression with eval instead of using the LLM-math chain.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain repository!

dosubot[bot] avatar Nov 07 '23 16:11 dosubot[bot]

@kirdin1 - Did you ever resolve your issues? Mine are very similar, also using llm-math and serpapi. I get partial execution until a second calculator operation, then I get all the error messages from various base.py files.

Robyred avatar Nov 16 '23 05:11 Robyred