langchain PythonREPL agent toolkit does not recognize PythonREPL as a valid tool

System Info

Langchain version: 0.0.172
Platform: Linux (Ubuntu)

Who can help?

@vowelparrot

Information

[ ] The official example notebooks/scripts
[ ] My own modified scripts

Related Components

[ ] LLMs/Chat Models
[ ] Embedding Models
[ ] Prompts / Prompt Templates / Prompt Selectors
[ ] Output Parsers
[ ] Document Loaders
[ ] Vector Stores / Retrievers
[ ] Memory
[X] Agents / Agent Executors
[X] Tools / Toolkits
[ ] Chains
[ ] Callbacks/Tracing
[ ] Async

Reproduction

When trying to use a python agent for simple queries, the agent often does not recognize Python REPL as a valid tool:

> Entering new AgentExecutor chain...
I can write a function to generate the nth fibonacci number and then call it with n=4.
Action: [Python REPL]
Action Input:

def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(4))

Observation: [Python REPL] is not a valid tool, try another one.

My instantiation of the agent:

custom_prefix = PREFIX + "When writing a code block, do not include the word 'python' after the first three ticks. You must graph your findings and save a .png of the graph on the local file system at [a path on my local machine]. The corpus consists of .txt files at this directory: [another path on my machine]."

python_agent = create_python_agent(llm=llm, tool=PythonREPLTool(), verbose=True, max_tokens=1000, prefix=custom_prefix)

python_agent("What are the top 10 bi-grams in the corpus? Only parse the .txt files. Translate the final n-grams to English for the chart.")

The model then gets stuck in a loop trying to use Python REPL. I included the instructions in custom_prefix because the model also repeatedly hit this error:

> Entering new AgentExecutor chain...
I need to read in all the .txt files in the corpus and tokenize them into bi-grams. Then I need to count the frequency of each bi-gram and return the top 10.
Action: Python REPL
Action Input:
```python
import os
import nltk
from collections import Counter
from nltk import word_tokenize
from nltk.util import ngrams

corpus_dir = a directory on my machine
files = [os.path.join(corpus_dir, f) for f in os.listdir(corpus_dir) if f.endswith('.txt')]

bi_grams = []
for file in files:
    with open(file, 'r') as f:
        text = f.read()
        tokens = word_tokenize(text)
        bi_grams += ngrams(tokens, 2)

bi_gram_freq = Counter(bi_grams)
top_10_bi_grams = bi_gram_freq.most_common(10)
...
print(top_10_bi_grams)
```
NameError("name 'python' is not defined")

Expected behavior

I expect the Python agent to recognize PythonREPL as a valid tool. In fact, sometimes it does! But more often than not, it does not recognize PythonREPL as a tool. The query I included in the above code snippet worked maybe once in every 50 tries.

May 17 '23 23:05 ethanjevvell

Thanks for flagging. The model being used isn't outputting the expected format.

In the first case, it's adding brackets [] around the name. You could relax the output parser of the agent to account for this.

In the second case, it's adding the python word in the markdown, which can be handled in the parser as well.

We will work on evaluating the current prompts to make this easier in the future.
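
Both relaxations could be sketched roughly as follows. This is a minimal sketch in plain Python, not LangChain's actual output parser; the helper names `normalize_tool_name` and `strip_code_fence` are hypothetical, and wiring them into an agent's parser is left out.

```python
import re


def normalize_tool_name(raw: str) -> str:
    """Trim whitespace and one layer of surrounding square brackets.

    Hypothetical helper: turns "[Python REPL]" into "Python REPL".
    """
    name = raw.strip()
    if name.startswith("[") and name.endswith("]"):
        name = name[1:-1].strip()
    return name


# Matches an input that is entirely one fenced block, with an optional
# language tag (such as "python") right after the opening ticks.
_FENCE_RE = re.compile(r"^```[A-Za-z0-9_+-]*[ \t]*\n(.*?)\n?```\s*$", re.DOTALL)


def strip_code_fence(raw: str) -> str:
    """Return the code inside a markdown fence, or the input unchanged."""
    text = raw.strip()
    match = _FENCE_RE.match(text)
    return match.group(1) if match else text
```

With these, `normalize_tool_name("[Python REPL]")` yields `Python REPL`, and an Action Input wrapped in a fence tagged `python` is reduced to the bare code before it reaches the REPL.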

May 17 '23 23:05 vowelparrot

Thanks @vowelparrot. This appears to be an issue with the base Python agent toolkit, then, as I am not using any customization (aside from changing the prefix in an attempt to fix the bugs I mentioned).

May 17 '23 23:05 ethanjevvell

What LLM are you using?

May 18 '23 00:05 vowelparrot

@vowelparrot gpt-3.5-turbo

Apologies for not mentioning this sooner

May 18 '23 00:05 ethanjevvell

Ya we have to tune everything for chat models. Turbo doesn't work well with a lot of the older prompts. It's mostly an issue with the agent prompt rather than this specific tool.

Thanks for flagging

May 18 '23 01:05 vowelparrot

it's adding brackets [] around the name. You could relax the output parser of the agent to account for this.

I'm using create_pandas_dataframe_agent() with GPT-3.5 Turbo and the same problem happened. I debugged the agent prompt and found the instruction below.

Action: the action to take, should be one of [python_repl_ast]

I think this caused the problem of "[] around the name". If there is only one tool, the agent prompt should be simpler, like below.

Action: the action to take, should be python_repl_ast
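
A rough sketch of that idea, assuming the Action line of the prompt is built from the list of tool names. The helper `action_instruction` is hypothetical and is not LangChain's own code; it only shows how the single-tool case could drop the brackets.

```python
from typing import Sequence


def action_instruction(tool_names: Sequence[str]) -> str:
    """Build the 'Action:' line of the format instructions.

    Hypothetical helper: with exactly one tool, name it directly so the
    model has no bracketed list to copy into its answer.
    """
    if not tool_names:
        raise ValueError("at least one tool name is required")
    if len(tool_names) == 1:
        return f"Action: the action to take, should be {tool_names[0]}"
    joined = ", ".join(tool_names)
    return f"Action: the action to take, should be one of [{joined}]"
```

Whether this alone stops gpt-3.5-turbo from adding brackets is untested here; it only removes the most obvious cue in the prompt.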

Jun 01 '23 07:06 nai-kon

I'm not sure whether what I've run into is the same issue. The console keeps showing:

Observation: <A sentence of AI's thought> is not a valid tool, try another one.
Thought:

LLM: gpt-3.5-turbo

Using Davinci works like a charm, but it is 10x more expensive, which is not affordable.

I'd appreciate any suggestions or workarounds.

Jun 26 '23 01:06 catskytw

it's adding brackets [] around the name. You could relax the output parser of the agent to account for this.

I'm using create_pandas_dataframe_agent() with GPT-3.5 Turbo and the same problem happened. I debugged the agent prompt and found the instruction below.

Action: the action to take, should be one of [python_repl_ast]

I think this caused the problem of "[] around the name". If there is only one tool, the agent prompt should be simpler, like below.

Action: the action to take, should be python_repl_ast

How do I change the prompt? Thanks.

Sep 05 '23 15:09 K-Jadeja

Hi, @ethanjevvell,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. From what I understand, you reported an issue with the PythonREPL agent toolkit not being consistently recognized by the Python agent, causing errors and the model to get stuck in a loop. Vowelparrot acknowledged the issue and suggested relaxing the output parser of the agent to account for the unexpected format. Other users, such as nai-kon and K-Jadeja, also shared their experiences with similar problems. Vowelparrot mentioned that the issue is likely with the agent prompt rather than the specific tool and that tuning is required for chat models like GPT-3.5 turbo.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and cooperation.

Dec 06 '23 17:12 dosubot[bot]

Use python_repl_ast to solve the equation for ( x ). is not a valid tool, try one of [python_repl_ast]
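
In this case the whole sentence was taken as the tool name even though it mentions a valid tool. One possible workaround, sketched here in plain Python with a hypothetical helper `find_tool_in_text` (not part of LangChain), is to look for a known tool name inside the Action text and accept it only when the match is unambiguous.

```python
from typing import Optional, Sequence


def find_tool_in_text(action_text: str, tool_names: Sequence[str]) -> Optional[str]:
    """Return the single known tool mentioned in action_text, else None.

    Hypothetical helper: a case-insensitive substring search. If zero or
    several tools are mentioned, it refuses to guess and returns None.
    """
    lowered = action_text.lower()
    matches = [name for name in tool_names if name.lower() in lowered]
    return matches[0] if len(matches) == 1 else None
```

Applied to the text above with `["python_repl_ast"]` as the tool list, this returns `python_repl_ast`; the Action Input would still need to be recovered separately.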

Jun 06 '24 13:06 dushyantnagarhumanli
