Stop `execute_shell` going interactive

Open claytondukes opened this issue 1 year ago • 22 comments

Duplicates

  • [X] I have searched the existing issues

Steps to reproduce 🕹

I ran a setup last night to assist in writing some proposals. It got to a point where it thought it needed to edit files and ran gedit (and it's not even running on a desktop OS), then tried to run nano - neither of which would work since it's not capable of using tools like that.

Is there some way to constrain or tell it that using interactive tools like that won't work?

Current behavior 😯

Apparently json was fixed. 
NEXT ACTION:  COMMAND = write_to_file ARGUMENTS = {'file': 'proposal.txt', 'text': 'Proposal Outline:\n1. Abstract\n2. Background\n3. Methodology\n4. Expected Outcomes\n5. Budget'}
SYSTEM:  Command write_to_file returned: File written to successfully.
SBIR AI THOUGHTS: 
REASONING: 
CRITICISM: 
Attempting to fix JSON by finding outermost brackets 
Apparently json was fixed. 
NEXT ACTION:  COMMAND = execute_shell ARGUMENTS = {'command_line': 'gedit proposal.txt'}
Executing command 'gedit proposal.txt' in working directory '/app/auto_gpt_workspace'
SYSTEM:  Command execute_shell returned: STDOUT: b'' STDERR: b'/bin/sh: 1: gedit: not found\n'
SBIR AI THOUGHTS: 
REASONING: 
CRITICISM: 
Attempting to fix JSON by finding outermost brackets 
Apparently json was fixed. 
NEXT ACTION:  COMMAND = execute_shell ARGUMENTS = {'command_line': 'nano proposal.txt'}
Executing command 'nano proposal.txt' in working directory '/app/auto_gpt_workspace'

Expected behavior 🤔

The AI should know, or even infer, that it can't use programs like that.

Your prompt 📝

ai_goals:
  - a Whitepaper for XXX and save the file
  - Create a proposal for the YYY program on "Foo Bar Baz"
  - Do not create files with placeholder text, use write_to_file when you have actual data to put in them
  - keep track of any files that you have created. If you try to access a file that you think you created but it isn't there, create it. If you are still unable to create the file after 2 tries, log an error and discontinue your tasks.
  - if you fail to parse AI output, show the actual AI output for debugging purposes
  - if you start a GPT agent, keep in mind that the GPT agent does not have information after the year 2021 and does not have internet access
  - Learn AAA using  https://<foo>
  - Learn about BBB using https://<bar>
  - Learn about CCC using https://<baz/file.pdf>
  - Find topics that XXX can solve (or aligns with)
  - Useful information for your research;
    XXX's Documentation, https://docs.x.net
    XXX's Website, https://www.x.net
    XXX's API Docs, https://api.x.io

ai_name: Proposal AI
ai_role: an AI designed to help XXX do blah. Specifically, to offer a training course that helps the <redacted> learn, build, deploy, and test <redacted>

claytondukes avatar Apr 14 '23 13:04 claytondukes

lol, yeah i just ran into it trying to use nano as well. I forced nano to quit, and was able to resume though.

Slowly-Grokking avatar Apr 14 '23 17:04 Slowly-Grokking

killall nano

JamestheDon avatar Apr 14 '23 21:04 JamestheDon

Yeah, there are a few different interactive shell commands that seem to break the loop. Hope there's a solution.

gondar7 avatar Apr 16 '23 15:04 gondar7

though it would be nice if it could remote control an editor so we can watch it write the code ;-)

swarm4it avatar Apr 18 '23 14:04 swarm4it

Yeah, I just ran into this too and went to see if it was fixed. I wonder if it is possible to give AutoGPT the ability to use nano.

KapDEK avatar Apr 18 '23 17:04 KapDEK

I think the new popen shell is for interactive commands now. going to try it out

gondar7 avatar Apr 18 '23 18:04 gondar7

It seems to work, as in it leaves the subprocess open, but it doesn't seem to interact with it after opening it. In the code, it looks like it doesn't accept output. Not sure this could work for interactive commands; it might need something more like pexpect?

gondar7 avatar Apr 18 '23 18:04 gondar7

Same issue here. So far I told it about the problem; it then installed Notepad++ (which requires a Y prompt that it did not put in the choco command, so I had to type it in for it), then opened Notepad++ and also could not interact with it. Finally it used an echo command to write to the file, which did work, and then resumed attempting to use nano, ignoring that it was failing to write to the file. Until this is fixed, I will test telling it either to use the write_to_file command exclusively or to use echo to create/append text.

TheNitzel avatar Apr 20 '23 23:04 TheNitzel

@TheNitzel: may I ask how you told Auto-GPT about the problem? Via shell? I tried adding "not to use nano or vi" to the initial goals, but that doesn't seem to work. I don't understand how Auto-GPT is not able to write to or create a file; it seems to me it's doing that all the time. I mean, it's successfully creating and updating auto-gpt.json, for instance.

bassie661 avatar Apr 25 '23 13:04 bassie661

I don't understand how Auto-GPT is not able to write to or create a file; it seems to me it's doing that all the time. I mean, it's successfully creating and updating auto-gpt.json, for instance.

There isn't a command that AUTOGPT AI is using to update the auto-gpt.json file, that's hardcoded in the script. write_to_file works as does append. Asking it to use interactive shells or GUIs is like asking a blind and deaf person to be a stenographer at this point in time.

Until OCR, mouse and keyboard emulation are supported, it can't interact with any GUI. Everything it can do needs to be done via command line.

how you told Auto-GPT about the problem?

You can try various prompting strategies, but if you see "Next Action: Execute Shell" and it's trying to use nano/vi or anything you don't want it to do, give it human feedback.

Slowly-Grokking avatar Apr 25 '23 15:04 Slowly-Grokking

OK thanks. I never asked it to use interactive shells or GUIs; it's going on a GUI tour all by itself. I will put some more time into reading the issues and documentation. I haven't yet got my mind around exactly how Auto-GPT works and probably have the wrong picture of it. It looks to me like it's starting up agents for all sorts of tasks, and in my case one of the agents is trying to create a Python script. That initially seems to go well, until it opens nano / vi to write to or append to a file and gets stuck.

Dunno why it doesn't just stick to the command line then. Hope I can somehow make it use only the command line for files. It would probably have saved me 10 restarts :)

bassie661 avatar Apr 25 '23 17:04 bassie661

though it would be nice if it could remote control an editor so we can watch it write the code ;-)

some editors, like vim, definitely support a client/server mode - it would probably be a dedicated plugin to make that work, and it would have to work via popen probably, and it would benefit from the concept of "channels" to talk to other "server-like" processes.

In the meantime, one could probably use some strace-like script to look for problematic API calls and make such processes return control to Auto-GPT if they are exhibiting "blocking" behavior: To deal with binaries that may optionally use blocking calls on a Linux system where only non-blocking binaries are permitted, you can take several approaches. One option is to wrap the binary with a script that intercepts any blocking calls and terminates the process if it becomes blocked, using a tool like "strace" to trace system calls. Another option is to redirect blocking calls to non-blocking equivalents using the "fcntl" system call to set file descriptors to non-blocking mode.
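As a simpler stand-in for the strace and fcntl ideas above, here is a minimal sketch that swaps in a plain timeout: the child gets no stdin and is killed if it runs too long. The function name `run_non_interactive` and the default timeout are assumptions made for illustration; this is not how Auto-GPT's `execute_shell` is implemented.

```python
import subprocess
import sys


def run_non_interactive(argv, timeout=5.0):
    """Run argv with no stdin and a hard time limit.

    Returns (returncode, output). returncode is None if the process
    was killed because it exceeded the timeout.
    """
    try:
        result = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,   # a program waiting for keystrokes sees EOF instead
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,            # subprocess.run kills the child when this expires
            text=True,
        )
    except subprocess.TimeoutExpired as exc:
        output = exc.output or ""
        if isinstance(output, bytes):   # partial output may arrive undecoded
            output = output.decode(errors="replace")
        return None, output
    return result.returncode, result.stdout
```

A caller could treat a `None` return code as "this command went interactive or hung" and report that back to the agent. Programs that open the terminal device directly rather than reading stdin may still need the timeout to fire before control returns.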

Boostrix avatar Apr 30 '23 12:04 Boostrix

Why does it need to use an editor though? It can write everything to files on its own according to syntax. What would the use of an editor even be?

katmai avatar Apr 30 '23 12:04 katmai

Fun!

Why does it need to use an editor though? It can write everything to files on its own according to syntax. What would the use of an editor even be?

zudsniper avatar Apr 30 '23 12:04 zudsniper

makes sense :D

katmai avatar Apr 30 '23 12:04 katmai

I've seen it (successfully!) using sed to edit/patch existing files - then again, "an editor" (nano, vi etc) is probably just lingo for any interactive "app" (#346) here?
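For comparison, a non-interactive edit of the kind sed performs can be sketched in a few lines of Python. The helper name `substitute_in_file` is hypothetical, and Python's regex syntax differs from sed's, so this is only roughly equivalent to `sed -i 's/pattern/replacement/g'`.

```python
import re
from pathlib import Path


def substitute_in_file(path, pattern, replacement):
    """Apply a regex substitution to a file in place.

    Returns the number of substitutions made; the file is only
    rewritten when at least one match was replaced.
    """
    file = Path(path)
    text = file.read_text(encoding="utf-8")
    new_text, count = re.subn(pattern, replacement, text)
    if count:
        file.write_text(new_text, encoding="utf-8")
    return count
```

Nothing here waits for user input, which is the property that matters for the agent loop.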

Getting stuck inside nano/vim seems to happen for some folks, so heuristics to detect such binaries/situations would potentially still be worthwhile?

Also, if the sub-agent approach is pursued, a parent-agent would be in a position to monitor what's going on and it could observe/terminate a sub-agent as needed, including if it's obviously got stuck invoking a system call that is blocking.

Boostrix avatar Apr 30 '23 12:04 Boostrix

Hi there, I don't like the timeout solution, I am using a special character to detect the input prompt.

You can have a look https://stackoverflow.com/q/76097868/10294022

tomtom94 avatar May 01 '23 05:05 tomtom94

For future reference, here's what GPT-4 came up with to detect an idle child process that seems to be blocking because it's waiting for I/O without actually changing its RAM/CPU utilization and without sending data to the parent process:

import os
import subprocess
import tempfile
import time

import psutil

IDLE_SAMPLES = 5   # consecutive idle samples before the process is suspected of blocking
TIMEOUT = 10       # grace period in seconds before an idle process is killed

# Note: rebinding libc.read through ctypes in this process would not intercept
# read() calls made by the child, so the child is observed from outside instead.


def sample(ps, out):
    """Return (CPU percent over 1 s, RSS in MB, bytes of output written so far)."""
    cpu_percent = ps.cpu_percent(interval=1)
    memory_mb = ps.memory_info().rss / 1024 / 1024
    return cpu_percent, memory_mb, os.fstat(out.fileno()).st_size


def is_idle(history):
    """True if the last IDLE_SAMPLES samples show <1% CPU and unchanged RSS and output."""
    recent = history[-IDLE_SAMPLES:]
    return (len(recent) == IDLE_SAMPLES
            and all(cpu < 1 for cpu, _, _ in recent)
            and all(s[1:] == recent[-1][1:] for s in recent))


# Launch the target process with subprocess.Popen (os.popen offers neither poll() nor a pid).
# Output goes to a temporary file so that checking it never blocks the monitor.
with tempfile.TemporaryFile() as out:
    process = subprocess.Popen(['ls'], stdout=out, stderr=subprocess.STDOUT)
    ps = psutil.Process(process.pid)
    history = []
    idle_since = None

    # Monitor the process's CPU and RAM utilization and output
    while process.poll() is None:
        try:
            history.append(sample(ps, out))
        except psutil.Error:
            break  # the process went away between poll() and the sample
        cpu_percent, memory_mb, size = history[-1]
        print(f"Process with PID {process.pid} is using {cpu_percent:.2f}% CPU and {memory_mb:.2f} MB of RAM, {size} bytes of output so far")

        # Check if the process seems blocked
        if not is_idle(history):
            idle_since = None
        elif idle_since is None:
            print("Process seems to be blocked")
            idle_since = time.time()
        elif time.time() - idle_since >= TIMEOUT:
            print(f"Process with PID {process.pid} is being killed due to timeout")
            process.kill()
            break

    # Wait for the process to finish and print its output
    process.wait()
    out.seek(0)
    print(out.read().decode(errors='replace'))

Boostrix avatar May 01 '23 06:05 Boostrix

Saw it once again today: despite previously having used sed successfully, it wanted to start nano and vim. I suppose the blacklist option mentioned before would be a simple workaround, so that the .env file can be used to explicitly disable certain shell commands like these.
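A minimal sketch of such a blacklist check, assuming a comma-separated environment variable. The variable name `BLOCKED_SHELL_COMMANDS` and the default list are made up for illustration and are not actual Auto-GPT settings.

```python
import os
import shlex

# Hypothetical variable name, for illustration only; not an actual Auto-GPT setting.
ENV_VAR = "BLOCKED_SHELL_COMMANDS"
DEFAULT_BLOCKED = "nano,vi,vim,gedit"


def blocked_commands(environ=os.environ):
    """Read the comma-separated blacklist from the environment."""
    raw = environ.get(ENV_VAR, DEFAULT_BLOCKED)
    return {name.strip() for name in raw.split(",") if name.strip()}


def is_blocked(command_line, blocked):
    """True if the first word of command_line names a blocked program."""
    try:
        words = shlex.split(command_line)
    except ValueError:
        return True  # unbalanced quotes: refuse rather than guess
    if not words:
        return False
    return os.path.basename(words[0]) in blocked
```

This only inspects the first word, so something like `sudo nano` or `cd docs && vim notes.md` would slip through; a real check would have to look deeper than the program name.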

Also, another user is currently working on a new "update_file" command, which should hopefully help. Alternatively, we could introduce a "CLI_EDIT" command that is explicitly constrained by its description.

Boostrix avatar May 02 '23 08:05 Boostrix

I believe ai tries to open editors because it can't find a useful command in its own code. Since I am using my update_file, I have not been getting attempts to open editors unless I specifically ask for it. My PR #3643

bfalans avatar May 04 '23 10:05 bfalans

We probably need a bunch of aliases to cover all cases (edit, update, rewrite, change, modify etc) and redirect the llm to use a BIF
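Roughly, such an alias table could look like the sketch below. The target `write_to_file` is the command name seen in the log at the top of this issue; the alias handling and the `resolve_command` helper are assumptions for illustration, not existing Auto-GPT code.

```python
# Command the aliases resolve to (name taken from the log in this issue).
CANONICAL = "write_to_file"

# Verbs the LLM might invent when it wants to change a file.
ALIASES = ("edit", "update", "rewrite", "change", "modify")


def resolve_command(name):
    """Map an alias such as 'edit_file' or 'Modify' to the canonical command name.

    Names that are not aliases are returned unchanged.
    """
    verb = name.lower()
    if verb.endswith("_file"):
        verb = verb[: -len("_file")]
    return CANONICAL if verb in ALIASES else name
```

With a lookup like this in front of command dispatch, a request for `edit_file` would land on the built-in instead of falling through to a shell editor.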

Boostrix avatar May 04 '23 10:05 Boostrix

We probably need a bunch of aliases to cover all cases (edit, update, rewrite, change, modify etc) and redirect the llm to use a BIF

I am ready for @Boostrix's PR. You're all up in these issues, and from what I've seen you have the right idea 83% of the time or more. If you want any assistance with creating what you want, I'm always too busy, but I will make time to help get all your theories and strategies into code. Please reach out to me if you are interested in this! All my socials are in my gh profile README, but Discord will be the fastest. Cheers

zudsniper avatar May 04 '23 10:05 zudsniper

It's now possible to configure which shell commands are permitted or prohibited, which can be useful in preventing the use of interactive commands.

lc0rp avatar Jun 13 '23 07:06 lc0rp

Keep in mind that interactive programs can go non-interactive and vice versa...

Boostrix avatar Jun 13 '23 10:06 Boostrix