Repeat open browser

Open yibie opened this issue 1 year ago • 3 comments

I installed SOC on my computer, a MacBook Pro with an M1 Pro ARM chip.

Today I tried SOC for the first time, but it repeatedly opens my browser and doesn't go any further.

I don't know why.

Here is the log; maybe it can help you find the problem.

[Self-Operating Computer] Hello, I can help you with anything. What would you like done?
[User] open brave, go to google drive, write a poem about spring.
[Self-Operating Computer] [Act] SEARCH Brave
[Self-Operating Computer] [Act] SEARCH COMPLETE Open program: Brave
[Self-Operating Computer] [Act] SEARCH Brave
[Self-Operating Computer] [Act] SEARCH COMPLETE Open program: Brave
[Self-Operating Computer] [Act] SEARCH Brave
[Self-Operating Computer] [Act] SEARCH COMPLETE Open program: Brave
[Self-Operating Computer] [Act] SEARCH Brave
[Self-Operating Computer] [Act] SEARCH COMPLETE Open program: Brave
[Self-Operating Computer] [Act] SEARCH Brave
[Self-Operating Computer] [Act] SEARCH COMPLETE Open program: Brave
^CTraceback (most recent call last):
  File "/Users/chenyibin/self-operating-computer/venv/bin/operate", line 33, in <module>
    sys.exit(load_entry_point('self-operating-computer==1.0.0', 'console_scripts', 'operate')())
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/operate/main.py", line 612, in main_entry
    main(args.model)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/operate/main.py", line 188, in main
    response = get_next_action(model, messages, objective)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/operate/main.py", line 276, in get_next_action
    content = get_next_action_from_openai(messages, objective)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/operate/main.py", line 340, in get_next_action_from_openai
    response = client.chat.completions.create(
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/openai/_utils/_utils.py", line 299, in wrapper
    return func(*args, **kwargs)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 594, in create
    return self._post(
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/openai/_base_client.py", line 1055, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/openai/_base_client.py", line 834, in request
    return self._request(
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/openai/_base_client.py", line 858, in _request
    response = self._client.send(request, auth=self.custom_auth, stream=stream)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpx/_client.py", line 901, in send
    response = self._send_handling_auth(
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpx/_client.py", line 929, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpx/_client.py", line 966, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpx/_client.py", line 1002, in _send_single_request
    response = transport.handle_request(request)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 228, in handle_request
    resp = self._pool.handle_request(req)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 268, in handle_request
    raise exc
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 251, in handle_request
    response = connection.handle_request(request)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpcore/_sync/http_proxy.py", line 344, in handle_request
    return self._connection.handle_request(request)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 133, in handle_request
    raise exc
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 111, in handle_request
    ) = self._receive_response_headers(**kwargs)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 176, in _receive_response_headers
    event = self._receive_event(timeout=timeout)
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 212, in _receive_event
    data = self._network_stream.read(
  File "/Users/chenyibin/self-operating-computer/venv/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 126, in read
    return self._sock.recv(max_bytes)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1259, in recv
    return self.read(buflen)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1132, in read
    return self._sslobj.read(len)
KeyboardInterrupt

Finally, thanks again for your great work.

yibie avatar Nov 30 '23 08:11 yibie

Hello @yibie. Can you confirm if you still have this issue on the most recent version of the repo?

michaelhhogue avatar Dec 02 '23 16:12 michaelhhogue

@michaelhhogue I have this issue occasionally on the main branch currently. It's not 100% by any means as it will often progress to other steps even though it never really succeeds at prior steps. Most of the time it will attempt 2 - 3 launches of the browser, and then move on to the next step. Sometimes though, it does just seem to keep repeating the search command but I usually cut it off after 7 - 8 attempts before the loop limit kicks in.

AzorianMatt avatar Dec 07 '23 20:12 AzorianMatt

Originally I tried a bunch of things to avoid repetition. gpt-4-vision-preview just doesn't seem as good at following instructions.

First I added language like this to the prompt:

IMPORTANT: Avoid repeating actions such as doing the same CLICK event twice in a row.

That didn't help much, so I played with presence_penalty and frequency_penalty, which maybe helped a little; it's hard to say for sure.

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=pseudo_messages,
    presence_penalty=1,
    frequency_penalty=1,
    temperature=0.7,
    max_tokens=300,
)

What made the largest impact was advice from @mshumer to add the actual previous_action to the prompt so that GPT sees it in a very obvious way. This improved things slightly, but as noted above the issue still occurs. Ultimately our agent-1 model will not have this problem, but to fix this with gpt-4-vision-preview I recommend playing around with the {previous_action} part of the prompting system:


VISION_PROMPT = """
...
{previous_action}

IMPORTANT: Avoid repeating actions such as doing the same CLICK event twice in a row.

Objective: {objective}
"""
...
def format_vision_prompt(objective, previous_action):
    """
    Format the vision prompt
    """
    if previous_action:
        previous_action = f"Here was the previous action you took: {previous_action}"
    else:
        previous_action = ""
    prompt = VISION_PROMPT.format(objective=objective, previous_action=previous_action)
    return prompt
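
One way to experiment with the {previous_action} part is to show the model several recent actions instead of one, and to add an explicit warning when the last few are identical. The following is only a minimal sketch under stated assumptions, not code from the repo: `HISTORY_PROMPT`, `is_repeating`, `format_prompt_with_history`, the history size of 5, and the repeat threshold of 2 are all hypothetical names and values chosen for illustration.

```python
from collections import deque

# Hypothetical, shortened template for illustration only;
# it is not the repo's actual VISION_PROMPT.
HISTORY_PROMPT = """
{previous_actions}

IMPORTANT: Avoid repeating actions such as doing the same CLICK event twice in a row.
{repeat_warning}
Objective: {objective}
"""


def is_repeating(history, threshold=2):
    """Return True if the last `threshold` actions are all identical."""
    recent = list(history)[-threshold:]
    return len(recent) == threshold and len(set(recent)) == 1


def format_prompt_with_history(objective, history):
    """Build a prompt that lists recent actions and flags repetition."""
    if history:
        numbered = "\n".join(f"{i}. {action}" for i, action in enumerate(history, 1))
        previous_actions = (
            "Here are the most recent actions you took, oldest first:\n" + numbered
        )
    else:
        previous_actions = ""

    repeat_warning = ""
    if is_repeating(history):
        last = list(history)[-1]
        repeat_warning = (
            f"You have already done '{last}' more than once in a row. "
            "Choose a different action.\n"
        )

    return HISTORY_PROMPT.format(
        objective=objective,
        previous_actions=previous_actions,
        repeat_warning=repeat_warning,
    )


# Usage: keep only the last few actions (5 is an arbitrary choice).
history = deque(maxlen=5)
history.append("SEARCH Brave")
history.append("SEARCH Brave")
print(format_prompt_with_history("open brave, go to google drive", history))
```

Whether this actually reduces the looping with gpt-4-vision-preview is untested here; the same `is_repeating` check could also be used in the main loop to stop early instead of waiting for the loop limit.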

joshbickett avatar Dec 08 '23 15:12 joshbickett