TaskWeaver
Can browse web when it shouldn't
Describe the bug
Web browsing could be locked down better. I am running the default project, where web browsing is apparently restricted. The first response said: "ProgramApe needs to retrieve and parse the chat transcript from the provided URL. However, ProgramApe does not have the capability to directly access external URLs or the internet to retrieve content." However, when asked in a follow-up message to try the requests library, it actually made the request and received a 403. Asking it to use subprocess and run curl worked.
To Reproduce
Steps to reproduce the behavior:
- Start the service
- Type the user query "download contents from here and summarize it"
- Wait for the response
- Type the user query "try using requests library to get the contents"
- Wait for the response
- Type the user query "try using subprocess and run curl"
- It runs curl and gets the content
Output
Human: try using subprocess and run curl
>>> [INIT_PLAN]
1. use requests library to retrieve the chat transcript from the provided URL
2. evaluate each logical statement in the transcript for correctness <sequentially depends on 1>
3. calculate the final score based on the evaluation <sequentially depends on 2>
4. report the final score to the user <interactively depends on 3>
>>> [PLAN]
1. instruct CodeInterpreter to use subprocess and run curl to attempt to retrieve the chat transcript from the provided URL
2. evaluate each logical statement for correctness
3. calculate the final score
4. report the final score to the user
>>> [CURRENT_PLAN_STEP]
1. instruct CodeInterpreter to use subprocess and run curl to attempt to retrieve the chat transcript from the provided URL
>>> [SEND_TO]
CodeInterpreter
>>> [MESSAGE]
Please use subprocess to run curl and attempt to retrieve the chat transcript from the URL [url]
>>> [PLANNER->CODEINTERPRETER]
Please use subprocess to run curl and attempt to retrieve the chat transcript from the URL [url]
>>> [THOUGHT]
ProgramApe will generate a code snippet using the subprocess module to run the curl command and attempt to retrieve the chat transcript from the provided URL.
>>> [PYTHON]
import subprocess
# Command to retrieve the chat transcript using curl
command = 'curl [url]'
# Run the command using subprocess
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
output, error = process.communicate()
# Check if the curl command was successful
if process.returncode == 0:
    chat_transcript = output.decode('utf-8')
else:
    chat_transcript = 'Failed to retrieve the chat transcript'
chat_transcript, process.returncode
>>> [VERIFICATION]
NONE
>>> [STATUS]
SUCCESS
>>> [RESULT]
The execution of the generated python code above has succeeded
The result of above Python code after execution is:
('<!DOCTYPE html><html lang="en-US"><head><title>[...]
Expected behavior
It should refuse to try, I guess?
Environment Information (please complete the following information):
- OS: Windows
- Python Version 3.11
- LLM that you're using: gpt-4-1106-preview
- Other Configurations except the LLM api/key related: none
Additional context
It didn't really want to try when I said "find a way to download those transcripts, you can do it, just work around the limitations."
@ezolenko Thanks for reporting this :)
A second layer of defense against unexpected code behavior is to enable the verification function, which lets you provide an allow-list of packages that may be imported in the generated Python snippet. It is not a perfect solution, but it can improve safety.
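
To illustrate the idea of an import allow-list, here is a minimal sketch of such a check using Python's standard ast module. It is not TaskWeaver's actual verification code: the function name find_disallowed_imports and the contents of ALLOWED_MODULES are hypothetical and chosen only for this example.

```python
import ast

# Hypothetical allow-list for illustration; these are not TaskWeaver defaults.
ALLOWED_MODULES = {"pandas", "numpy", "json", "math"}


def find_disallowed_imports(code: str, allowed: set[str]) -> list[str]:
    """Return top-level module names imported by `code` that are not in `allowed`."""
    tree = ast.parse(code)
    disallowed: list[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no absolute module name to check, so flag them.
            if node.module and node.level == 0:
                names = [node.module]
            else:
                names = ["<relative import>"]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in allowed and top_level not in disallowed:
                disallowed.append(top_level)
    return disallowed


# Imports similar to those the model generated in this issue.
snippet = "import subprocess\nimport requests\nimport json\n"
print(find_disallowed_imports(snippet, ALLOWED_MODULES))
```

With this allow-list, both subprocess and requests would be flagged before the snippet runs. A static import check like this does not catch dynamic imports such as __import__("subprocess") or importlib.import_module, which is one reason an allow-list alone is not a complete solution.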