AutoGPT
Remove problematic check_duplicate_operation and calls
Background
Many issues (https://github.com/Significant-Gravitas/Auto-GPT/issues/1891#issuecomment-1519176961) have been spawned from the fact that check_duplicate_operation will prevent you from writing, appending, or deleting the same file more than once.
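For context, the removed check worked roughly like this (an approximate reconstruction, not verbatim code from `autogpt/commands/file_operations.py`):

```python
from pathlib import Path

LOG_FILE = Path("file_logger.txt")  # AutoGPT's file-operations log

def check_duplicate_operation(operation: str, filename: str) -> bool:
    """Return True if `operation` has already been logged for `filename`."""
    log = LOG_FILE.read_text(encoding="utf-8") if LOG_FILE.exists() else ""
    # Any exact repeat of the same operation on the same file is rejected,
    # even legitimate ones such as appending to the same file twice.
    return f"{operation}: {filename}\n" in log
```

Because the check matches on operation and filename alone, any repeat trips it, which is exactly what the linked issues run into.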
Changes
Removed the offending method and its calls. I attempted a solution where we hash the content and add it to each line of the file-operations log, but realized there were still cases where this would break: appending the same text to the same file twice is a perfectly valid and common operation. So I just removed the offending code.
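To illustrate why the hashing variant still breaks: two appends of identical text produce identical log entries, so the second, perfectly legitimate append would still be flagged. A minimal sketch (the entry format here is an assumption for illustration):

```python
import hashlib

def log_entry(operation: str, filename: str, content: str) -> str:
    # Hypothetical hashed entry format: "<op> <file>-<sha256 of content>"
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return f"{operation} {filename}-{digest}"

first = log_entry("append", "stocks.txt", "100\n")
second = log_entry("append", "stocks.txt", "100\n")
assert first == second  # identical entries, so the duplicate check still fires
```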
Documentation
Code removed
Test Plan
Ran the unit tests and used a prompt asking ChatGPT to exercise the specific scenario.
PR Quality Checklist
- [x] My pull request is atomic and focuses on a single change.
- [x] I have thoroughly tested my changes with multiple different prompts.
- [x] I have considered potential risks and mitigations for my changes.
- [x] I have documented my changes clearly and comprehensively.
- [x] I have not snuck in any "extra" small tweaks or changes.
This pull request has conflicts with the base branch; please resolve them so we can evaluate the pull request.
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
Codecov Report
Patch coverage has no change and project coverage change: -8.10% :warning:
Comparison is base (cade788) 49.63% compared to head (471eccb) 41.53%.
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           master    #3126      +/-   ##
==========================================
- Coverage   49.63%   41.53%    -8.10%
==========================================
  Files          64       64
  Lines        3022     3014        -8
  Branches      505      503        -2
==========================================
- Hits         1500     1252      -248
- Misses       1402     1699      +297
+ Partials      120       63       -57
```
Impacted Files | Coverage Δ |
---|---|
autogpt/commands/file_operations.py | 55.55% <ø> (-1.25%) :arrow_down: |
... and 14 files with indirect coverage changes
Codecov Report
Patch coverage: 94.11% and project coverage change: +0.10% :tada:
Comparison is base (aa3e37a) 56.88% compared to head (94b0bd4) 56.99%.
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           master    #3126      +/-   ##
==========================================
+ Coverage   56.88%   56.99%    +0.10%
==========================================
  Files          67       67
  Lines        3043     3053       +10
  Branches      509      513        +4
==========================================
+ Hits         1731     1740        +9
  Misses       1174     1174
- Partials      138      139        +1
```
Impacted Files | Coverage Δ |
---|---|
autogpt/commands/file_operations.py | 60.14% <94.11%> (+2.33%) :arrow_up: |
Please check out #1891. The solution we are looking for is not to remove the logger, but to fix it.
Please fix the logger instead of removing it
I can implement the suggestion you had in the other thread, where we hash the content and add it to the log line. However, for append_file there are many cases where we need to append the exact same content to the same file, which will break even with the new changes. I don't think appending the same text to a file that we have appended to in the past should be an error condition; what do you think?
> there are many cases where we need to add the same exact content to the same file
Do you have an example?
This feature was implemented because agents would regularly end up in infinite loops, performing the same file operation over and over. That function should be preserved.
I think a large fraction of the "duplicate operation" errors can be mitigated by reading/parsing the log in a way that makes sure only the last relevant operations are taken into account when determining whether an operation is a duplicate. Ideas (a rough sketch follows the list):
- Parsing logic that ignores operations which have already been overwritten (e.g. write -> delete or vice versa)
- Let the function provide clear feedback to the LLM, e.g.:
  - "This content has already been written to the file. Continue to the next step."
Sure, here's one example. I would like an agent to track the price of a stock every 5 minutes and record it to a file. The stock starts at 100, goes to 105, then drops back down to 100:
```
write stocks.txt-427BB37B82189ED44CE979C59AA1CF3CDA2BBA42300B00743B0DA5EB14A63164
write stocks.txt-0F15C9804218AAFA9CA7E824F513AAB92797F4F24BA73A7FC6379559904B1768
write stocks.txt-427BB37B82189ED44CE979C59AA1CF3CDA2BBA42300B00743B0DA5EB14A63164   -> throws
```
This is easily solved by adding a client-side timestamp to the appended content, but that may be a sub-par user experience on the way to finding a real solution.
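The client-side timestamp workaround would look something like this (purely illustrative; not code from this PR):

```python
from datetime import datetime, timezone

def timestamped_line(price: str) -> str:
    # A client-side timestamp makes every appended line unique, so a repeated
    # price such as "100" no longer produces an identical log entry.
    return f"{datetime.now(timezone.utc).isoformat()} {price}\n"
```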
If the infinite loops are as big of an issue as you say, I agree we should keep the check. I've updated the PR with the first steps (adding a hash to each log entry) and will see if we need to add a richer experience this week.
> I would like an agent to track the price of a stock every 5 minutes and record it to a file. The stock starts at 100, goes to 105, then drops back down to 100:
Something tells me there are less resource-intensive ways to do that ;)
But I get your point.
The process flow chart for the write_to_file function would be:
- Start
- Check if a file exists at the specified filepath
  - If the file exists, read its content
    - If the contents are the same as the new text, return an error message
    - If the contents are not the same, set the mode to "append" and open the file
  - If the file does not exist, set the mode to "write" and create a new file
- Write the new text to the file
- Log the operation
- Return a success message
- End
```python
import os  # needed for os.path.exists

# Note: path_in_workspace and log_operation are existing AutoGPT helpers
# (workspace path sandboxing and the file-operations log, respectively).
def write_to_file(filename: str, text: str) -> str:
    """Write text to a file

    Args:
        filename (str): The name of the file to write to
        text (str): The text to write to the file

    Returns:
        str: A message indicating success or failure
    """
    try:
        filepath = path_in_workspace(filename)
        if os.path.exists(filepath):
            with open(filepath, "r", encoding="utf-8") as f:
                if f.read() == text:
                    return "Error: File contents are the same as the new file trying to be written."
                else:
                    mode = "a"  # Append mode
        else:
            mode = "w"  # Write mode
        with open(filepath, mode, encoding="utf-8") as f:
            f.write(text)
        log_operation("write", filepath)
        return "File written to successfully."
    except Exception as e:
        return f"Error: {str(e)}"
```
> > I would like an agent to track the price of a stock every 5 minutes and record it to a file. The stock starts at 100, goes to 105, then drops back down to 100:
>
> Something tells me there are less resource-intensive ways to do that ;)
> But I get your point.
For sure, typing it out makes me realize how silly it sounds. Either way, I re-added it for now, take a look
@bobisme has made a pretty clean PR (#3489) fixing this with checksums as well. Can you please review that @rocks6? It looks to be more complete than this PR, but I would be happy to get your input on it.
Closing this PR as superseded by #3489
Sounds good, I'll take a look