AutoGPT
en_core_web_sm installs every time from run.bat
I think this is due to how the check-requirements Python file gathers and compares installed packages against required packages: "en-core-web-sm"
shows up in installed packages, while "en_core_web_sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.0/en_core_web_sm-3.4.0-py3-none-any.whl"
shows up in required packages. I am not sure how to fix this elegantly, as Python package names can get messy at times.
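One way to tame the naming mismatch, offered as a minimal sketch rather than the project's actual fix, is to compare names only after normalizing them in the style PEP 503 describes: lowercase, with runs of `-`, `_`, and `.` collapsed to a single `-`. The helper name `normalize_name` is hypothetical.

```python
import re


def normalize_name(name: str) -> str:
    """Normalize a distribution name in the PEP 503 style (hypothetical helper)."""
    return re.sub(r"[-_.]+", "-", name).lower()


required = "en_core_web_sm"   # spelling used in requirements.txt
installed = "en-core-web-sm"  # spelling reported for the installed package

# The raw comparison fails; the normalized comparison matches.
print(required == installed)
print(normalize_name(required) == normalize_name(installed))
```

With both sides normalized, the underscore and dash spellings are treated as the same package.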
Additional information:
Environment:
Git commit hash : 4eaec804386b84a9aba21791ef0fb7b53d8bdd28 on master
MacOS : Darwin MacBook-Pro.local 22.4.0 Darwin Kernel Version 22.4.0: Mon Mar 6 21:01:02 PST 2023; root:xnu-8796.101.5~3/RELEASE_ARM64_T8112 arm64
Python 3.11.3
using pip install -r requirements.txt
results in re-download (run.sh also results in the same)
Collecting en-core-web-sm@ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.5.0/en_core_web_sm-3.5.0-py3-none-any.whl (from -r requirements.txt (line 24))
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.5.0/en_core_web_sm-3.5.0-py3-none-any.whl (12.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.8/12.8 MB 1.1 MB/s eta 0:00:00
Requirement already satisfied: beautifulsoup4>=4.12.2 in /opt/homebrew/lib/python3.11/site-packages (from -r requirements.txt (line 1)) (4.12.2)
Additional information:
Interestingly, under devcontainers the behavior is different:
on subsequent starts the file is not downloaded again.
I'm having the same issue. On top of that, when I try to run it after all the installs, I get a series of errors similar to: " raise ConnectionError(err, request=request) requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) "
same issue.
Here I'm going to include the logs from the installation error and the API error
`Missing packages: spacy>=3.0.0,<4.0.0, en_core_web_sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.0/en_core_web_sm-3.4.0-py3-none-any.whl
Installing missing packages...
Defaulting to user installation because normal site-packages is not writeable
Collecting en_core_web_sm@ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.0/en_core_web_sm-3.4.0-py3-none-any.whl (from -r requirements.txt (line 24))
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.0/en_core_web_sm-3.4.0-py3-none-any.whl (12.8 MB)
---------------------------------------- 12.8/12.8 MB 10.9 MB/s eta 0:00:00
...
Long list of requirements already satisfied`
And now here is the other error:
`Traceback (most recent call last):
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\urllib3\connectionpool.py", line 449, in _make_request
six.raise_from(e, None)
File "
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\requests\adapters.py", line 489, in send
resp = conn.urlopen(
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\urllib3\connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\urllib3\util\retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\urllib3\packages\six.py", line 769, in reraise
raise value.with_traceback(tb)
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\urllib3\connectionpool.py", line 449, in _make_request
six.raise_from(e, None)
File "
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\openai\api_requestor.py", line 516, in request_raw
result = _thread_context.session.request(
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "C:\Users\Nessi\AppData\Roaming\Python\Python310\site-packages\requests\adapters.py", line 547, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\Nessi\Documents\Auto-GPT-0.2.2\autogpt_main.py", line 5, in
It MAY have had something to do with uninstalling and reinstalling between en_core_web_sm-3.4.0 and en_core_web_sm-3.4.1, launching by running the .bat file, launching by running .\run.bat in the terminal, and reinstalling the requirements. I may have done it in some magic specific order.
Note that it does NOT encounter this issue at all when running from a container. However, there it gets stuck on "thinking". Only once has it gotten past that: it gave thoughts, reasoning and criticism, but then just stopped. Pressing y and hitting enter did nothing.
This is a setup issue. Try and work with the team in #tech-support on the discord to get a fix
Cause
@simin75simin was on the right track about versioning. I strongly believe that one of the two root causes is the assumption in check_requirements.py that every package name and version pair is separated by ==, as in pinecone-client==2.2.1.
That is not the case for either spacy or en_core_web_sm, so I don't see how anyone could run check_requirements.py, or another script (run.bat) that calls it, without it trying to install those two packages every time.
Relevant requirements.txt Lines
As you can see, neither of these packages has a double equals sign after the package name. Since the lines are not parsed properly, neither package is detected as installed.
spacy>=3.0.0,<4.0.0
en_core_web_sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.0/en_core_web_sm-3.4.0-py3-none-any.whl
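To make the failure concrete, here is a small sketch that applies the same `split("==")[0]` step to three lines from the requirements file. Only the pinned line yields a bare package name. The other two come back unchanged, so they can never match an installed package key.

```python
lines = [
    "pinecone-client==2.2.1",
    "spacy>=3.0.0,<4.0.0",
    "en_core_web_sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.0/en_core_web_sm-3.4.0-py3-none-any.whl",
]

# Same extraction the original script performs on each requirement line.
names = [line.strip().split("==")[0] for line in lines]

for name in names:
    print(repr(name))
```

The second and third results match the strings that appear after "Missing packages:" in the installation log earlier in this thread.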
The Original check_requirements.py
Take a look at how required_packages is initialized with lines that are partially cleaned up by:
line.strip().split("#")[0].strip()
Then, down in the for loop, each value is manipulated a second time with:
package.strip().split("==")[0]
This is effectively the same as the following line. After all those calls the list still contains empty values for blank and comment-only lines.
line.strip().split("#")[0].strip().strip().split("==")[0]
The main problem is simply the reliance on ==. Cleaning this up is easy, but predicting what pip puts in requirements.txt seems to be more of a challenge.
import sys

import pkg_resources


def main():
    requirements_file = sys.argv[1]
    with open(requirements_file, "r") as f:
        required_packages = [
            line.strip().split("#")[0].strip() for line in f.readlines()
        ]

    installed_packages = [package.key for package in pkg_resources.working_set]

    missing_packages = []
    for package in required_packages:
        if not package:  # Skip empty lines
            continue
        package_name = package.strip().split("==")[0]
        if package_name.lower() not in installed_packages:
            missing_packages.append(package_name)

    if missing_packages:
        print("Missing packages:")
        print(", ".join(missing_packages))
        sys.exit(1)
    else:
        print("All packages are installed.")


if __name__ == "__main__":
    main()
Want a good laugh?
Though I debugged the issue myself, I decided to use GPT-4 to rewrite the script. I asked it to account for pip version operators, to remove all empty lines before populating the list, and to combine the data manipulation into a single place.
ChatGPT-4's Updated Script
import sys
import re

import pkg_resources


def main():
    requirements_file = sys.argv[1]
    with open(requirements_file, "r") as f:
        required_packages = [
            re.split('==|>=|<=|>|<| @ ', line.split("#")[0].strip())[0]
            for line in f.readlines()
            if line.strip() and not line.startswith("#")
        ]

    installed_packages = [package.key for package in pkg_resources.working_set]

    missing_packages = []
    for package in required_packages:
        if package.lower() not in installed_packages:
            missing_packages.append(package)

    if missing_packages:
        print("Missing packages:")
        print(", ".join(missing_packages))
        sys.exit(1)
    else:
        print("All packages are installed.")


if __name__ == "__main__":
    main()
There Is Actually One Last Issue
I thought the AI had made an error in its updated script, but on closer inspection I realized that although it resolved the problem with spacy, the updated script still wanted to install en_core_web_sm. I wonder whether the line for en_core_web_sm was edited by hand. I removed that package from my system, let the same line in requirements.txt force a reinstall, and then created a new requirements.txt using pip.
That second issue is caused by the difference between underscores (_) and dashes (-). The requirements.txt from this repo's stable branch uses underscores (en_core_web_sm), while the comparison on my system, and the requirements.txt I generated with pip freeze, use en-core-web-sm.
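Putting both observations together, a checker could strip everything after the package name and normalize both sides before comparing. The following is a sketch under stated assumptions, not the project's script: all function names are hypothetical, it uses `importlib.metadata` from the standard library (Python 3.8+) instead of `pkg_resources`, and it ignores version constraints entirely, checking only that a package is present.

```python
import re
from importlib import metadata


def normalize_name(name):
    """Lowercase and collapse runs of '-', '_', '.' into '-' (PEP 503 style)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(line):
    """Return the bare package name from one requirements.txt line, or None."""
    line = line.split("#")[0].strip()
    if not line or line.startswith("-"):  # blank, comment-only, or pip option
        return None
    # The name is the leading run of letters, digits, '.', '_' and '-';
    # operators, extras, and ' @ URL' suffixes are left behind.
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return match.group(0) if match else None


def find_missing(requirement_lines, installed_names):
    """List required names whose normalized form is not in installed_names."""
    installed = {normalize_name(n) for n in installed_names}
    missing = []
    for line in requirement_lines:
        name = requirement_name(line)
        if name and normalize_name(name) not in installed:
            missing.append(name)
    return missing


def installed_distribution_names():
    """Names of installed distributions, as reported by importlib.metadata."""
    names = (dist.metadata["Name"] for dist in metadata.distributions())
    return [name for name in names if name]


requirements = [
    "spacy>=3.0.0,<4.0.0",
    "en_core_web_sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.0/en_core_web_sm-3.4.0-py3-none-any.whl",
]
# Simulated installed set, using the dashed spelling seen on my system.
print(find_missing(requirements, ["spacy", "en-core-web-sm"]))
```

In a real run, `installed_distribution_names()` would supply the second argument. Passing the installed names in as a parameter keeps the comparison logic easy to test without touching the environment.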
TLDR
- Splitting on == won't guarantee clean parsing of package names: extra text such as the > or ' @ URL' in two of the packages in the given requirements.txt stays attached to the name.
- ChatGPT-4 wrote a new version of the check_requirements.py script for us; it's included above.
- Using a corrected script will still lead to a reattempt to install en_core_web_sm because pkg_resources.working_set and the pip freeze command I ran on my system both report that package as having dashes in between characters and not underscores.
Please note that the issue was open when I started responding, but it was closed while I was switching between other tasks and trying to nail down all of the details. I am not asking for a fix and don't mind whether you use the updated script, but I hoped that at least @simin75simin might benefit from this.