PatentChem

urllib.error.HTTPError: HTTP Error 404: Not Found

Open AlanTanKX opened this issue 1 year ago • 1 comments

I have downloaded the files and created a new environment using the provided "environment.yml" file successfully.

However, I got an error message when running the following command in the terminal:

    python download.py --years 2023 --data_dir .

The error message is below:

(patents) C:\Users\alant>python download.py --years 2023 --data_dir .
Preparing to download all USPTO patents from 2023 ...
Found 18 releases from 2023
Directory for 2023 already exists.
Directory for 2023\I20230103 already exists.
2023\I20230103.tar: 0.00B [00:07, ?B/s]
Traceback (most recent call last):
  File "C:\Users\alant\download.py", line 160, in <module>
    main(args)
  File "C:\Users\alant\download.py", line 134, in main
    download_url(
  File "C:\Users\alant\download.py", line 73, in download_url
    urllib.request.urlretrieve(url, filename=output_path, reporthook=t.update_to)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 239, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 523, in open
    response = meth(req, response)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 632, in http_response
    response = self.parent.error(
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 561, in error
    return self._call_chain(*args)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 494, in _call_chain
    result = func(*args)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

The URL in download.py is correct and opens in the browser, so I am confused about why this error was raised. This is the URL: https://bulkdata.uspto.gov/data/patent/grant/redbook/2023/

I got the same error message when running the code for other years.
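The traceback shows the 404 being raised inside urllib.request.urlretrieve while fetching a single file, which may be a different URL from the directory listing that opens in the browser. As a minimal sketch for narrowing this down (not part of download.py), a small helper could report the HTTP status of any URL without raising. The name probe_url and its opener parameter are hypothetical, and it assumes the server answers HEAD requests:

```python
import urllib.error
import urllib.request


def probe_url(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url instead of raising on 4xx/5xx.

    probe_url and the opener parameter are illustrative names, not part of
    download.py. opener defaults to urllib.request.urlopen and can be
    swapped for a stub when testing offline.
    """
    # HEAD avoids downloading the body; this assumes the server supports it.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the status code
        # is available on the exception object.
        return err.code
```

Calling probe_url on the directory URL above and on the exact URL that download.py passes to urlretrieve (for example, printed just before line 73) would show whether only the file URL returns 404.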

Thanks very much for the help in troubleshooting this issue!

AlanTanKX, Apr 24 '23 07:04