springer_free_books

HTTP Error 404 Not Found but I was able to manually download spreadsheet

Open dld2517 opened this issue 4 years ago • 4 comments

~/repos/springer_free_books$ python3 main.py
Traceback (most recent call last):
  File "main.py", line 37, in <module>
    books = pd.read_excel(table_url)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/util/_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/util/_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/io/excel.py", line 350, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/io/excel.py", line 653, in __init__
    self._reader = self._engines[engine](self._io)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/io/excel.py", line 402, in __init__
    filepath_or_buffer = _urlopen(filepath_or_buffer)
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

dld2517 • May 07 '20 04:05

Requirement already satisfied: pandas in /home/ddarden/.local/lib/python3.8/site-packages (0.24.2)
Requirement already satisfied: python-dateutil>=2.5.0 in /home/ddarden/.local/lib/python3.8/site-packages (from pandas) (2.8.1)
Requirement already satisfied: numpy>=1.12.0 in /home/ddarden/.local/lib/python3.8/site-packages (from pandas) (1.16.6)
Requirement already satisfied: pytz>=2011k in /home/ddarden/.local/lib/python3.8/site-packages (from pandas) (2019.3)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.5.0->pandas) (1.14.0)

dld2517 • May 07 '20 04:05

Springer re-uploaded the Excel file under a different name, but the Python script was still trying to download it from the old link, which is why you got the 404 error. Alex has fixed the link (see #85). Try downloading or cloning the repo again.
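For anyone hitting the same traceback, here is a minimal sketch of how a script could turn the raw 404 into a clearer message. It is not the code in main.py; the function name `load_book_table` and the `reader` parameter are hypothetical, and the only assumption about the real script is that it calls `pd.read_excel` on a URL, as the traceback shows.

```python
from urllib.error import HTTPError


def load_book_table(table_url, reader=None):
    """Load the spreadsheet, turning an HTTP 404 into an actionable message.

    `reader` is a hypothetical hook for swapping in another loader;
    by default pandas.read_excel is used.
    """
    if reader is None:
        # Imported lazily so the helper can be exercised without pandas installed.
        import pandas as pd
        reader = pd.read_excel
    try:
        return reader(table_url)
    except HTTPError as err:
        if err.code == 404:
            raise SystemExit(
                f"Spreadsheet not found at {table_url} (HTTP 404). "
                "The file may have been renamed upstream; update your copy of "
                "the script, or pass the path of a manually downloaded file."
            ) from err
        # Any other HTTP error is left untouched.
        raise
```

Since `pd.read_excel` also accepts a local path, a manually downloaded copy of the spreadsheet can be passed in place of the URL.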

chaosAD • May 07 '20 06:05

Still didn't work. I used git fetch to re-download it and got the same error. I think I'm done with the Python mess. I just used the spreadsheet, built the URLs with a concat function, and downloaded them with wget -O.
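The same spreadsheet workaround could be scripted roughly like this. This is only a sketch: the URL template and the `book_id` values are hypothetical placeholders, not Springer's actual link format, so substitute whatever pattern your concat formula produced.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Hypothetical template; replace with the pattern used in your spreadsheet.
URL_TEMPLATE = "https://example.com/books/{book_id}.pdf"


def build_download_urls(book_ids, template=URL_TEMPLATE):
    """Concatenate each identifier into a full URL (the 'concat' step)."""
    return [template.format(book_id=str(book_id).strip()) for book_id in book_ids]


def download_all(urls, out_dir="downloads"):
    """Save each URL into out_dir, similar to one `wget -O <file> <url>` per row."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for url in urls:
        # Name the local file after the last path segment of the URL.
        target = out / url.rsplit("/", 1)[-1]
        urlretrieve(url, target)
```

Calling `download_all(build_download_urls(ids))` with the identifiers copied out of the spreadsheet would then fetch the files one by one.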

dld2517 • May 07 '20 16:05

After git fetch, did you run git merge? If not, the fix never reached your working directory, so you were still running the old script. I suggest using git pull rather than git fetch. Be aware, though, that git pull only goes smoothly if you haven't modified the code in your working directory. In my opinion, the best way is to start from a clean slate by running git clone or downloading the zip from GitHub. That would have saved you all the trouble.

In fact, I did suggest downloading or cloning the repo again in my previous post.

chaosAD • May 08 '20 06:05