hathitrustPDF

the code is not working

Open: Git-Vasanth opened this issue 1 year ago • 9 comments

[Screenshot (5152)]

C:\Users\Vasanth\AppData\Local\Programs\Python\Python312\python.exe C:\Users\Vasanth\Hathi_downs\hathitrustPDF.py
C:\Users\Vasanth\Hathi_downs\hathitrustPDF.py:18: SyntaxWarning: invalid escape sequence '\w'
  id_book = re.findall('id=(\w*.\d*)|$', link)[0]
C:\Users\Vasanth\Hathi_downs\hathitrustPDF.py:62: SyntaxWarning: invalid escape sequence '\D'
  key=lambda x: (int(re.sub('\D', '', x)), x))
Traceback (most recent call last):
  File "C:\Users\Vasanth\Hathi_downs\hathitrustPDF.py", line 23, in <module>
    pages_book = int(soup.find("section", {'class': 'd--reader--viewer'})['data-total-seq'])
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

Process finished with exit code 1
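
A side note on the two SyntaxWarning lines: they are separate from the crash. Python 3.12 warns because `\w` and `\D` sit in ordinary string literals, and writing the patterns as raw strings should silence the warnings without changing what they match. A minimal sketch, assuming nothing else about the script: the function names, the example URL, and the file names are hypothetical, and only the two patterns are taken from the log above.

```python
import re


def extract_book_id(link):
    # Same pattern as line 18 of the log, written as a raw string.
    # The unescaped '.' is kept as-is so the behavior does not change.
    return re.findall(r'id=(\w*.\d*)|$', link)[0]


def sort_by_page_number(names):
    # Same sort key as line 62 of the log, written as a raw string:
    # strip every non-digit, then order by the remaining number.
    return sorted(names, key=lambda x: (int(re.sub(r'\D', '', x)), x))


# Hypothetical inputs, for illustration only.
print(extract_book_id("https://example.org/pt?id=abc.12345"))
print(sort_by_page_number(["page10.pdf", "page2.pdf", "page1.pdf"]))
```

This only removes the warnings; the TypeError at line 23 is a different problem.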

Git-Vasanth, Dec 17 '23 15:12

Same here. It looks like the structure of the HathiTrust rendering page has changed:

Traceback (most recent call last):
  File "hathitrustPDF.py", line 23, in <module>
    pages_book = int(soup.find("section", {'class': 'd--reader--viewer'})['data-total-seq'])
TypeError: 'NoneType' object is not subscriptable
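
For context, the TypeError means `soup.find(...)` returned None, i.e. no matching `section` element was found, and the script then tried to index into None. A guard like the hedged sketch below would not fix the scraper, but it would make the failure say what actually went wrong. The helper name `read_total_pages` and the error message are hypothetical; the only thing taken from the traceback is the `data-total-seq` attribute. The helper accepts whatever `soup.find` returned, and a plain dict stands in for the element here so the example runs without any third-party package.

```python
def read_total_pages(viewer):
    """Return the page count from the element found by soup.find(...).

    `viewer` is whatever soup.find returned: an element that supports
    viewer['data-total-seq'], or None when nothing matched.
    """
    if viewer is None:
        # Hypothetical message; the point is to fail with a readable reason.
        raise RuntimeError(
            "Viewer element not found; the page layout may have changed."
        )
    return int(viewer["data-total-seq"])


# A dict stands in for the parsed element in this illustration.
print(read_total_pages({"data-total-seq": "42"}))

try:
    read_total_pages(None)
except RuntimeError as err:
    print(err)
```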

smorello87, Jan 26 '24 12:01

It looks like @Midnight145 made a bunch of progress in updating the code, but HathiTrust has updated how they render the pages. See the README info: https://github.com/Midnight145/hathitrustPDF/tree/master

ryanbugden, Aug 13 '24 04:08

Yup. I'm not familiar enough with web development to continue it myself, but feel free to contribute if anyone has the know-how to fix it.

Midnight145, Aug 13 '24 19:08

Actually, looking into this, I might be able to work around it. I'll try and figure something out and update my repo if I can find anything.

Midnight145, Aug 13 '24 20:08

Just pushed!! Realized I was wayyy overcomplicating things and I was trying to make things way harder than they needed to be. Only thing that was actually different was where the page count was located on the site.

Midnight145, Aug 13 '24 20:08

@Midnight145 Woo, nice, I'll check it out, thanks for looking into this!

ryanbugden, Aug 13 '24 20:08

No problem! Please open an issue on my end if you run into any troubles!

Midnight145, Aug 13 '24 20:08

@Midnight145 Looks to be working well on my end! For the future, you may need to enable issues on your fork to keep this one clean though.

Just learned how:

  • Go to the Settings page of your fork.
  • Click the 'General' tab on the left.
  • Scroll down to the Features subsection.
  • Tick the Issues checkbox.

ryanbugden, Aug 13 '24 22:08

Whoops, didn't realize that wasn't enabled. Will fix that real quick!

Midnight145, Aug 13 '24 22:08