hathitrustPDF
Regex to get id_book is not correct in some cases.
When I use the script to fetch a document with id = 'uva.x001941913', the regex as written yields id_book = 'uva.', because it stops at 'x', which is not a digit.
As a result, the script fails to get the PDF from Hathi and instead produces a 6.0 KB file for each page, each containing Perl error messages.
Either the regex needs to be tweaked, or the script could simply ask the end user to copy the id from the URL they pass in.
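For what it's worth, here is a rough sketch of one way the id could be pulled out without assuming it is all digits after the prefix. This is not the script's current code: `extract_book_id` is a hypothetical helper name, and it assumes the URL carries the id in an `id=` parameter that ends at `&`, `;`, `#`, or the end of the string.

```python
import re
from urllib.parse import unquote

# Assumption: the id sits in an "id=" parameter and runs until the next
# "&", ";", "#", or the end of the URL. Letters such as "x" are allowed.
_ID_PATTERN = re.compile(r"[?&;]id=([^&;#]+)")


def extract_book_id(url):
    """Return the value of the id parameter from a URL (hypothetical helper)."""
    match = _ID_PATTERN.search(url)
    if match is None:
        raise ValueError(f"no id parameter found in {url!r}")
    # Decode percent-escapes in case the id was URL-encoded.
    return unquote(match.group(1))


if __name__ == "__main__":
    print(extract_book_id(
        "https://babel.hathitrust.org/cgi/pt?id=uva.x001941913;view=1up;seq=7"
    ))
```

Matching on "everything up to the next separator" rather than on digits means the id is taken as-is, which is effectively the same as asking the user to copy it from the URL.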
Thanks mate, this was a great hint!
No worries (:*