hathitrustPDF Regex to get id_book is not correct in some cases.

Open ShenstonePorter opened this issue 1 year ago • 2 comments

When attempting to use the script to get a document with id = 'uva.x001941913', the regex as written results in id_book = 'uva.', because 'x' is not a digit.

As a result, the script fails to get the PDF from Hathi and instead produces a 6.0 kB file for each page, each of which contains Perl error messages.

Either the regex needs to be tweaked, or perhaps the script could just have the end user copy the id from the URL they put into the script.
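For illustration, here is a rough sketch of the second option: taking the id straight from the URL instead of matching it with a digits-only pattern. This is not the script's current code. The helper name `extract_book_id` is hypothetical, and the sketch assumes the URL is either a page-turner link with an `id=` query parameter or a `hdl.handle.net/2027/...` link.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_book_id(url: str) -> str:
    """Return the volume id from a HathiTrust URL (hypothetical helper)."""
    parsed = urlparse(url)

    # Page-turner style links carry the id in the query string,
    # e.g. ...?id=uva.x001941913&seq=7. parse_qs also decodes
    # percent-escapes, so ids containing ':' or '/' survive intact.
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return ids[0]

    # Handle style links (assumed form): https://hdl.handle.net/2027/<id>
    match = re.search(r"/2027/(.+)$", parsed.path)
    if match:
        return match.group(1)

    raise ValueError(f"could not find a volume id in {url!r}")


if __name__ == "__main__":
    print(extract_book_id("https://babel.hathitrust.org/cgi/pt?id=uva.x001941913&seq=7"))
```

Because nothing here assumes the part after the dot is numeric, an id such as 'uva.x001941913' comes through whole.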

Jul 16 '23 20:07 ShenstonePorter

Thanks mate, this was a great hint!

Sep 19 '23 18:09 SchmueI

No worries (:*

Sep 19 '23 19:09 ShenstonePorter
