HummusJS

Unable to start parsing PDF file

Open SAanish opened this issue 7 years ago • 11 comments

var pdfReader = hummus.createReader(sourcePath); pageNumber=pdfReader.getPagesCount()

SAanish avatar Jul 26 '17 05:07 SAanish

Maybe the path is wrong? Maybe it's not a PDF? This is fairly basic stuff.
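Both questions can be checked before calling hummus.createReader(). The following is a minimal sketch, not part of HummusJS: `findPdfHeaderOffset` and its `windowSize` parameter are hypothetical names, and the 1024-byte window is an assumption about how far into the file it is worth looking.

```javascript
const fs = require('fs');

// Hypothetical helper (not a HummusJS API): reports where the "%PDF-"
// marker sits within the first bytes of a file.
//    0 -> the file starts with a PDF header
//   >0 -> the marker is preceded by that many extra bytes
//   -1 -> no marker found in the inspected window
function findPdfHeaderOffset(filePath, windowSize = 1024) {
  if (!fs.existsSync(filePath)) {
    throw new Error(`File not found: ${filePath}`);
  }
  const fd = fs.openSync(filePath, 'r');
  try {
    const buf = Buffer.alloc(windowSize);
    // Read at most windowSize bytes from the start of the file.
    const bytesRead = fs.readSync(fd, buf, 0, windowSize, 0);
    return buf.subarray(0, bytesRead).indexOf('%PDF-');
  } finally {
    fs.closeSync(fd);
  }
}
```

A result of 0 suggests the path and header are fine and the problem lies elsewhere; a positive value or -1 points at the file itself.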

galkahana avatar Jul 26 '17 13:07 galkahana

I ran into the same issue with this PDF file. Please help.

The path is correct. The only potential issue is the PDF itself: tempDoc.pdf

Looking forward to any advice.

Jackychans avatar Aug 21 '17 12:08 Jackychans

Hello, I ran into the same error 👍

In my case it was observed only with PDF version 1.3; however, as Jackychans shows, it also happens with 1.7.

Same case: the path and data are correct, and the error comes from hummus.createReader() on Node.js.

yogalink avatar Aug 22 '17 12:08 yogalink

You'll need to send the PDF if you want it debugged.

galkahana avatar Aug 22 '17 13:08 galkahana

@Jackychans tempDoc.pdf starts with a header that is not PDF. Remove everything before %PDF-1.7 (keeping the %PDF-1.7 marker itself) and the file should parse fine.
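The trimming described here can be sketched in a few lines of Node.js. This is only an illustration: `stripLeadingBytes` is a hypothetical helper, not a HummusJS function, and it assumes the extra bytes were prepended after the PDF was generated, so that the byte offsets stored inside the file are still valid once the prefix is gone.

```javascript
// Hypothetical helper (not a HummusJS API): returns the input starting at
// the first "%PDF-" marker, dropping whatever precedes it.
function stripLeadingBytes(pdfBytes) {
  const offset = pdfBytes.indexOf('%PDF-');
  if (offset === -1) {
    throw new Error('No %PDF- marker found; this does not look like a PDF');
  }
  // Nothing to trim when the marker is already at the start.
  return offset === 0 ? pdfBytes : pdfBytes.subarray(offset);
}

// Possible usage (cleanPath is a hypothetical output path):
//   const fs = require('fs');
//   const hummus = require('hummus');
//   fs.writeFileSync(cleanPath, stripLeadingBytes(fs.readFileSync(sourcePath)));
//   const pdfReader = hummus.createReader(cleanPath);
```

If the file was produced with the prefix already in place, trimming may shift the internal offsets and the file could still fail to parse.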

galkahana avatar Aug 25 '17 16:08 galkahana

Thanks @galkahana for the response, although it wasn't very fast, hehe.

I found the problem in the file's header just after posting the issue here.

Again, thanks.

Jackychans avatar Aug 25 '17 16:08 Jackychans

> @Jackychans tempDoc.pdf starts with a header that is not PDF. Remove everything before %PDF-1.7 (keeping the %PDF-1.7 marker itself) and the file should parse fine.

You say the header is not PDF; however, any PDF reader will open the file normally. So I would assume the lib should either ignore the things it doesn't "care" about or replace them, as it is nearly impossible to predict what will appear inside a file that hummus doesn't want, considering the file works everywhere else.

Let's say I go to Google Docs and generate a file, and it comes with something in its header. It will open anywhere except in my program, because hummus somehow does not support it.

zerobytes avatar Apr 12 '20 08:04 zerobytes

Same here, I got this error with 4 different PDFs...

SolidTears avatar Sep 08 '20 05:09 SolidTears

I have had this error with every PDF I've tested, and they all have properly formatted headers. I think something is wrong with the currently released version of hummus.

untrustedlifeswanleap avatar Sep 22 '20 16:09 untrustedlifeswanleap

Hummus rejects some PDFs because they do not conform to the PDF standard. Check your PDF here -> https://www.pdfen.com/pdf-a-validator

We might have to convert the PDF to a standard-conforming one in a catch block if we receive the same parsing error from Hummus.
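The catch-block idea could look roughly like the sketch below. It is not taken from the linked solution: `parseWithRepair`, `parse`, and `repair` are hypothetical names, and the conversion step is left to the caller because the thread does not say which tool to use.

```javascript
// Hypothetical wrapper (not a HummusJS API).
//   parse(path)  -> would typically call hummus.createReader(path)
//   repair(path) -> must return the path of a rewritten, conforming copy
function parseWithRepair(sourcePath, parse, repair) {
  try {
    return parse(sourcePath);
  } catch (err) {
    // Retry only for the parsing error discussed in this thread;
    // anything else is rethrown untouched.
    if (!/Unable to start parsing PDF file/.test(String(err && err.message))) {
      throw err;
    }
    return parse(repair(sourcePath));
  }
}
```

Passing the two steps in as functions keeps the retry logic independent of whichever converter ends up doing the repair.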

FranklinThaker avatar Aug 26 '21 11:08 FranklinThaker

Finally, I've posted a solution here: https://stackoverflow.com/questions/69039978/hummus-recipe-npm-typeerror-unable-to-start-parsing-pdf-file/69040034#69040034

FranklinThaker avatar Sep 03 '21 05:09 FranklinThaker