
pdftk error: Unexpected Exception in open_reader()

Open shivams opened this issue 9 years ago • 3 comments

For some PDF files, pdftk throws this error:

Error: Unexpected Exception in open_reader()
Unhandled Java Exception:

This bug has been reported on pdftk launchpad: https://bugs.launchpad.net/ubuntu/+source/pdftk/+bug/774052

The bug does not appear to have been fixed, and because of it pdfocr.rb also fails on many files. I do have a temporary workaround, which goes like this:

Sometimes pdftk completely fails to read certain types of PDFs. However, if we read those PDFs with another tool and write them out again, pdftk reads the recreated file without problems. For example, Ghostscript can recreate the PDF like this:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=newfile.pdf myfile.pdf

Now pdftk will read the newly created PDF file just fine.
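
The workaround above could be wired into a Ruby script such as pdfocr.rb roughly as follows. This is only a minimal sketch, not the actual pdfocr.rb code: the helper names `gs_rewrite_command` and `readable_pdf`, the injectable `run` parameter, and the use of `pdftk <file> dump_data` as a cheap readability probe are all assumptions made for illustration.

```ruby
# Hypothetical helper: builds the Ghostscript command shown above
# as an argument list, so no shell quoting is involved.
def gs_rewrite_command(input, output)
  ["gs", "-dBATCH", "-dNOPAUSE", "-q", "-sDEVICE=pdfwrite",
   "-sOutputFile=#{output}", input]
end

# Hypothetical helper: returns a path that pdftk can read.
# It first probes the original file with `pdftk <file> dump_data`;
# if that fails, it asks Ghostscript to recreate the PDF at `repaired`.
# `run` is injectable so the logic can be exercised without the tools.
def readable_pdf(input, repaired,
                 run: ->(*cmd) { system(*cmd, out: File::NULL, err: File::NULL) })
  return input if run.call("pdftk", input, "dump_data")

  unless run.call(*gs_rewrite_command(input, repaired))
    raise "Ghostscript could not rewrite #{input}"
  end
  repaired
end
```

A caller would then pass `readable_pdf("myfile.pdf", "newfile.pdf")` to whatever code invokes pdftk next. Probing first means files that pdftk already handles are left untouched and are not re-encoded by Ghostscript.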

If someone is willing to apply this solution, then it'd be really good. Otherwise I will make the changes myself and send a pull request.

PS: A sample file which fails to be read is given here: https://www.jstage.jst.go.jp/article/jsmec/45/3/45_3_730/_pdf

shivams • May 01 '15 10:05

I hit a similar error on Windows when the path to the PDF file contained non-Latin characters, such as Chinese. Once I moved the PDF file to a path without Chinese characters, it worked.

mcdlee • Aug 17 '15 02:08

Thanks! That is a very useful comment. The path I had a problem with contained whitespace. I moved the files to another path that doesn't have whitespace.

ahmad-elkomey • Jan 31 '20 16:01

When I changed the path, I could also combine my files. Thank you!

mkyildiz01 • Apr 26 '21 18:04
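
The path workaround reported in the comments (moving the file to a path without non-Latin characters or whitespace) could be automated along these lines. This is a hedged sketch only: the helper names `pdftk_safe_path?` and `with_safe_copy` are hypothetical, and "safe" is assumed here to mean printable ASCII with no whitespace, which is a guess based on the reports above rather than documented pdftk behaviour.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: true when the path contains only printable
# ASCII characters and no whitespace.
def pdftk_safe_path?(path)
  path.match?(/\A[\x21-\x7E]+\z/)
end

# Hypothetical helper: yields a path that should be safe to hand to pdftk.
# If the original path looks problematic, the file is copied to a
# temporary directory under a plain name, and the copy is removed
# when the block finishes.
def with_safe_copy(path)
  return yield(path) if pdftk_safe_path?(path)

  Dir.mktmpdir("pdfocr") do |dir|
    safe = File.join(dir, "input.pdf")
    FileUtils.cp(path, safe)
    yield(safe)
  end
end
```

Usage would look like `with_safe_copy(pdf) { |p| system("pdftk", p, "dump_data") }`. Note that the temporary directory itself may still contain problematic characters (for example, a Windows user name), in which case a different directory would have to be passed to `Dir.mktmpdir`.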