pdfparanoia
pdfminer API has changed
If you run pdfparanoia with the latest pdfminer, it fails with:
Traceback (most recent call last):
  File "/bin/pdfparanoia", line 38, in <module>
    outputcontent = pdfparanoia.scrub(StringIO(Args.in_pdf.read()), verbose=verbose)
  File "/usr/lib/python2.7/site-packages/pdfparanoia/core.py", line 53, in scrub
    content = plugin.scrub(content, verbose=verbose)
  File "/usr/lib/python2.7/site-packages/pdfparanoia/plugins/aip.py", line 25, in scrub
    pdf = parse_content(content)
  File "/usr/lib/python2.7/site-packages/pdfparanoia/parser.py", line 46, in parse_content
    return parse_pdf(stream)
  File "/usr/lib/python2.7/site-packages/pdfparanoia/parser.py", line 31, in parse_pdf
    doc = pdfminer.pdfparser.PDFDocument()
AttributeError: 'module' object has no attribute 'PDFDocument'
As suggested in https://github.com/timClicks/slate/issues/5, you can work around this by installing an older pdfminer:
pip install --upgrade --ignore-installed slate==0.3 pdfminer==20110515
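
A longer-term fix would probably be for parser.py to look up PDFDocument wherever the installed pdfminer provides it, instead of pinning an old release. Below is a minimal sketch of that lookup, not a tested patch. It assumes newer pdfminer releases expose PDFDocument from a pdfminer.pdfdocument module; the traceback only confirms that it is no longer in pdfminer.pdfparser. The names find_pdfdocument and CANDIDATE_MODULES are hypothetical and not part of pdfparanoia.

```python
import importlib

# Module names to try, newest layout first. ASSUMPTION: newer pdfminer
# releases provide PDFDocument in pdfminer.pdfdocument; the traceback only
# shows that it is missing from pdfminer.pdfparser.
CANDIDATE_MODULES = ("pdfminer.pdfdocument", "pdfminer.pdfparser")


def find_pdfdocument(import_module=importlib.import_module):
    """Return (module_name, PDFDocument class) from the first module that has it.

    import_module is injectable so the lookup can be exercised without
    pdfminer installed.
    """
    for name in CANDIDATE_MODULES:
        try:
            module = import_module(name)
        except ImportError:
            continue  # this module does not exist in the installed pdfminer
        cls = getattr(module, "PDFDocument", None)
        if cls is not None:
            return name, cls
    raise ImportError(
        "PDFDocument not found in any of: " + ", ".join(CANDIDATE_MODULES)
    )
```

This only locates the class. If the constructor or initialization sequence also differs between pdfminer versions, parse_pdf would still need a branch on the returned module name to build the document correctly.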