pdfminer has changed its API and broken some links

Open • Gijs-Koot opened this issue 11 years ago • 1 comment

The euske/pdfminer repository has moved the PDFDocument class, as noted in its README. That class is easy to find again, but other things have changed as well, as can be deduced from the following error message. I will not pursue this any further and will use pdfminer directly.

[..]/pdfsplitter/content_extractor/pdfreader/util/convert.py in <module>()
        2 from pdfminer.pdfparser import PDFParser
        3 from pdfminer.pdfdocument import PDFDocument
----> 4 from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter, process_pdf
        5 from pdfminer.pdfdevice import PDFDevice, TagExtractor
        6 from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter

ImportError: cannot import name process_pdf

Gijs-Koot, Jan 07 '14 18:01

https://github.com/Micka33/content-extractor/pull/2 should fix this problem

thuutin, Nov 15 '15 16:11
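
For anyone who hits the same ImportError and cannot pick up the pull request: the traceback shows that process_pdf can no longer be imported from pdfminer.pdfinterp. Below is a rough sketch of one way to get the same text extraction by iterating pages explicitly. It assumes a pdfminer release (or pdfminer.six) that ships pdfminer.pdfpage.PDFPage. The helper name pdf_to_text is made up for illustration and is not part of pdfminer or content-extractor, and this is not necessarily what the pull request does.

```python
import io


def pdf_to_text(path, password=""):
    """Return the text of the PDF at `path` (hypothetical helper)."""
    # Imported lazily so this module still loads where pdfminer is absent.
    # Assumption: the installed pdfminer provides pdfminer.pdfpage.PDFPage.
    from pdfminer.converter import TextConverter
    from pdfminer.layout import LAParams
    from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer.pdfpage import PDFPage

    rsrcmgr = PDFResourceManager()
    out = io.BytesIO()  # binary buffer; the converter encodes with `codec`
    device = TextConverter(rsrcmgr, out, codec="utf-8", laparams=LAParams())
    try:
        interpreter = PDFPageInterpreter(rsrcmgr, device)
        with open(path, "rb") as fp:
            # Page-by-page loop standing in for the old process_pdf call.
            for page in PDFPage.get_pages(fp, password=password):
                interpreter.process_page(page)
        # Read the buffer before closing the device.
        text = out.getvalue().decode("utf-8")
    finally:
        device.close()
    return text
```

Usage would be along the lines of `text = pdf_to_text("some.pdf")`, where the file name is a placeholder.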