content-extractor pdfminer has changed its API and broken some links

Open Gijs-Koot opened this issue 11 years ago • 1 comment

The euske/pdfminer repository has moved the PDFDocument class, as noted in its README. The class is easy to find again, but other things have changed as well, as can be deduced from the following error message. I will not pursue this any further and will use pdfminer directly instead.

[..]/pdfsplitter/content_extractor/pdfreader/util/convert.py in <module>()
        2 from pdfminer.pdfparser import PDFParser
        3 from pdfminer.pdfdocument import PDFDocument
----> 4 from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter, process_pdf
        5 from pdfminer.pdfdevice import PDFDevice, TagExtractor
        6 from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter

ImportError: cannot import name process_pdf
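
For anyone who wants to keep code working across both pdfminer layouts, a relocated class can be looked up by trying each candidate module in turn. This is only a minimal sketch: `import_first` and `load_pdfdocument` are hypothetical helper names, not part of pdfminer or content-extractor, and the module list assumes PDFDocument lives in `pdfminer.pdfdocument` in newer releases and in `pdfminer.pdfparser` in older ones.

```python
import importlib


def import_first(name, module_names):
    """Return attribute `name` from the first listed module that provides it."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module not present in this install, try the next candidate
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        "cannot import name %s from any of: %s" % (name, ", ".join(module_names))
    )


def load_pdfdocument():
    """Hypothetical helper: try the new PDFDocument location first, then the old one."""
    return import_first("PDFDocument", ["pdfminer.pdfdocument", "pdfminer.pdfparser"])
```

This only helps with names that moved. `process_pdf` appears to have been removed rather than moved, so the call sites that use it would still need to be rewritten.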

Jan 07 '14 18:01 Gijs-Koot

https://github.com/Micka33/content-extractor/pull/2 should fix this problem

Nov 15 '15 16:11 thuutin
