content-extractor issues

Results 3 content-extractor issues

IndexError: list index out of range

I encountered this problem, though the examples are processed successfully.

  File "general.py", line 12, in <module>
    json = main.run("Programming with PDFMiner.pdf", "./images/")
  File "D:\codegit\python2.7\pdfminer\content-extractor-master\pdfreader\main.py", line 82, in run
    dict_book = text_to_dict(pdf_file)...

aristotll

Handle non-ASCII documents and JSON outputs

I'm not sure how the original code could handle UTF-8 input files. Buffering characters in Unicode ensured I could convert mine, producing UTF-8 JSON output (io.cStringIO does accept Unicode while...

ymollard
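
A minimal sketch of the buffering idea described in the issue above, written for Python 3 rather than the Python 2.7 setup the issues mention. The function name `dump_json_utf8` is hypothetical and is not part of content-extractor; it only illustrates collecting text in a Unicode buffer and encoding once at the end.

```python
import io
import json


def dump_json_utf8(data):
    """Serialize `data` to JSON via a Unicode text buffer, then encode as UTF-8.

    Hypothetical helper for illustration; not taken from content-extractor.
    """
    buffer = io.StringIO()  # text buffer that accepts Unicode strings
    # ensure_ascii=False keeps non-ASCII characters instead of \uXXXX escapes
    buffer.write(json.dumps(data, ensure_ascii=False))
    return buffer.getvalue().encode("utf-8")
```

Encoding only at the output boundary keeps all intermediate processing in Unicode, which is the usual way to avoid mixed byte/text errors.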

pdfminer has changed its API and broken some links

The euske/pdfminer repository has changed the location of the PDFDocument class, as noted in the README. The class is easy to find again, but other things have changed as well...

Gijs-Koot
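
One way to cope with a class that moved between modules, as in the issue above, is to try several import locations in order. The sketch below is a generic helper; `import_first` is a hypothetical name, and the pdfminer module paths in the trailing comment are assumptions based on the move the issue describes, not something verified against a specific pdfminer release.

```python
import importlib


def import_first(attr, module_names):
    """Return `attr` from the first module in `module_names` that provides it."""
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # module missing in this version, try the next candidate
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(
        "%s not found in any of: %s" % (attr, ", ".join(module_names))
    )


# Assumed module paths (newer location first, older one as a fallback):
# PDFDocument = import_first(
#     "PDFDocument", ["pdfminer.pdfdocument", "pdfminer.pdfparser"]
# )
```

This only handles relocated names; it does not help if signatures or behavior changed, which the issue suggests is also the case.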
