pdfminer

Python PDF Parser (Not actively maintained). Check out pdfminer.six.

Results 100 pdfminer issues

Hi! Since Python 2 is dead and both projects now have the same goal, do they still need to be independent? See #210 and #243.

Code:
```
from pdfminer.converter import XMLConverter
rsrcmgr = PDFResourceManager()
xmlstream = StringIO()
device = XMLConverter(rsrcmgr, xmlstream)
# xmlstream now contains something like:
# ''
```
While that encoding is pretty...

First of all, thanks for this great tool for parsing PDFs. I am having trouble extracting text from two-column pages in a PDF (a research paper). In such cases,...

`pip install pdfminer` Error Running setup.py install for pdfminer ... error Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-eyesmn/pdfminer/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-MuNlNG-record/install-record.txt --single-version-externally-managed --compile" failed with...

Hi. I get an error when processing pages in some PDF files. Code: ``` fp = open(filename, 'rb') # Create a PDF parser object associated with the file object. parser...

Given a PDF file, how can I tell whether it is a native PDF or a scanned PDF using `pdfminer`? Any suggestions?
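One possible heuristic for the question above, as a rough sketch rather than anything built into `pdfminer`: treat a file as scanned when most of its pages hold an image but almost no extractable text. The names `PageStats`, `looks_scanned`, and `collect_stats`, along with the `min_chars` threshold and the 50% cutoff, are hypothetical choices made for illustration. `collect_stats` assumes pdfminer.six is installed and uses its `extract_pages` helper, which the original pdfminer does not provide.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PageStats:
    """Per-page counts; a hypothetical helper type for this sketch."""
    text_chars: int  # non-whitespace characters found in text containers
    images: int      # embedded images found on the page


def looks_scanned(pages: Iterable[PageStats], min_chars: int = 20) -> bool:
    """Guess 'scanned' when more than half the pages are image-only.

    A page counts as image-only if it has at least one image and fewer
    than `min_chars` text characters. Both thresholds are arbitrary.
    """
    pages = list(pages)
    if not pages:
        return False
    image_only = sum(
        1 for p in pages if p.images > 0 and p.text_chars < min_chars
    )
    return image_only / len(pages) > 0.5


def collect_stats(path: str) -> List[PageStats]:
    """Gather PageStats using pdfminer.six (assumed installed)."""
    # Imported lazily so looks_scanned() stays usable without the library.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTFigure, LTImage, LTTextContainer

    def walk(obj):
        """Return (text_chars, images) for one layout object."""
        chars = images = 0
        if isinstance(obj, LTTextContainer):
            chars = sum(1 for c in obj.get_text() if not c.isspace())
        elif isinstance(obj, LTImage):
            images = 1
        elif isinstance(obj, LTFigure):
            # Figures can nest images, so descend into their children.
            for child in obj:
                c, i = walk(child)
                chars += c
                images += i
        return chars, images

    stats = []
    for layout in extract_pages(path):
        chars = images = 0
        for element in layout:
            c, i = walk(element)
            chars += c
            images += i
        stats.append(PageStats(chars, images))
    return stats
```

Calling `looks_scanned(collect_stats("paper.pdf"))` would apply the heuristic to a file. Note that a scanned PDF that has been through OCR carries a text layer and would be reported as native by this check.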

`pdffonts` can collect all fonts used in a pdf file, e.g. [Link](https://stackoverflow.com/questions/11820241/) ```bash pdffonts bash-manpage.pdf name type encoding emb sub uni object ID ------------------------------- ------------- --------------- --- --- --- ---------...

``` $ git clone git@github.com:euske/pdfminer.git $ cd pdfminer $ python3 ./tools/dumppdf.py Traceback (most recent call last): File "./tools/dumppdf.py", line 17, in <module> from pdfminer.utils import isnumber, q ImportError: cannot import name...