pdfminer.six

Community maintained fork of pdfminer - we fathom PDF

Results 302 pdfminer.six issues

The command `pdf2txt.py-3 -t tag PDFFILE.pdf` always gives the following error: `AttributeError: 'dict' object has no attribute 'iteritems'`. PDF test file attached: [20200213CAPDJETJRJ_298.pdf](https://github.com/pdfminer/pdfminer.six/files/4199179/20200213CAPDJETJRJ_298.pdf). Using Python 3.6 on CentOS 7. Full...

status: needs solution
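
For context, `dict.iteritems()` exists only in Python 2; Python 3 dictionaries provide `items()` instead, which would explain the error under Python 3.6. A minimal sketch of the difference follows. The `attrs` dictionary is made up for illustration and is not taken from pdfminer.six's source.

```python
# Hypothetical data for illustration; not pdfminer.six code.
attrs = {"tag": "P", "page": 1}

# Python 3 dicts have no iteritems(), so the attribute lookup fails
# in the same way as in the report.
try:
    attrs.iteritems()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# items() works on Python 3 and yields (key, value) pairs.
pairs = sorted(attrs.items())

print(error_message)
print(pairs)
```

Replacing `iteritems()` with `items()` is the usual Python 3 fix for this class of error; whether that is the right change at the failing call site depends on the full traceback, which is cut off above.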

First, thank you for this amazing library. I currently have an issue with a particular PDF file. I have a PDF with both English and Tamil text; for the page...

component: converter
type:performance
status: needs solution

These should IMHO be made available only as an option and in a separate wheel... The current wheel is ~6 MB and almost all of it except 100 KB is...

type:performance
component:characters
status: needs solution
type: security

Hi, I've been working with some PDFs in PDFMiner that, when processed, cause very high memory usage. This seems to be due to the number of objects created by analyzing...

type:performance
status: accepted
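
One way to put a number on memory growth of this kind is the standard-library `tracemalloc` module. The sketch below is only an illustration: `peak_memory_bytes` and `build_many_objects` are hypothetical names, and the workload is a stand-in for a real PDF-processing call.

```python
import tracemalloc


def peak_memory_bytes(workload):
    """Run workload() and return (result, peak traced memory in bytes)."""
    tracemalloc.start()
    try:
        result = workload()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_many_objects():
    # Hypothetical stand-in for a call that creates many small objects.
    return [(i, str(i)) for i in range(50_000)]


objects, peak = peak_memory_bytes(build_many_objects)
print(f"{len(objects)} objects, peak {peak / 1e6:.1f} MB")
```

Wrapping the real processing call in place of `build_many_objects` would show whether peak usage scales with the number of objects on a page.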

**Feature request** Would it be possible to install the two console scripts `pdf2txt.py` and `dumppdf.py` without their `.py` file extension? Most Python packages that I'm aware of provide their console...

type: ux
status: accepted

I have been troubleshooting a significant performance issue using PDFMiner to extract text from certain utility bills. While investigating, I attempted to use cProfile on pdf2txt.py to see what was...

type: bug
status: accepted
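
As a rough illustration of the profiling approach mentioned above, the sketch below profiles a single call with the standard-library `cProfile` and `pstats` modules. `profile_call` and `slow_extraction` are hypothetical names; the latter is a stand-in for the real extraction step, not pdfminer.six code.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Profile one call and return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def slow_extraction(n):
    # Hypothetical stand-in for a slow text-extraction routine.
    return sum(len(str(i)) for i in range(n))


result, report = profile_call(slow_extraction, 10_000)
print(report)
```

Profiling a function call directly, rather than the installed script, sidesteps any question of how the script itself is launched.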

**Bug report** The "How to use" section says to run the script [like this](https://github.com/pdfminer/pdfminer.six/blob/develop/README.md#how-to-use): `python pdf2txt.py ...`. However, after installing it through pip, that doesn't work: ``` $ python pdf2txt.py /usr/local/Cellar/python@3.9/3.9.6/libexec/bin/python:...

status: accepted

**Bug report** When I use `pdf2txt` on a specific PDF file, I get some sentences printed out three times: ``` high volatility of their newly traded tokens. By immediately allowing...

type: bug
status: accepted

**Bug report** ## A description of the bug Somewhere in `pdfminer.six` there appears to be a memory leak; even after a Python process is done handling a PDF, the memory...

status: needs solution

- A description of the feature you would like to have input from standard input (stdin) - If relevant, the context that you are in. What are you trying to...

type: new feature
status: accepted
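
A sketch of one way a script could accept a PDF on standard input: PDF parsers generally need a seekable file, while a pipe is not seekable, so the bytes are first copied into an in-memory buffer. The helper names `read_binary_stream` and `looks_like_pdf` are hypothetical, and the sample bytes stand in for `sys.stdin.buffer`; this is not how pdfminer.six implements the feature.

```python
import io


def read_binary_stream(stream):
    """Copy all bytes from a binary stream into a seekable in-memory buffer."""
    return io.BytesIO(stream.read())


def looks_like_pdf(buffer):
    """Check for the '%PDF-' file header, then rewind the buffer."""
    header = buffer.read(5)
    buffer.seek(0)
    return header == b"%PDF-"


# Stand-in for sys.stdin.buffer so the example runs without piped input.
fake_stdin = io.BytesIO(b"%PDF-1.4\n% truncated sample, not a valid PDF\n")

pdf_buffer = read_binary_stream(fake_stdin)
print("PDF header found:", looks_like_pdf(pdf_buffer))
print("Buffer is seekable:", pdf_buffer.seekable())
```

In a real script, `sys.stdin.buffer` would be passed instead of `fake_stdin`, and the resulting buffer handed to whatever parser expects a binary file object.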