pdfminer.six
                                
                                
                                
                        Community maintained fork of pdfminer - we fathom PDF
The command `pdf2txt.py-3 -t tag PDFFILE.pdf` always gives the following error: `AttributeError: 'dict' object has no attribute 'iteritems'`. PDF test file attached: [20200213CAPDJETJRJ_298.pdf](https://github.com/pdfminer/pdfminer.six/files/4199179/20200213CAPDJETJRJ_298.pdf). Using Python 3.6 on CentOS 7. Full...
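For context on the error above: Python 3 dictionaries have no `iteritems()` method, and `items()` is the usual replacement. The following is a minimal sketch of that behavior only. The `attrs` dictionary is hypothetical sample data, and this is not the code path inside `pdf2txt.py` that raised the error.

```python
# Hypothetical sample data; not taken from the attached PDF or from pdfminer.six.
attrs = {"Type": "Page", "Count": 3}

# On Python 3, calling the Python 2 method raises AttributeError.
try:
    attrs.iteritems()
    raised = False
    message = ""
except AttributeError as exc:
    raised = True
    message = str(exc)

# items() works on Python 3 and returns a view over (key, value) pairs.
pairs = sorted(attrs.items())

print(raised, message)
print(pairs)
```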
First, thank you for this amazing library. I currently have an issue with a particular PDF file. I have a PDF with both English and Tamil text, for the page...
These should IMHO be made available only as an option and in a separate wheel... The current wheel is ~6 MB and almost all of it except 100 KB is...
Hi, I've been working with some PDFs in PDFMiner that cause very high memory usage when processed. This seems to be due to the number of objects created by analyzing...
**Feature request** Would it be possible to install the two console scripts `pdf2txt.py` and `dumppdf.py` without their `.py` file extension? Most Python packages that I'm aware of provide their console...
I have been troubleshooting a significant performance issue using PDFMiner to extract text from certain utility bills. While investigating, I attempted to use cProfile on pdf2txt.py to see what was...
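Profiling of the kind described above can be sketched with the standard library's `cProfile` and `pstats` modules. This is an illustration under stated assumptions: `extract_text_stub` is a hypothetical stand-in for a slow routine, not a pdfminer.six function, and the sketch does not reproduce the reporter's setup.

```python
import cProfile
import io
import pstats


def extract_text_stub(n):
    # Hypothetical stand-in for a slow extraction routine; not pdfminer.six code.
    return "".join(str(i % 10) for i in range(n))


# Collect timing data only around the call of interest.
profiler = cProfile.Profile()
profiler.enable()
result = extract_text_stub(10_000)
profiler.disable()

# Write the five entries with the highest cumulative time into a string buffer.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the high-level call that dominates a run, which is usually the first thing to look at before drilling into individual functions.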
**Bug report** The How to use section says to run the script [like this](https://github.com/pdfminer/pdfminer.six/blob/develop/README.md#how-to-use): `python pdf2txt.py ...`. However, after installing it through pip, that doesn't work: ``` $ python pdf2txt.py /usr/local/Cellar/python@3.9/3.9.6/libexec/bin/python:...
**Bug report** When I use `pdf2txt` on a specific PDF file, I get some sentences printed out three times: ``` high volatility of their newly traded tokens. By immediately allowing...
Memory leak?
**Bug report** ## A description of the bug Somewhere in `pdfminer.six` there appears to be a memory leak; even after a Python process is done handling a PDF, the memory...
- A description of the feature you would like to have: input from standard input (stdin)
- If relevant, the context that you are in. What are you trying to...