pdfminer.six
It seems that this project is not compatible with 'pdfminer'.
I installed 'pdfminer.six' in my local venv. Another project I installed, 'docassemble-base', lists 'pdfminer' among its dependencies. The later installation of 'pdfminer' seems to have overwritten, by default, some modules of the same name from 'pdfminer.six'. The following is a list of some of the overwritten modules:
pdfminer/glyphlist.py
pdfminer/pdfpage.py
pdfminer/latin_enc.py
pdfminer/pdftypes.py
pdfminer/pdfdevice.py
pdfminer/ccitt.py
pdfminer/utils.py
pdfminer/pdffont.py
pdfminer/runlength.py
pdfminer/pdfinterp.py
pdfminer/encodingdb.py
pdfminer/image.py
pdfminer/layout.py
pdfminer/pdfparser.py
pdfminer/arcfour.py
pdfminer/pdfcolor.py
pdfminer/pdfdocument.py
pdfminer/lzw.py
pdfminer/psparser.py
pdfminer/converter.py
pdfminer/cmapdb.py
pdfminer/fontmetrics.py
pdfminer/ascii85.py
This behavior broke my virtual environment, and some previously working projects now report errors when run.
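For anyone who wants to check their own environment for this kind of overlap, here is a minimal sketch. It assumes Python 3.8+ (for `importlib.metadata`) and that pip recorded a file list for each installed distribution. The helper names `find_shared_files` and `installed_dist_files` are made up for illustration and are not part of pdfminer.six or pip.

```python
from collections import defaultdict
from importlib import metadata


def find_shared_files(dist_files):
    """Map each file path to the distributions that claim it, keeping only conflicts.

    dist_files: iterable of (distribution_name, iterable_of_file_paths).
    """
    owners = defaultdict(set)
    for name, files in dist_files:
        for path in files:
            owners[str(path)].add(name)
    # A path claimed by more than one distribution is a potential overwrite.
    return {path: sorted(names) for path, names in owners.items() if len(names) > 1}


def installed_dist_files(prefix="pdfminer/"):
    """Yield (name, files) for installed distributions that ship files under prefix."""
    for dist in metadata.distributions():
        name = dist.metadata["Name"] or "<unknown>"
        # dist.files is None when the distribution has no recorded file list.
        files = [str(f) for f in (dist.files or []) if str(f).startswith(prefix)]
        if files:
            yield name, files


if __name__ == "__main__":
    for path, names in sorted(find_shared_files(installed_dist_files()).items()):
        print(f"{path}: {', '.join(names)}")
```

Run inside the affected venv, this should print every file under pdfminer/ that more than one installed distribution claims to own. It only reads the metadata pip recorded, so it cannot tell which distribution's copy is actually on disk.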
This might be the same issue. Unfortunately, knowing nothing about Python, I can only help debug by typing in whatever commands you tell me to. I followed the instructions on your main page and got the error "No module named 'pdfminer'".
gtoal@linux:~/src/pdf_to_text/pdfminer/tools$ python3 -v < /dev/null|&grep ^Python
Python 3.6.9 (default, Dec 8 2021, 21:08:43)
gtoal@linux:~/src/pdf_to_text/pdfminer/tools$ pip install pdfminer.six
Collecting pdfminer.six
Using cached https://files.pythonhosted.org/packages/cb/83/200b2723bcbf1d1248a8a7d16e6dd6cb970b5331397b11948428d7ebcf37/pdfminer.six-20191110-py2.py3-none-any.whl
Collecting six (from pdfminer.six)
Using cached https://files.pythonhosted.org/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl
Collecting sortedcontainers (from pdfminer.six)
Using cached https://files.pythonhosted.org/packages/32/46/9cb0e58b2deb7f82b84065f37f3bffeb12413f947f9388e4cac22c4621ce/sortedcontainers-2.4.0-py2.py3-none-any.whl
Collecting pycryptodome (from pdfminer.six)
Installing collected packages: six, sortedcontainers, pycryptodome, pdfminer.six
Successfully installed pdfminer.six-20191110 pycryptodome-3.17 six-1.16.0 sortedcontainers-2.4.0
gtoal@linux:~/src/pdf_to_text/pdfminer/tools$ python3 pdf2txt.py ../../Robertson1981.pdf
Traceback (most recent call last):
File "pdf2txt.py", line 9, in <module>
import pdfminer.high_level
ModuleNotFoundError: No module named 'pdfminer'
gtoal@linux:~/src/pdf_to_text/pdfminer/tools$ ./pdf2txt.py ../../Robertson1981.pdf
Traceback (most recent call last):
File "./pdf2txt.py", line 9, in <module>
import pdfminer.high_level
ModuleNotFoundError: No module named 'pdfminer'
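One possible explanation for the log above (an assumption, it does not prove this) is that the `pip` on the PATH installs into a different interpreter than the one `python3` starts. A small sketch to check what `python3` itself can see; `describe_module` is a hypothetical helper name used only here:

```python
import importlib.util
import sys


def describe_module(name):
    """Return a one-line report on where a top-level module would be imported from."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name}: not importable by {sys.executable}"
    # Regular packages have an origin (their __init__.py); namespace packages
    # only have search locations.
    location = spec.origin or list(spec.submodule_search_locations or [])
    return f"{name}: found at {location}"


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print(describe_module("pdfminer"))
```

If this reports that pdfminer is not importable, running `python3 -m pip install pdfminer.six` makes sure the package is installed for that same interpreter.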
In fact, I found that this is due to pip's code logic. Since I installed pdfminer locally first, the contents of the pdfminer/__init__.py file in the local environment are as follows:
(testpip) ➜ testpip cat lib/python3.10/site-packages/pdfminer/__init__.py
#!/usr/bin/env python
__version__ = '20191125'
if __name__ == '__main__':
print(__version__)
Then I installed pdfminer.six, and pip overwrote some files by default, without any warning. The content of pdfminer/__init__.py changed as well.
(testpip) ➜ testpip pip install pdfminer.six -i https://pypi.tuna.tsinghua.edu.cn/simple
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting pdfminer.six
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/46/68/b3fb5f073bcd3df9143a3520289c147351bfa3c1b096d44081f38fd1c247/pdfminer.six-20221105-py3-none-any.whl (5.6 MB)
Requirement already satisfied: charset-normalizer>=2.0.0 in ./lib/python3.10/site-packages (from pdfminer.six) (3.1.0)
Requirement already satisfied: cryptography>=36.0.0 in ./lib/python3.10/site-packages (from pdfminer.six) (40.0.2)
Requirement already satisfied: cffi>=1.12 in ./lib/python3.10/site-packages (from cryptography>=36.0.0->pdfminer.six) (1.15.1)
Requirement already satisfied: pycparser in ./lib/python3.10/site-packages (from cffi>=1.12->cryptography>=36.0.0->pdfminer.six) (2.21)
Installing collected packages: pdfminer.six
Successfully installed pdfminer.six-20221105
[notice] A new release of pip available: 22.2.2 -> 23.1.2
[notice] To update, run: pip install --upgrade pip
(testpip) ➜ testpip cat lib/python3.10/site-packages/pdfminer/__init__.py
__version__ = "20221105" # auto replaced with tag in github actions
if __name__ == "__main__":
print(__version__)
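To see the mismatch without reading the file by hand, the versions pip has on record can be compared with the `__version__` of the module that actually gets imported. This is only a sketch, assuming Python 3.8+; `installed_versions` and `module_version` are illustrative names, not an existing API.

```python
import importlib
from importlib import metadata


def installed_versions(names=("pdfminer", "pdfminer.six")):
    """Return {distribution name: version, or None if pip has no record of it}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def module_version(module_name="pdfminer"):
    """Return the __version__ of the importable module, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    print("recorded by pip:", installed_versions())
    print("pdfminer.__version__ on disk:", module_version())
```

In an environment like the one above, both distributions would presumably still be listed as installed, while the module on disk carries only the version of whichever one was installed last.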