It seems that the project is not compatible with 'pdfminer'.

Open unsatisfying opened this issue 1 year ago • 2 comments

I installed 'pdfminer.six' in my local venv. Another project I installed, 'docassemble-base', lists 'pdfminer' among its dependencies. Both distributions install into the same 'pdfminer' package directory, so the later installation of 'pdfminer' silently overwrote the modules of the same name that came from 'pdfminer.six'. The following is a list of some of the overwritten modules:

pdfminer/glyphlist.py
pdfminer/pdfpage.py
pdfminer/latin_enc.py
pdfminer/pdftypes.py
pdfminer/pdfdevice.py
pdfminer/ccitt.py
pdfminer/utils.py
pdfminer/pdffont.py
pdfminer/runlength.py
pdfminer/pdfinterp.py
pdfminer/encodingdb.py
pdfminer/image.py
pdfminer/layout.py
pdfminer/pdfparser.py
pdfminer/arcfour.py
pdfminer/pdfcolor.py
pdfminer/pdfdocument.py
pdfminer/lzw.py
pdfminer/psparser.py
pdfminer/converter.py
pdfminer/cmapdb.py
pdfminer/fontmetrics.py
pdfminer/ascii85.py

This broke my virtual environment: some previously working projects now report errors when they run.
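One way to confirm this kind of clash is to ask the installed package metadata which distributions claim files under `pdfminer/`. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper names `owners_of_package` and `installed_dists` are made up for illustration and are not part of pip or pdfminer.six.

```python
from importlib import metadata


def owners_of_package(package, dists):
    """Return sorted names of distributions that ship files under `package`/.

    `dists` is an iterable of (distribution_name, iterable_of_file_paths).
    """
    prefix = package + "/"
    owners = set()
    for name, files in dists:
        # Normalise separators so the prefix test also works for Windows-style paths.
        if any(str(path).replace("\\", "/").startswith(prefix) for path in files):
            owners.add(name)
    return sorted(owners)


def installed_dists():
    """Yield (name, files) for every distribution in the current environment."""
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or "<unknown>"
        # dist.files is None when the distribution recorded no file list.
        yield name, dist.files or []


if __name__ == "__main__":
    owners = owners_of_package("pdfminer", installed_dists())
    print(owners)
    if len(owners) > 1:
        print("Conflict: more than one distribution installs into pdfminer/")
```

If both 'pdfminer' and 'pdfminer.six' appear in the result, the two distributions are sharing one directory in that environment, and whichever was installed last determines the contents of the overlapping files.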

unsatisfying avatar Apr 04 '23 09:04 unsatisfying

This might be the same issue. Unfortunately, knowing nothing about Python, I can only help debug by typing in whatever commands you tell me to. I followed the instructions on your main page and got the error "No module named 'pdfminer'".

gtoal@linux:~/src/pdf_to_text/pdfminer/tools$ python3 -v < /dev/null|&grep ^Python
Python 3.6.9 (default, Dec  8 2021, 21:08:43)
gtoal@linux:~/src/pdf_to_text/pdfminer/tools$ pip install pdfminer.six
Collecting pdfminer.six
  Using cached https://files.pythonhosted.org/packages/cb/83/200b2723bcbf1d1248a8a7d16e6dd6cb970b5331397b11948428d7ebcf37/pdfminer.six-20191110-py2.py3-none-any.whl
Collecting six (from pdfminer.six)
  Using cached https://files.pythonhosted.org/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl
Collecting sortedcontainers (from pdfminer.six)
  Using cached https://files.pythonhosted.org/packages/32/46/9cb0e58b2deb7f82b84065f37f3bffeb12413f947f9388e4cac22c4621ce/sortedcontainers-2.4.0-py2.py3-none-any.whl
Collecting pycryptodome (from pdfminer.six)
Installing collected packages: six, sortedcontainers, pycryptodome, pdfminer.six
Successfully installed pdfminer.six-20191110 pycryptodome-3.17 six-1.16.0 sortedcontainers-2.4.0
gtoal@linux:~/src/pdf_to_text/pdfminer/tools$ python3 pdf2txt.py ../../Robertson1981.pdf
Traceback (most recent call last):
  File "pdf2txt.py", line 9, in <module>
    import pdfminer.high_level
ModuleNotFoundError: No module named 'pdfminer'
gtoal@linux:~/src/pdf_to_text/pdfminer/tools$ ./pdf2txt.py ../../Robertson1981.pdf
Traceback (most recent call last):
  File "./pdf2txt.py", line 9, in <module>
    import pdfminer.high_level
ModuleNotFoundError: No module named 'pdfminer'
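A possible cause of this error, not confirmed anywhere in this thread, is that the `pip` command installs into a different interpreter than the one `python3` starts. A small diagnostic sketch that may help narrow it down is shown here; the function name `describe_module` is hypothetical, and the code only uses the standard library so it should run on Python 3.6.

```python
import importlib.util
import sys


def describe_module(name):
    """Return a one-line report on where top-level module `name` would load from."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "{}: not importable from {}".format(name, sys.executable)
    return "{}: found at {}".format(name, spec.origin)


if __name__ == "__main__":
    # Shows which interpreter is running and whether it can see pdfminer.
    print("interpreter:", sys.executable)
    print(describe_module("pdfminer"))
```

Run it with the same command used for `pdf2txt.py` (here `python3`). If it reports that `pdfminer` is not importable, installing with `python3 -m pip install pdfminer.six` makes sure the package goes into that interpreter's site-packages.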

gtoal avatar May 01 '23 22:05 gtoal

In fact, I found that this is caused by how pip installs files. I installed pdfminer first, so the pdfminer/__init__.py file in the local environment initially looked like this:

(testpip) ➜  testpip cat lib/python3.10/site-packages/pdfminer/__init__.py
#!/usr/bin/env python
__version__ = '20191125'

if __name__ == '__main__':
    print(__version__)

Then I installed pdfminer.six. By default, pip overwrites files that already exist, without any warning. The content of pdfminer/__init__.py changed as well:

(testpip) ➜  testpip pip install pdfminer.six -i https://pypi.tuna.tsinghua.edu.cn/simple
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting pdfminer.six
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/46/68/b3fb5f073bcd3df9143a3520289c147351bfa3c1b096d44081f38fd1c247/pdfminer.six-20221105-py3-none-any.whl (5.6 MB)
Requirement already satisfied: charset-normalizer>=2.0.0 in ./lib/python3.10/site-packages (from pdfminer.six) (3.1.0)
Requirement already satisfied: cryptography>=36.0.0 in ./lib/python3.10/site-packages (from pdfminer.six) (40.0.2)
Requirement already satisfied: cffi>=1.12 in ./lib/python3.10/site-packages (from cryptography>=36.0.0->pdfminer.six) (1.15.1)
Requirement already satisfied: pycparser in ./lib/python3.10/site-packages (from cffi>=1.12->cryptography>=36.0.0->pdfminer.six) (2.21)
Installing collected packages: pdfminer.six
Successfully installed pdfminer.six-20221105

[notice] A new release of pip available: 22.2.2 -> 23.1.2
[notice] To update, run: pip install --upgrade pip
(testpip) ➜  testpip cat lib/python3.10/site-packages/pdfminer/__init__.py
__version__ = "20221105"  # auto replaced with tag in github actions

if __name__ == "__main__":
    print(__version__)
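The same comparison can be scripted: the version pip recorded for each distribution can be checked against the `__version__` of the `pdfminer` module that is actually on disk. This is a rough sketch assuming Python 3.8 or later; `recorded_version` and `stale_distributions` are illustrative names, not an existing API.

```python
from importlib import metadata


def recorded_version(dist_name):
    """Return the version pip recorded for `dist_name`, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def stale_distributions(module_version, recorded):
    """Return names whose recorded version differs from the module on disk.

    `recorded` maps distribution name -> version string (None if not installed).
    """
    return sorted(
        name
        for name, version in recorded.items()
        if version is not None and version != module_version
    )


if __name__ == "__main__":
    try:
        import pdfminer  # whichever copy currently sits in site-packages
    except ImportError:
        print("pdfminer is not importable in this environment")
    else:
        recorded = {n: recorded_version(n) for n in ("pdfminer", "pdfminer.six")}
        print(pdfminer.__version__, recorded)
        print(stale_distributions(pdfminer.__version__, recorded))
```

In the situation shown above, the module reports 20221105 while pip still has 'pdfminer' recorded at 20191125, so 'pdfminer' would be listed as the distribution whose files were overwritten.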

unsatisfying avatar May 02 '23 03:05 unsatisfying