polyglot
UnicodeDecodeError during installation (PIP, Python 3.5, Windows 10)
Hello!
I have the following error while trying to install the latest version of polyglot via pip:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Aleksey\AppData\Local\Temp\pycharm-packaging\polyglot\setup.py", line 15, in <module>
readme = readme_file.read()
File "C:\Program Files\Python35\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 4941: character maps to <undefined>
From a web search, it seems this error arises because the current build available through pip doesn't support Windows (see here and here).
If you could fix this, it would be great and would make polyglot more convenient to use :) Thank you!
The longer I try to make it work on Windows (even when it was installed, using it on Windows is a complete disaster), the more I am convinced that it is intended only for Linux.
Hello @alex2304, it may be true that it is more tested on Linux, but to me it's quite strange that "readme_file.read()" tries to use cp1252 to read the file. The standard Python configuration should always use utf-8. Where does your Python installation come from? However, if you're not able to work through this kind of issue, I'm not sure polyglot is mature enough for you (from my own experience, it still has some rough edges).
@alexgarel Python opens files using the encoding returned by locale.getpreferredencoding(), which is 'cp1252' on Windows. So, for cross-platform compatibility, you should explicitly specify 'utf-8' when reading files.
It's annoying, I agree. But explicit is better than implicit anyway.
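To illustrate the point about explicit encodings, here is a minimal sketch. The helper name `read_text_utf8` and the demo file contents are made up for this example and are not taken from polyglot's setup.py. It writes a small UTF-8 file, shows that reading it as cp1252 fails on byte 0x81, and then reads it with an explicit `encoding="utf-8"`.

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a text file as UTF-8, ignoring the platform's default encoding."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Demo file standing in for a README with non-ASCII text (hypothetical content).
demo_path = os.path.join(tempfile.mkdtemp(), "README.rst")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("polyglot: Ёлка\n")  # "Ё" is encoded in UTF-8 as bytes 0xD0 0x81

# Reading with the Windows default codec fails, because 0x81 is undefined in cp1252.
try:
    with open(demo_path, encoding="cp1252") as f:
        f.read()
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# An explicit UTF-8 read behaves the same on every platform.
readme = read_text_utf8(demo_path)
print("cp1252 failed:", cp1252_failed)
print("utf-8 read ok:", readme.startswith("polyglot"))
```

In a setup.py, the same idea amounts to passing `encoding="utf-8"` to the `open()` call that reads the README.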
This seems to be fixed in the master branch.
To install polyglot on Windows with Python 3.6 or Python 3.7, you will need wheels for two dependencies, PyICU and PyCLD2.
You need to download them and then install them with pip from your local machine.
Here you will find many unofficial Python builds: https://www.lfd.uci.edu/~gohlke/pythonlibs/
- install PyICU.whl
- install PyCLD2.whl
In both cases you will need to choose the build that matches your Windows version and your Python version.
It's easy. For example, for PyICU:
PyICU wraps the ICU (International Components for Unicode) library.
PyICU-2.3.1-cp27-cp27m-win32.whl
PyICU-2.3.1-cp27-cp27m-win_amd64.whl
PyICU-2.3.1-cp35-cp35m-win32.whl
PyICU-2.3.1-cp35-cp35m-win_amd64.whl
PyICU-2.3.1-cp36-cp36m-win32.whl
PyICU-2.3.1-cp36-cp36m-win_amd64.whl
PyICU-2.3.1-cp37-cp37m-win32.whl
PyICU-2.3.1-cp37-cp37m-win_amd64.whl
The 27 means Python 2.7, the 36 means Python 3.6, and so on. If you have 64-bit Python and Windows, choose the win_amd64 version; otherwise choose the win32 version.
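If you are not sure which tags your interpreter corresponds to, a small sketch like the following can help. The function `wheel_tags` is a hypothetical helper written for this comment, not part of pip. It derives the `cpXY` part from the running interpreter's version and picks between the two Windows platform tags from the interpreter's pointer size. Note that this reflects the bitness of Python itself, and the platform tag it returns is only meaningful on Windows.

```python
import struct
import sys


def wheel_tags(version_info=None, pointer_bits=None):
    """Return (python_tag, platform_tag) as they appear in Windows wheel names."""
    if version_info is None:
        version_info = sys.version_info
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 for a 64-bit interpreter, 32 otherwise.
        pointer_bits = struct.calcsize("P") * 8
    python_tag = "cp{}{}".format(version_info[0], version_info[1])
    platform_tag = "win_amd64" if pointer_bits == 64 else "win32"
    return python_tag, platform_tag


python_tag, platform_tag = wheel_tags()
print("Look for a wheel containing:", python_tag, "and", platform_tag)
```

For a 64-bit Python 3.7, for instance, this points to files such as PyICU-2.3.1-cp37-cp37m-win_amd64.whl from the list above.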
Once you have downloaded them, you will need to install them using pip in your Python environment.
In my case:
`python -m pip install C:\Users\Administrator\Downloads\pycld2-0.31-cp37-cp37m-win_amd64.whl`
`python -m pip install C:\Users\Administrator\Downloads\PyICU-2.3.1-cp37-cp37m-win_amd64.whl`
`pip install git+https://github.com/aboSamoor/polyglot@master`
Had the same issue. Thanks to @RNogales94's comments, the issue is solved on my Windows 7, Python3.6, 64bit environment. It deserves to be in the Installation section of the documentation I think.
I had the same issue. @RNogales94's solution works well, but a similar issue appears with the update of pycld2 to version 0.40. I am using Python 3.6.
Collecting pycld2>=0.3 (from polyglot==16.7.4->-r /requirements/base.txt (line 19))
Downloading https://files.pythonhosted.org/packages/19/8e/6427a3dd5f2605fbc2a41327400b4a86fc626e12fc6e593bf3cf5fd1863b/pycld2-0.40.tar.gz (41.4MB)
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-le662rz7/pycld2/setup.py", line 98, in <module>
long_description=open(path.join(HERE, "README.md")).read(),
File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 1565: ordinal not in range(128)
Can you check it please?
Thanks
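The 'ascii' codec in this traceback suggests that the environment's locale was C/POSIX, so `open()` without an explicit encoding fell back to ASCII. To find the byte a codec rejects, a small sketch such as the one below may help. The helper `first_undecodable` and the sample bytes are hypothetical: the sample only mimics the position and byte value reported in the traceback and is not the actual pycld2 README.

```python
def first_undecodable(data, encoding):
    """Return (position, byte) of the first byte `encoding` cannot decode, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the offending byte within `data`.
        return exc.start, data[exc.start]
    return None


# 1565 ASCII bytes followed by Cyrillic "а", which UTF-8 encodes as 0xD0 0xB0.
sample = b"x" * 1565 + "а".encode("utf-8")

print(first_undecodable(sample, "ascii"))
print(first_undecodable(sample, "utf-8"))
```

Running the same check on a real file (opened in binary mode) shows whether the content is valid UTF-8 that only needs an explicit `encoding="utf-8"` when it is read.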
Hi @quimdt,
Same procedure with the updated file "pycld2-0.40-cp36-cp36m-win_amd64.whl" installs and works (successfully fetches language data) without a problem on my machine (Win7, x64, Python3.6).
python -m pip install pycld2-0.40-cp36-cp36m-win_amd64.whl
Processing pycld2-0.40-cp36-cp36m-win_amd64.whl
Installing collected packages: pycld2
Found existing installation: pycld2 0.31
Uninstalling pycld2-0.31:
Successfully uninstalled pycld2-0.31
Successfully installed pycld2-0.40
Thanks @gokhanercan, it is working now.
I was encountering the same errors even with @RNogales94's solution. I found out that the latest pip version (23.2.1) didn't support it, so downgrading pip to 21.3.1 installed polyglot successfully:
pip install pip==21.3.1