wordfreq
UnicodeDecodeError when importing module
When I try to import wordfreq with "import wordfreq", I get the following error:
  File "<stdin>", line 1, in <module>
  File "/home/lukasm/Applications/miniforge3/envs/rocm/lib/python3.12/site-packages/wordfreq/__init__.py", line 17, in <module>
    from wordfreq.tokens import lossy_tokenize, simple_tokenize, tokenize
  File "/home/lukasm/Applications/miniforge3/envs/rocm/lib/python3.12/site-packages/wordfreq/tokens.py", line 8, in <module>
    from ftfy.fixes import uncurl_quotes
  File "/home/lukasm/Applications/miniforge3/envs/rocm/lib/python3.12/site-packages/ftfy/__init__.py", line 23, in <module>
    from ftfy import bad_codecs, chardata, fixes
  File "/home/lukasm/Applications/miniforge3/envs/rocm/lib/python3.12/site-packages/ftfy/chardata.py", line 59, in <module>
    ENCODING_REGEXES = _build_regexes()
                       ^^^^^^^^^^^^^^^^
  File "/home/lukasm/Applications/miniforge3/envs/rocm/lib/python3.12/site-packages/ftfy/chardata.py", line 47, in _build_regexes
    charlist = byte_range.decode(encoding)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lukasm/Applications/miniforge3/envs/rocm/lib/python3.12/site-packages/future_typing/codec.py", line 23, in decode
    first_line = lines[0].decode("utf-8", errors)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
decoding with 'sloppy-windows-1252' codec failed
I'm running Python 3.12 and have installed wordfreq 3.1.1 via pip.
It imports on Python 3.12 for me. If you update the ftfy dependency to version 6.3, does it work for you?
I would like to update that dependency. Apparently I need to set up a new PyPI publishing workflow to publish a new version, which is going to take some time.
Unfortunately that does not seem to work. I tried ftfy 6.3.1, 6.3.0, 6.2.0 and 6.1.0.
Okay, the problem was resolved by removing the future_typing package.
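For anyone who hits the same traceback: the last frame is in future_typing/codec.py, not in ftfy or wordfreq, so it may be worth checking whether that package is present in the environment before trying different ftfy versions. Here is a minimal sketch of such a check. The helper names is_importable and installed_version are made up for this example, and it only reports what is installed; it does not fix anything.

```python
import importlib.util
from importlib import metadata


def is_importable(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Versions of the two packages that appear in the traceback.
    for dist in ("wordfreq", "ftfy"):
        print(dist, installed_version(dist))
    # future_typing is the module whose codec.py shows up in the last frame.
    print("future_typing importable:", is_importable("future_typing"))
```

If the last line prints True, uninstalling that package with pip is the fix that worked in this thread.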