Support Python 3.10

Open decaz opened this issue 4 years ago • 17 comments

... and prepare for Python 3.11 (dev).

decaz avatar Oct 05 '21 23:10 decaz

Any news on making this library support Python 3.10?

spenpal avatar Mar 17 '22 19:03 spenpal

JFYI https://github.com/PyYoshi/cChardet/pull/78

oleksandr-kuzmenko avatar Mar 27 '22 01:03 oleksandr-kuzmenko

After 6 months, will that PR get merged or even looked at?

ooliver1 avatar May 04 '22 21:05 ooliver1

Are future updates like Python 3.10 support coming, or has the project been dropped?

mikBighne98 avatar May 29 '22 18:05 mikBighne98

It seems as though cchardet has been abandoned, yet many large projects depend on it.

ooliver1 avatar May 29 '22 19:05 ooliver1

Are there any good alternatives to cchardet? If the repo is not getting any love then quite a few projects will need a replacement.

NebularNerd avatar Jun 21 '22 18:06 NebularNerd

^ It is possible to install gcc, but this won't be the biggest issue forever; 3.11/3.12 may somehow break this.

ooliver1 avatar Jun 21 '22 18:06 ooliver1

> ^ It is possible to install gcc, but this won't be the biggest issue forever; 3.11/3.12 may somehow break this.

I assume this is for *nix users. I'm on Windows and it keeps throwing the 'C++ 14 Required' error when I try to install, I assume because on Windows it's trying to compile with Microsoft Visual C++ instead of gcc.

Does anyone know if I can manually compile this on Windows using my MinGW gcc install? I'd rather not download multiple GB of Visual Studio just for one Python package.

NebularNerd avatar Jun 22 '22 09:06 NebularNerd

> > ^ It is possible to install gcc, but this won't be the biggest issue forever; 3.11/3.12 may somehow break this.
>
> I assume this is for *nix users. I'm on Windows and it keeps throwing the 'C++ 14 Required' error when I try to install, I assume because on Windows it's trying to compile with Microsoft Visual C++ instead of gcc.
>
> Does anyone know if I can manually compile this on Windows using my MinGW gcc install? I'd rather not download multiple GB of Visual Studio just for one Python package.

@NebularNerd yes, this was for *nix. Here are the steps I found for Windows using MinGW:

  • add C:\MinGW\bin to PATH
  • in PYTHONPATH\Lib\distutils (PYTHONPATH here meaning the Python install directory), create or edit a distutils.cfg file containing:
    [build]
    compiler=mingw32

https://stackoverflow.com/a/5051281
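
The second step above can also be scripted. This is only a minimal sketch: the helper name `write_distutils_cfg` is hypothetical, and it assumes a Python version that still ships distutils (it was removed from the standard library in Python 3.12). It writes the same `[build]` / `compiler` entry described above into whatever directory you pass in.

```python
import configparser
from pathlib import Path


def write_distutils_cfg(distutils_dir, compiler="mingw32"):
    """Create or update distutils.cfg in distutils_dir with a [build] compiler entry.

    Hypothetical helper for illustration; not part of cchardet or distutils.
    """
    cfg_path = Path(distutils_dir) / "distutils.cfg"
    parser = configparser.ConfigParser()
    # read() silently skips a missing file, so this handles both
    # the "create" and the "edit" case.
    parser.read(cfg_path)
    if not parser.has_section("build"):
        parser.add_section("build")
    parser.set("build", "compiler", compiler)
    with cfg_path.open("w") as fh:
        parser.write(fh)
    return cfg_path
```

On a standard Windows CPython install the target directory would probably be `Path(sys.base_prefix) / "Lib" / "distutils"`, but that location is an assumption; check where your interpreter actually lives before writing to it.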

ooliver1 avatar Jun 22 '22 16:06 ooliver1

I ended up here while upgrading my project's Python version, when I started running into pip errors involving this package.

> Are there any good alternatives to cchardet? If the repo is not getting any love then quite a few projects will need a replacement.

It depends on what you're trying to do. There's an MIT-licensed package called charset_normalizer that many seem to have switched to.

charset_normalizer focuses on giving you the actual text content in usable, Unicode form.

By contrast, cchardet seems to focus on telling you what encoding a text file uses. In a project I'm working on, the detected encoding is then passed to open().

charset_normalizer is like, "why bother with determining the exact encoding scheme?"

Instead, it figures out which original encoding is most likely to decode successfully into usable text content.

If you look, it specifically compares itself with this package and calls out cChardet's apparent use of a C++ binding. It also claims higher accuracy but possibly lower speed.
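
To make the difference concrete, here is a rough sketch of a detection helper that prefers cchardet when it is importable and otherwise falls back to charset_normalizer. The function name `detect_encoding` and the final UTF-8-only fallback are assumptions made for this example, not anything either project provides. It uses `cchardet.detect()` (which returns a dict with an `encoding` key) and charset_normalizer's `from_bytes(...).best()`.

```python
from typing import Optional


def detect_encoding(data: bytes) -> Optional[str]:
    """Best-guess encoding name for data, or None if nothing plausible is found.

    Hypothetical helper for illustration only.
    """
    try:
        import cchardet  # C++-backed detector discussed in this thread
        return cchardet.detect(data)["encoding"]
    except ImportError:
        pass
    try:
        from charset_normalizer import from_bytes
        best = from_bytes(data).best()  # None when no candidate fits
        return best.encoding if best is not None else None
    except ImportError:
        pass
    # Stdlib-only fallback (an assumption of this sketch, not a real detector):
    # accept the bytes only if they are valid UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return None
```

The returned name can then be handed to `open(path, encoding=...)` as described above. With charset_normalizer alone you could skip that step, since `str(best)` already gives you the decoded text.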

banagale avatar Jun 23 '22 06:06 banagale

Thanks @ooliver1 and @banagale for your replies. I'm going to take a good look at charset_normalizer, since making anyone install gcc just to compile cChardet for my small Subtotxt script seems a trifle excessive.

In the meantime I'll compile it with gcc as an interim bodge.

NebularNerd avatar Jun 23 '22 09:06 NebularNerd

It was working for me on Python 3.10, but now fails to install on Python 3.11:

/usr/bin/clang -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -pipe -Os -isysroot/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk -Isrc/ext/uchardet/src -I/opt/local/Library/Frameworks/Python.framework/Versions/3.11/include/python3.11 -c src/cchardet/_cchardet.cpp -o build/temp.macosx-12.0-x86_64-cpython-311/src/cchardet/_cchardet.o
src/cchardet/_cchardet.cpp:196:12: fatal error: 'longintrepr.h' file not found
  #include "longintrepr.h"
           ^~~~~~~~~~~~~~~
1 error generated.
error: command '/usr/bin/clang' failed with exit code 1

EDIT: Manually installing cython beforehand seems to fix the issue (possibly related to cython/cython#4461).

mohd-akram avatar Oct 26 '22 07:10 mohd-akram

There are 2 PRs #78 and #80 that will address this. @PyYoshi, can you merge and release a new build, please?

SimplicityGuy avatar Nov 05 '22 17:11 SimplicityGuy

> There are 2 PRs #78 and #80 that will address this. @PyYoshi, can you merge and release a new build, please?

It is pretty well established that they have abandoned cchardet; see the PRs you referenced, #78 is nearly a year old.

ooliver1 avatar Nov 05 '22 22:11 ooliver1

> It is pretty well established that they have abandoned cchardet; see the PRs you referenced, #78 is nearly a year old.

Indeed. It's unfortunate since right now many downstream dependencies can't be completely installed with Python 3.11 due to build issues.

SimplicityGuy avatar Nov 05 '22 23:11 SimplicityGuy

At this stage, it comes down to either moving to charset_normalizer or, if someone is willing, forking this to make cchardet-ng or similar.

NebularNerd avatar Nov 06 '22 09:11 NebularNerd

Might want to take a look at this: https://github.com/faust-streaming/cChardet

pip install faust-cchardet

The fork supports Python 3.10 and 3.11 now, so we're good. I'll open a PR so that if @PyYoshi comes back to this project some day, he can update this.

wbarnha avatar Dec 12 '22 15:12 wbarnha