doc2vec 'ascii' codec can't decode byte 0xf7 in position 0: ordinal not in range(128)

Open TobiasEl opened this issue 6 years ago • 3 comments

Hi, I'm using the forked version of gensim, but when I load the model I get this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xf7 in position 0: ordinal not in range(128)

I tried encoding the model path like this:

model = g.Doc2Vec.load(model_path.encode('utf-8'))

But then I get this error:

File "C:\Users\fanta\Desktop\gensim-develop\gensim\utils.py", line 311, in _adapt_by_suffix
    if fname.endswith('.gz') or fname.endswith('.bz2'):
TypeError: endswith first arg must be bytes or a tuple of bytes, not str

What must I do to solve this error? Thanks.

May 29 '18 19:05 TobiasEl
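The second traceback in the question can be reproduced without gensim. The sketch below assumes Python 3; `has_compressed_suffix` is a hypothetical stand-in that only mirrors the `endswith` check quoted in the traceback, and `model.bin` is a made-up path. It suggests that encoding the path merely changes its type, so the suffix check fails before any file is read, and the original decode error is left unaddressed.

```python
# Hypothetical stand-in for the suffix check quoted in the traceback;
# this is not gensim's actual function.
def has_compressed_suffix(fname):
    return fname.endswith('.gz') or fname.endswith('.bz2')

model_path = "model.bin"  # made-up path, for illustration only

# A str path works with str suffixes.
str_result = has_compressed_suffix(model_path)

# A bytes path cannot be compared against str suffixes in Python 3.
try:
    has_compressed_suffix(model_path.encode('utf-8'))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(str_result)
print(error_message)
```

Passing the path as a plain `str` avoids this second error; the decode error itself comes from the contents of the file, not from its name.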

Same issue.

Dec 14 '18 09:12 samrudh

I had the same problem. Anything involving bytes vs. strings is usually a Python 2/3 compatibility issue. Are you using Python 3? The error occurs when unpickling, so it makes sense to look at how pickled objects changed between Python 2 and 3; see for example https://blog.modest-destiny.com/posts/python-2-and-3-compatible-pickle-save-and-load/, which describes the exact same error. Unfortunately, their fix doesn't work here, and neither does any encoding, so I reluctantly switched to Python 2, where it works :)

Sep 30 '20 17:09 cstenkamp

This seems to be a Python 2/3 compatibility issue. I hit the same error and solved it by replacing return _pickle.loads(f.read()) with return _pickle.loads(f.read(), encoding='latin1') in gensim/utils.py of the forked gensim: https://github.com/jhlau/gensim

Jun 01 '22 23:06 parshin76
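The effect of the `encoding='latin1'` argument can be seen with the standard library alone. This is a minimal sketch, assuming Python 3: `PY2_PICKLE` is a hand-written byte string shaped like a protocol 2 pickle of a one-byte Python 2 `str`, not data taken from a real gensim model, and `load_py2_pickle` is a hypothetical helper name.

```python
import pickle

# Hand-written protocol 2 pickle of a Python 2 8-bit str holding the
# single byte 0xf7 (illustrative data, not from an actual model file).
PY2_PICKLE = b'\x80\x02U\x01\xf7q\x00.'

def load_py2_pickle(data):
    """Hypothetical helper: unpickle Python 2 data under Python 3.

    encoding='latin1' maps every byte 0-255 to a code point, so 8-bit
    str objects written by Python 2 always decode.
    """
    return pickle.loads(data, encoding='latin1')

# The default encoding is ASCII, which rejects byte 0xf7.
try:
    pickle.loads(PY2_PICKLE)
    default_error = None
except UnicodeDecodeError as exc:
    default_error = str(exc)

value = load_py2_pickle(PY2_PICKLE)

print(default_error)
print(repr(value))
```

`pickle.loads` also accepts `encoding='bytes'`, which leaves such strings as `bytes` objects instead of decoding them. Whether `latin1` or `bytes` is the better choice for a given model file depends on what was pickled.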
