doc2vec
'ascii' codec can't decode byte 0xf7 in position 0: ordinal not in range(128)
Hi, I'm using the forked version of gensim, but when I load the model I get this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xf7 in position 0: ordinal not in range(128)

I tried encoding the model path like this:

model = g.Doc2Vec.load(model_path.encode('utf-8'))

But then I get a different error:

File "C:\Users\fanta\Desktop\gensim-develop\gensim\utils.py", line 311, in _adapt_by_suffix
    if fname.endswith('.gz') or fname.endswith('.bz2'):
TypeError: endswith first arg must be bytes or a tuple of bytes, not str

What must I do to solve this error? Thanks.
same issue..
I had the same problem. Anything involving bytes vs. strings is usually a Python 2/3 compatibility issue, so are you using Python 3 by any chance? The error occurs when unpickling, so it makes sense to look at how pickled objects changed between Python 2 and 3. See for example https://blog.modest-destiny.com/posts/python-2-and-3-compatible-pickle-save-and-load/, which describes the exact same error. Unfortunately their fix didn't work for me, and neither did any encoding of the path, so I reluctantly switched to Python 2, and it works there :)
This issue seems to be caused by Python 2/3 compatibility.
I was facing the same error and solved it by replacing
return _pickle.loads(f.read())
with
return _pickle.loads(f.read(), encoding='latin1')
in gensim/utils.py of the forked gensim https://github.com/jhlau/gensim
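For anyone who wants to see why the encoding argument matters, here is a minimal sketch that reproduces the error outside gensim. The byte string is a hand-written protocol-2 pickle of the Python 2 string '\xf7abc', and load_py2_pickle is a hypothetical helper name, not part of gensim. Python 3 decodes Python 2 strings as ASCII by default, which fails on byte 0xf7, while latin1 maps every byte to a code point and so never fails.

```python
import pickle

# Hand-written protocol-2 pickle of the Python 2 byte string '\xf7abc'.
# Assumption: this stands in for the 8-bit string data inside a model
# file that was saved under Python 2.
PY2_PICKLE = b"\x80\x02U\x04\xf7abcq\x00."


def load_py2_pickle(data):
    """Hypothetical helper: unpickle Python 2 data under Python 3.

    latin1 maps each of the 256 byte values to one code point,
    so decoding cannot raise UnicodeDecodeError.
    """
    return pickle.loads(data, encoding="latin1")


# Default behaviour: Python 2 strings are decoded as ASCII, which fails.
try:
    pickle.loads(PY2_PICKLE)
    default_error = None
except UnicodeDecodeError as exc:
    default_error = exc

# Same data loaded with the latin1 workaround.
value = load_py2_pickle(PY2_PICKLE)

print("default load failed with:", default_error)
print("latin1 load returned:", repr(value))
```

This also suggests why encoding the file name did not help: the failure is in decoding the pickled contents, not the path, and passing a bytes path only trips the str-based endswith check in _adapt_by_suffix.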