
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

Open shawnwang95 opened this issue 6 years ago • 0 comments

Prefix dict has been built succesfully.
Traceback (most recent call last):
  File "predict.py", line 23, in <module>
    lstm_predict(sentence)
  File "code/Sentiment_lstm.py", line 187, in lstm_predict
    data=input_transform(string)
  File "code/Sentiment_lstm.py", line 173, in input_transform
    model=gensim.models.Word2Vec.load_word2vec_format('lstm_data/Word2vec_model.pkl', binary = True, unicode_errors='ignore')
  File "/anaconda3/lib/python3.6/site-packages/gensim/models/word2vec.py", line 1172, in load_word2vec_format
    header = utils.to_unicode(fin.readline(), encoding=encoding)
  File "/anaconda3/lib/python3.6/site-packages/gensim/utils.py", line 217, in any2unicode
    return unicode(text, encoding, errors=errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

I also tried the call without binary=True, "model=gensim.models.Word2Vec.load_word2vec_format('lstm_data/Word2vec_model.pkl', unicode_errors='ignore')", but I still get the same error.
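
One thing that may be worth checking: byte 0x80 at position 0 is what a Python pickle written with protocol 2 or higher starts with, so the .pkl file might be a pickled model (for example one written with gensim's model.save()) rather than a file in word2vec format. The sketch below only inspects the first byte of a file; the helper names looks_like_pickle and suggest_loader are made up for illustration, and the check is a heuristic, not proof of the file's format.

```python
PICKLE_PROTO_MARKER = 0x80  # PROTO opcode: first byte of a pickle written with protocol >= 2


def looks_like_pickle(path):
    """Return True if the file begins with the pickle PROTO opcode (0x80)."""
    with open(path, "rb") as f:
        first = f.read(1)
    return len(first) == 1 and first[0] == PICKLE_PROTO_MARKER


def suggest_loader(path):
    """Heuristic guess at which gensim loader fits the file (hypothetical helper)."""
    if looks_like_pickle(path):
        # Likely written by pickle / model.save(), so Word2Vec.load is the candidate.
        return "Word2Vec.load"
    # Otherwise the file may be in the word2vec text or binary format.
    return "load_word2vec_format"
```

If suggest_loader('lstm_data/Word2vec_model.pkl') returns "Word2Vec.load", then gensim.models.Word2Vec.load('lstm_data/Word2vec_model.pkl') may be the call to try instead of load_word2vec_format, assuming the file was saved with model.save().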

shawnwang95, Jul 23 '18 09:07