seq2vec
can't load word2vec model when running example code
I am trying to run the "LSTM to LSTM auto-encoder with word embedding (RNN to RNN architecture)" example. I have already trained my own word2vec model with gensim and saved it with:
model.save('/home/estathop/Documents/word2vecmodel/w2v1model') #save model
When I then try to load it as the example does:
# load Gensim word2vec from word2vec_model_path
word2vec = GensimWord2vec('/home/estathop/Documents/word2vecmodel/w2v1model')
the following error occurs:
Traceback (most recent call last):
  File "", line 5, in <module>
    word2vec = GensimWord2vec('/home/estathop/Documents/word2vecmodel/w2v1model')
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/seq2vec/word2vec/gensim_word2vec.py", line 9, in __init__
    model_path, binary=True
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/gensim/models/keyedvectors.py", line 1120, in load_word2vec_format
    limit=limit, datatype=datatype)
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/gensim/models/utils_any2vec.py", line 174, in _load_word2vec_format
    header = utils.to_unicode(fin.readline(), encoding=encoding)
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/gensim/utils.py", line 359, in any2unicode
    return unicode(text, encoding, errors=errors)
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte
Any ideas how to fix or bypass this?
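Try saving the word2vec model in the original binary format. The traceback shows GensimWord2vec calling load_word2vec_format with binary=True, whereas model.save() writes a pickled gensim object (hence the 0x80 first byte):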
model = load_model(modelpath=modelpath)
model.wv.save_word2vec_format('w2v-original.bin', binary=True)
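A quick way to check which format a saved file is in before loading it: byte 0x80 at position 0, as in the error above, is how a Python pickle (protocol 2 or later) begins, while a word2vec-format file begins with an ASCII header line giving the vocabulary size and vector size. The following is a minimal sketch using only the standard library; `sniff_model_format` is a hypothetical helper name, not part of gensim or seq2vec.

```python
def sniff_model_format(path):
    """Guess whether `path` holds a pickle or a word2vec-format file.

    Hypothetical helper for illustration; it inspects only the first line.
    """
    with open(path, "rb") as fin:
        head = fin.readline()

    # Pickle protocol 2 and later starts with the PROTO opcode, byte 0x80.
    if head[:1] == b"\x80":
        return "pickle"

    # word2vec format (text or binary) starts with an ASCII header line:
    # "<vocab_size> <vector_size>\n"
    parts = head.split()
    if len(parts) == 2 and all(p.isdigit() for p in parts):
        return "word2vec"

    return "unknown"
```

If this returns "pickle" for the path passed to GensimWord2vec, the file most likely came from model.save() and needs to be re-exported with save_word2vec_format as shown above.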
@bhavikm thanks, that got me past the problem above, but now another error occurs when I execute the next block from the example:
transformer = Seq2VecR2RWord(
    word2vec_model=word2vec,
    max_length=20,
    latent_size=300,
    encoding_size=300,
    learning_rate=0.05
)
Traceback (most recent call last):
  File "", line 6, in <module>
    learning_rate=0.05
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/seq2vec/model/seq2vec_R2R_word.py", line 55, in __init__
    learning_rate=learning_rate
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/seq2vec/model/seq2vec_base.py", line 68, in __init__
    self.model, self.encoder = self.create_model()
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/seq2vec/model/seq2vec_R2R_word.py", line 87, in create_model
    dense_dropout=0.
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/yklz/recurrent/rnn_cell.py", line 22, in __init__
    **kwargs
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/keras/engine/topology.py", line 262, in __init__
    self.stateful = False
AttributeError: can't set attribute
Please use Python 3.5 or above.
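Both tracebacks above come from a python2.7 environment, so it may help to fail early with a clear message instead of hitting the AttributeError deep inside Keras. A minimal sketch of such a guard, assuming the 3.5 minimum stated in the reply; `python_is_supported` and `MIN_PYTHON` are hypothetical names, not part of seq2vec:

```python
import sys

# Minimum interpreter version, taken from the reply above (assumption:
# no other constraints apply).
MIN_PYTHON = (3, 5)


def python_is_supported(version_info=None):
    """Return True if the (major, minor) version meets MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON
```

Calling `python_is_supported()` at the top of a script, and exiting with a message when it returns False, makes the version requirement explicit before any model is built.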